1
Xie X, Li T, Ma L, Wu J, Qi Y, Yang B, Li Z, Yang Z, Zhang K, Chu Z, Ngai T, Xia J, Wang Y, Zhao P, Bian L. A designer minimalistic model parallels the phase-separation-mediated assembly and biophysical cues of extracellular matrix. Nat Chem 2025:10.1038/s41557-025-01837-5. [PMID: 40490569 DOI: 10.1038/s41557-025-01837-5] [Received: 06/09/2024] [Accepted: 04/22/2025] [Indexed: 06/11/2025]
Abstract
The propensity for controlled liquid-liquid phase separation and subsequent directed phase transition are crucial for the coacervation-mediated assembly of extracellular matrix (ECM). This spatiotemporally controlled ECM assembly can be used to develop coacervate-based polymer assembly strategies to generate biomimetic materials that can emulate the complex structures and biophysical cues of the ECM. Inspired by the tropoelastin structure, here we develop a designer minimalistic model consisting of alternating hydrophobic moieties and covalent crosslinking domains. By increasing the valence and enhancing the interaction strength of the hydrophobic moieties, we can control the degree of the assembly to enhance the propensity for phase separation and thus emulate the extracellular coacervation process of tropoelastin, including droplet formation, coalescence and maturation. The subsequent covalent-bonding-triggered coacervate-hydrogel transition with enhanced assembly order stabilizes the phase-separated structure in the form of a heterogeneous hydrogel, thereby mimicking covalent crosslinking-derived elastin fibrillation. Furthermore, the heterogeneous hydrogel network establishes a biomimetic matrix that can effectively promote the mechanosensing of adherent stem cells.
Affiliation(s)
- Xian Xie
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong, P.R. China
- Department of Chemistry, The Chinese University of Hong Kong, Hong Kong, P.R. China
- Tianjie Li
- Department of Physics, The Chinese University of Hong Kong, Hong Kong, P.R. China
- Linjie Ma
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, P.R. China
- Jiahao Wu
- Department of Chemistry, The Chinese University of Hong Kong, Hong Kong, P.R. China
- Yajing Qi
- Department of Physics, The Chinese University of Hong Kong, Hong Kong, P.R. China
- Boguang Yang
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong, P.R. China
- Zhuo Li
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong, P.R. China
- Zhinan Yang
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong, P.R. China
- Kunyu Zhang
- School of Biomedical Sciences and Engineering, Guangzhou International Campus, South China University of Technology, Guangzhou, P.R. China
- Guangdong Provincial Key Laboratory of Biomedical Engineering, South China University of Technology, Guangzhou, P.R. China
- Key Laboratory of Biomedical Materials and Engineering of the Ministry of Education, South China University of Technology, Guangzhou, P.R. China
- National Engineering Research Center for Tissue Restoration and Reconstruction, South China University of Technology, Guangzhou, P.R. China
- Zhiqin Chu
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, P.R. China
- To Ngai
- Department of Chemistry, The Chinese University of Hong Kong, Hong Kong, P.R. China
- Jiang Xia
- Department of Chemistry, The Chinese University of Hong Kong, Hong Kong, P.R. China
- Yi Wang
- Department of Physics, The Chinese University of Hong Kong, Hong Kong, P.R. China.
- Pengchao Zhao
- School of Biomedical Sciences and Engineering, Guangzhou International Campus, South China University of Technology, Guangzhou, P.R. China.
- Guangdong Provincial Key Laboratory of Biomedical Engineering, South China University of Technology, Guangzhou, P.R. China.
- Key Laboratory of Biomedical Materials and Engineering of the Ministry of Education, South China University of Technology, Guangzhou, P.R. China.
- National Engineering Research Center for Tissue Restoration and Reconstruction, South China University of Technology, Guangzhou, P.R. China.
- Liming Bian
- School of Biomedical Sciences and Engineering, Guangzhou International Campus, South China University of Technology, Guangzhou, P.R. China.
- Guangdong Provincial Key Laboratory of Biomedical Engineering, South China University of Technology, Guangzhou, P.R. China.
- Key Laboratory of Biomedical Materials and Engineering of the Ministry of Education, South China University of Technology, Guangzhou, P.R. China.
- National Engineering Research Center for Tissue Restoration and Reconstruction, South China University of Technology, Guangzhou, P.R. China.
2
Mehdiabadi M, Blum M, Tesei G, von Bülow S, Lindorff-Larsen K, Tosatto SCE, Piovesan D. MobiDB-lite 4.0: faster prediction of intrinsic protein disorder and structural compactness. Bioinformatics 2025; 41:btaf297. [PMID: 40347452 PMCID: PMC12122076 DOI: 10.1093/bioinformatics/btaf297] [Received: 03/06/2025] [Revised: 04/18/2025] [Accepted: 05/09/2025] [Indexed: 05/14/2025] [Open access]
Abstract
MOTIVATION: In recent years, many disorder predictors have been developed to identify intrinsically disordered regions (IDRs) in proteins, achieving high accuracy. However, it may be difficult to interpret differences in predictions across methods. Consensus methods offer a simple solution, highlighting reliable predictions while filtering out uncertain positions. Here, we present a new version of MobiDB-lite, a consensus method designed to predict long IDRs and classify them based on compositional biases and conformational properties. RESULTS: The MobiDB-lite 4.0 pipeline was optimized to be ten times faster than the previous version. It now provides compactness annotations based on the predicted apparent scaling exponent. The newly added features and disorder subclassifications allow users to gain comprehensive insight into a protein's function and characteristics. MobiDB-lite 4.0 is integrated into the MobiDB and DisProt databases. A version without the compactness predictor is integrated into InterProScan, propagating MobiDB-lite annotations to UniProtKB. AVAILABILITY AND IMPLEMENTATION: The MobiDB-lite 4.0 source code and a Docker container are available from the GitHub repository: https://github.com/BioComputingUP/MobiDB-lite.
Affiliation(s)
- Mahta Mehdiabadi
- Department of Biomedical Sciences, University of Padova, 35131 Padova, Italy
- Matthias Blum
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Giulio Tesei
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, 2200 Copenhagen, Denmark
- Sören von Bülow
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, 2200 Copenhagen, Denmark
- Kresten Lindorff-Larsen
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, 2200 Copenhagen, Denmark
- Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padova, 35131 Padova, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), 70126 Bari, Italy
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, 35131 Padova, Italy
3
Roy M, Sanchez A, Guerois R, Senoussi I, Cerana A, Sgrignani J, Cavalli A, Rinaldi A, Cejka P. EXO1 promotes the meiotic MLH1-MLH3 endonuclease through conserved interactions with MLH1, MSH4 and DNA. Nat Commun 2025; 16:4141. [PMID: 40319035 PMCID: PMC12049449 DOI: 10.1038/s41467-025-59470-2] [Received: 11/04/2024] [Accepted: 04/22/2025] [Indexed: 05/07/2025] [Open access]
Abstract
The endonuclease activity of MLH1-MLH3 (MutLγ) is stimulated by MSH4-MSH5 (MutSγ), EXO1, and RFC-PCNA to resolve meiotic recombination intermediates such as double Holliday junctions (HJs) into crossovers. We show that EXO1 directly interacts with MLH1 via the EXO1 MIP motif and a patch centered around EXO1-I403. Disrupting this interaction unexpectedly only partially inhibited MutLγ. We found that EXO1 also directly interacts with MutSγ. Crucially, a single point mutation in EXO1 (W371E) impairs its interaction with MSH4 and completely abolishes its ability to activate DNA nicking by MutLγ without affecting its intrinsic nuclease function. Finally, disrupting magnesium coordinating residues in the nuclease domain of EXO1 has no impact on MutSγ-MutLγ activity, while the integrity of EXO1 residues mediating interactions with double-stranded DNA (dsDNA) is important. Our findings suggest EXO1 is an integral structural component of the meiotic resolvase complex, supported by conserved interactions with MutSγ, MutLγ and dsDNA. We propose that EXO1 helps tether MutSγ-MutLγ to dsDNA downstream of HJ recognition to promote DNA cleavage.
Affiliation(s)
- Megha Roy
- Institute for Research in Biomedicine, Università della Svizzera italiana (USI), Faculty of Biomedical Sciences, Bellinzona, Switzerland
- Aurore Sanchez
- Institute for Research in Biomedicine, Università della Svizzera italiana (USI), Faculty of Biomedical Sciences, Bellinzona, Switzerland.
- Institut Curie, PSL University, Sorbonne Université, CNRS UMR3244, Paris, France.
- Raphael Guerois
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), Gif-sur-Yvette, France
- Issam Senoussi
- Institute for Research in Biomedicine, Università della Svizzera italiana (USI), Faculty of Biomedical Sciences, Bellinzona, Switzerland
- Department of Biology, Institute of Biochemistry, Eidgenössische Technische Hochschule (ETH), Zürich, Switzerland
- Arianna Cerana
- Institute of Oncology Research, Università della Svizzera italiana (USI), Faculty of Biomedical Sciences, Bellinzona, Switzerland
- Jacopo Sgrignani
- Institute for Research in Biomedicine, Università della Svizzera italiana (USI), Faculty of Biomedical Sciences, Bellinzona, Switzerland
- Andrea Cavalli
- Institute for Research in Biomedicine, Università della Svizzera italiana (USI), Faculty of Biomedical Sciences, Bellinzona, Switzerland
- Andrea Rinaldi
- Institute of Oncology Research, Università della Svizzera italiana (USI), Faculty of Biomedical Sciences, Bellinzona, Switzerland
- Petr Cejka
- Institute for Research in Biomedicine, Università della Svizzera italiana (USI), Faculty of Biomedical Sciences, Bellinzona, Switzerland.
4
Lemke MC, Avala NR, Rader MT, Hargett SR, Lank DS, Seltzer BD, Harris TE. MAST Kinases' Function and Regulation: Insights from Structural Modeling and Disease Mutations. Biomedicines 2025; 13:925. [PMID: 40299535 PMCID: PMC12024977 DOI: 10.3390/biomedicines13040925] [Received: 01/22/2025] [Revised: 04/01/2025] [Accepted: 04/03/2025] [Indexed: 04/30/2025] [Open access]
Abstract
Background/Objectives: The MAST kinases are ancient AGC kinases associated with many human diseases, such as cancer, diabetes, and neurodevelopmental disorders. We set out to describe the origins and diversification of MAST kinases from a structural and bioinformatic perspective to inform future research directions. Methods: We investigated MAST-lineage kinases using database and sequence analysis. We also estimated the functional consequences of disease point mutations on protein stability by integrating predictive algorithms and AlphaFold. Results: Higher-order organisms often have multiple MASTs and a single MASTL kinase. MAST proteins conserve an AGC kinase domain, a domain of unknown function 1908 (DUF), and a PDZ binding domain. D. discoideum contains MAST kinase-like proteins that exhibit a characteristic insertion within the T-loop but do not conserve DUF or PDZ domains. While the DUF domain is conserved in plants, the PDZ domain is not. The four mammalian MASTs demonstrate tissue expression heterogeneity by mRNA and protein. MAST1-4 are likely regulated by 14-3-3 proteins based on interactome data and in silico predictions. Comparative ΔΔG estimation identified that MAST1-L232P and G522E mutations are likely destabilizing. Conclusions: We conclude that MAST and MASTL kinases diverged from the primordial MAST, which likely operated in both biological niches. The number of MAST paralogs then expanded to the heterogeneous subfamily seen in mammals that are all likely regulated by 14-3-3 protein interaction. The reported pathogenic mutations in MASTs primarily represent alterations to post-translational modification topology in the DUF and kinase domains. Our report outlines a computational basis for future work in MAST kinase regulation and drug discovery.
Affiliation(s)
- Thurl E. Harris
- Department of Pharmacology, University of Virginia, Charlottesville, VA 22903, USA; (M.C.L.)
5
Alanazi W, Meng D, Pollastri G. Advancements in one-dimensional protein structure prediction using machine learning and deep learning. Comput Struct Biotechnol J 2025; 27:1416-1430. [PMID: 40242292 PMCID: PMC12002955 DOI: 10.1016/j.csbj.2025.04.005] [Received: 01/13/2025] [Revised: 04/01/2025] [Accepted: 04/02/2025] [Indexed: 04/18/2025] [Open access]
Abstract
The accurate prediction of protein structures remains a cornerstone challenge in structural bioinformatics, essential for understanding the intricate relationship between protein sequence, structure, and function. Recent advancements in Machine Learning (ML) and Deep Learning (DL) have revolutionized this field, offering innovative approaches to tackle one-dimensional (1D) protein structure annotations, including secondary structure, solvent accessibility, and intrinsic disorder. This review highlights the evolution of predictive methodologies, from early machine learning models to sophisticated deep learning frameworks that integrate sequence embeddings and pretrained language models. Key advancements, such as AlphaFold's transformative impact on structure prediction and the rise of protein language models (PLMs), have enabled unprecedented accuracy in capturing sequence-structure relationships. Furthermore, we explore the role of specialized datasets, benchmarking competitions, and multimodal integration in shaping state-of-the-art prediction models. By addressing challenges in data quality, scalability, interpretability, and task-specific optimization, this review underscores the transformative impact of ML, DL, and PLMs on 1D protein prediction while providing insights into emerging trends and future directions in this rapidly evolving field.
Affiliation(s)
- Wafa Alanazi
- School of Computer Science, University College Dublin, Belfield, Dublin D04 C1P1, Ireland
- Department of Computer Science, College of Science, Northern Border University, Arar, Saudi Arabia
- Di Meng
- School of Computer Science, University College Dublin, Belfield, Dublin D04 C1P1, Ireland
- Gianluca Pollastri
- School of Computer Science, University College Dublin, Belfield, Dublin D04 C1P1, Ireland
6
Orlando M, Marchetti A, Bombardi L, Lotti M, Fusco S, Mangiagalli M. Polysaccharide degradation in an Antarctic bacterium: Discovery of glycoside hydrolases from remote regions of the sequence space. Int J Biol Macromol 2025; 299:140113. [PMID: 39842586 DOI: 10.1016/j.ijbiomac.2025.140113] [Received: 10/22/2024] [Revised: 01/13/2025] [Accepted: 01/18/2025] [Indexed: 01/24/2025]
Abstract
Glycoside hydrolases (GHs) are enzymes involved in the degradation of oligosaccharides and polysaccharides. The sequence space of GHs is rapidly expanding due to the increasing number of available sequences. This expansion paves the way for the discovery of novel enzymes with peculiar structural and functional properties. This work is focused on two GHs, Ps_GH5 and Ps_GH50, from the genome of the Antarctic bacterium Pseudomonas sp. ef1. These enzymes are in an unexplored region of the sequence space of their respective GH families, not allowing a reliable sequence-based function prediction. For this reason, a computational pipeline was developed that combines deep learning "dynamic docking" on AlphaFold 3D models with physics-based molecular dynamics simulations to infer their substrate specificity. From in silico screening of a repertoire of potential oligosaccharides, only xylooligosaccharides for Ps_GH5 and galactooligosaccharides for Ps_GH50 emerged as catalytically competent substrates. Biochemical characterization agrees with computational simulations indicating that Ps_GH5 is an endo-β-xylanase, and Ps_GH50 is active mainly on small galactooligosaccharides. In conclusion, this study identifies two novel GHs subfamilies placed in remote regions of the sequence space and highlights the efficacy of substrate specificity prediction by computational approaches in the discovery of new enzymes.
Affiliation(s)
- Marco Orlando
- Department of Biotechnology and Biosciences, University of Milano Bicocca, Piazza della Scienza 2, Milano 20126, Italy
- Alessandro Marchetti
- Department of Biotechnology and Biosciences, University of Milano Bicocca, Piazza della Scienza 2, Milano 20126, Italy
- Luca Bombardi
- Biochemistry and Industrial Biotechnology (BIB) Laboratory, Department of Biotechnology, University of Verona, Verona, Italy
- Marina Lotti
- Department of Biotechnology and Biosciences, University of Milano Bicocca, Piazza della Scienza 2, Milano 20126, Italy
- Salvatore Fusco
- Biochemistry and Industrial Biotechnology (BIB) Laboratory, Department of Biotechnology, University of Verona, Verona, Italy.
- Marco Mangiagalli
- Department of Biotechnology and Biosciences, University of Milano Bicocca, Piazza della Scienza 2, Milano 20126, Italy.
7
Zhu X, Wang W, Sun S, Chng CP, Xie Y, Zhu K, He D, Liang Q, Ma Z, Wu X, Zheng X, Gao W, Miserez A, Gao C, Yu J, Huang C, Groves JT, Miao Y. Bacterial XopR subverts RIN4 complex-mediated plant immunity via plasma membrane-associated percolation. Dev Cell 2025:S1534-5807(25)00123-6. [PMID: 40139193 DOI: 10.1016/j.devcel.2025.03.002] [Received: 11/02/2023] [Revised: 05/17/2024] [Accepted: 03/03/2025] [Indexed: 03/29/2025]
Abstract
Phytobacteria release type 3 effectors (T3Es) abundant in intrinsically disordered regions (IDRs) to undermine plant defenses. How flexible IDRs contribute to T3Es' function in subverting plant immunity remains unclear. Here, we identify a plant plasma membrane (PM)-associated macromolecular condensation mechanism that governs the sophisticated interplay between T3E XopR and the plant's Resistance to Pseudomonas syringae pv. maculicola 1 (RPM1)-interacting protein 4 (RIN4) immune complex. Upon deployment into plants, XopR undergoes PM association, percolation clustering, and spanning networking on the PM, ranging from subnanomolar to tens of nanomolar. This spatiotemporal building of the XopR network enables an efficient manipulation of plant surface immune regulators, including a coiled-coil nucleotide-binding leucine-rich repeat receptor (CNL)-guardee complex with highly disordered RIN4. When XopR hijacks and fluidizes the RIN4-RPM1 condensates, Arabidopsis shows reduced RIN4 phosphorylation and diminished RPM1-activated defense in vivo, consistent with XopR-impaired RIN4 phosphorylation by RPM1-interacting protein kinase (RIPK). Our research illuminates the mechanism underlying the dynamic interplay between bacterial T3Es and plant receptor complex condensates during infection.
Affiliation(s)
- Xinlu Zhu
- School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
- Weibing Wang
- School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
- Simou Sun
- Institute for Digital Molecular Analytics and Science, Nanyang Technological University, Singapore 636921, Singapore
- Choon-Peng Chng
- School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore 639798, Singapore
- Yi Xie
- School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
- Kexin Zhu
- School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
- Danxia He
- School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
- Qiyu Liang
- School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore; Division of Physics and Applied Physics, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore
- Zhiming Ma
- School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
- Xi Wu
- School of Materials Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore
- Xuanang Zheng
- School of Life Sciences, South China Normal University, Guangzhou 510631, China
- Weibo Gao
- Division of Physics and Applied Physics, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore
- Ali Miserez
- School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore; School of Materials Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore
- Caiji Gao
- School of Life Sciences, South China Normal University, Guangzhou 510631, China
- Jing Yu
- School of Materials Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore
- Changjin Huang
- School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore 639798, Singapore
- Jay T Groves
- Institute for Digital Molecular Analytics and Science, Nanyang Technological University, Singapore 636921, Singapore; Department of Chemistry, University of California Berkeley, Berkeley, CA 94720, USA
- Yansong Miao
- School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore; Institute for Digital Molecular Analytics and Science, Nanyang Technological University, Singapore 636921, Singapore.
8
Lanave G, Pellegrini F, Diakoudi G, Catella C, Cavalli A, Capozza P, Elia G, Di Martino B, Zini E, Pollicino G, Zatelli A, Bányai K, Lavazza A, Decaro N, Camero M, Martella V. Discovery of a human parvovirus B19 analog (Erythroparvovirus) in cats. Sci Rep 2025; 15:9650. [PMID: 40113872 PMCID: PMC11926167 DOI: 10.1038/s41598-025-94123-w] [Received: 11/07/2024] [Accepted: 03/11/2025] [Indexed: 03/22/2025] [Open access]
Abstract
Two feral cats (from the same colony) were presented to the veterinary clinic for weakness, weight loss, and anorexia. The cats were part of a study on feline hepatotropic viruses (collection A, 43 animals). On metaviromic investigation, parvoviral reads were identified in the sera of the two cats. The feline parvovirus genome was 5.3 kb long with an organization similar to other members of the Erythroparvovirus genus. In the ORF1 (nonstructural proteins) and ORF2 (VP1/VP2 precursor) the feline virus displayed 43.1% and 49.1% nucleotide identity to human parvovirus B19, and 48.9% and 56.6% to chipmunk parvovirus. Sequence identity to canine/feline protoparvovirus (Protoparvovirus carnivoran 1) was as low as 36.5% and 29.2% in the ORF1 and ORF2, respectively. Using a quantitative PCR assay, the virus was also identified in an additional ten cats (prevalence 27.6%, 12/43) from collection A and in 15/1150 (1.3%) of archival sera (collection B), revealing a higher infection rate in cats with altered hepatic markers, suggestive of hepatic distress. The findings of our study extend the list of known parvoviruses in the feline host.
Affiliation(s)
- Gianvito Lanave
- Department of Veterinary Medicine, University of Bari Aldo Moro, Valenzano, Bari, Italy.
- Francesco Pellegrini
- Department of Veterinary Medicine, University of Bari Aldo Moro, Valenzano, Bari, Italy
- Georgia Diakoudi
- Department of Veterinary Medicine, University of Bari Aldo Moro, Valenzano, Bari, Italy
- Cristiana Catella
- Department of Veterinary Medicine, University of Bari Aldo Moro, Valenzano, Bari, Italy
- Alessandra Cavalli
- Department of Veterinary Medicine, University of Bari Aldo Moro, Valenzano, Bari, Italy
- Paolo Capozza
- Department of Veterinary Medicine, University of Bari Aldo Moro, Valenzano, Bari, Italy
- Gabriella Elia
- Department of Veterinary Medicine, University of Bari Aldo Moro, Valenzano, Bari, Italy
- Eric Zini
- AniCura Istituto Veterinario Novara, Granozzo Con Monticello, Novara, Italy
- Department of Animal Medicine, Productions and Health University of Padua, Padua, Italy
- Clinic for Small Animal Internal Medicine, Vetsuisse Faculty, University of Zurich, Zürich, Switzerland
- Giuseppe Pollicino
- AniCura Istituto Veterinario Novara, Granozzo Con Monticello, Novara, Italy
- Andrea Zatelli
- Department of Veterinary Medicine, University of Bari Aldo Moro, Valenzano, Bari, Italy
- Krisztián Bányai
- Department of Pharmacology and Toxicology, University of Veterinary Medicine, Budapest, Hungary
- Szentágothai Research Centre, University of Pécs, Pécs, Hungary
- Antonio Lavazza
- Experimental Zooprophylactic Institute of Lombardia and Emilia Romagna, Brescia, Italy
- Nicola Decaro
- Department of Veterinary Medicine, University of Bari Aldo Moro, Valenzano, Bari, Italy
- Michele Camero
- Department of Veterinary Medicine, University of Bari Aldo Moro, Valenzano, Bari, Italy
- Vito Martella
- Department of Veterinary Medicine, University of Bari Aldo Moro, Valenzano, Bari, Italy
- Department of Pharmacology and Toxicology, University of Veterinary Medicine, Budapest, Hungary
9
Zhang K, Cai Y, Chen Y, Fu Y, Zhu Z, Huang J, Qin H, Yang Q, Li X, Wu Y, Suo X, Jiang Y, Zhang L. Chromosome-level genome assembly of Eimeria tenella at the single-oocyst level. BMC Genomics 2025; 26:257. [PMID: 40097928 PMCID: PMC11912684 DOI: 10.1186/s12864-025-11423-1] [Received: 09/26/2024] [Accepted: 02/28/2025] [Indexed: 03/19/2025] [Open access]
Abstract
BACKGROUND: Eimeria are obligate protozoan parasites, and more than 1,500 species have been reported. However, Eimeria genomes lag behind many other eukaryotes since obtaining many oocysts is difficult due to a lack of sustainable in vitro culture, highly repetitive sequences, and mixed species infections. To address this challenge, we used whole-genome amplification of a single oocyst followed by long-read sequencing and obtained a chromosome-level genome of Eimeria tenella. RESULTS: The assembled genome was 52.13 Mb long, encompassing 15 chromosomes and 46.94% repeat sequences. In total, 7,296 protein-coding genes were predicted, exhibiting high completeness, with 92.00% single-copy BUSCO genes. To the best of our knowledge, this is the first chromosome-level assembly of E. tenella using a combination of single-oocyst whole-genome amplification and long-read sequencing. Comparative genomic and transcriptome analyses confirmed evolutionary relationship and supported estimates of divergence time of apicomplexan parasites and identified AP2 and Myb gene families that may play indispensable roles in regulating the growth and development of E. tenella. CONCLUSION: This high-quality genome assembly and the established sequencing strategy provide valuable community resources for comparative genomic and evolutionary analyses of the Eimeria clade. Additionally, our study provides a valuable resource for exploring the roles of AP2 and Myb transcription factor genes in regulating the development of Eimeria parasites.
Affiliation(s)
- Kaihui Zhang
- College of Veterinary Medicine, Henan Agricultural University, No. 15 Longzihu University Area, Zhengzhou New District, Zhengzhou, 450046, P.R. China
- International Joint Research Laboratory for Zoonotic Diseases of Henan, Zhengzhou, 450002, Henan Province, China
- Key Laboratory of Quality and Safety Control of Poultry Products (Zhengzhou), Ministry of Agriculture and Rural Affairs, Zhengzhou, P.R. China
- Yudong Cai
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, No. 22, Xinong Road, Agricultural High-tech Industrial Demonstration Zone, Yangling, 712100, China
- Yuancai Chen
- College of Veterinary Medicine, Henan Agricultural University, No. 15 Longzihu University Area, Zhengzhou New District, Zhengzhou, 450046, P.R. China
- International Joint Research Laboratory for Zoonotic Diseases of Henan, Zhengzhou, 450002, Henan Province, China
- Key Laboratory of Quality and Safety Control of Poultry Products (Zhengzhou), Ministry of Agriculture and Rural Affairs, Zhengzhou, P.R. China
- Yin Fu
- College of Veterinary Medicine, Henan Agricultural University, No. 15 Longzihu University Area, Zhengzhou New District, Zhengzhou, 450046, P.R. China
- International Joint Research Laboratory for Zoonotic Diseases of Henan, Zhengzhou, 450002, Henan Province, China
- Key Laboratory of Quality and Safety Control of Poultry Products (Zhengzhou), Ministry of Agriculture and Rural Affairs, Zhengzhou, P.R. China
- Ziqi Zhu
- College of Veterinary Medicine, Henan Agricultural University, No. 15 Longzihu University Area, Zhengzhou New District, Zhengzhou, 450046, P.R. China
- International Joint Research Laboratory for Zoonotic Diseases of Henan, Zhengzhou, 450002, Henan Province, China
- Key Laboratory of Quality and Safety Control of Poultry Products (Zhengzhou), Ministry of Agriculture and Rural Affairs, Zhengzhou, P.R. China
- Jianying Huang
- College of Veterinary Medicine, Henan Agricultural University, No. 15 Longzihu University Area, Zhengzhou New District, Zhengzhou, 450046, P.R. China
- International Joint Research Laboratory for Zoonotic Diseases of Henan, Zhengzhou, 450002, Henan Province, China
- Key Laboratory of Quality and Safety Control of Poultry Products (Zhengzhou), Ministry of Agriculture and Rural Affairs, Zhengzhou, P.R. China
- Huikai Qin
- College of Veterinary Medicine, Henan Agricultural University, No. 15 Longzihu University Area, Zhengzhou New District, Zhengzhou, 450046, P.R. China
- International Joint Research Laboratory for Zoonotic Diseases of Henan, Zhengzhou, 450002, Henan Province, China
- Key Laboratory of Quality and Safety Control of Poultry Products (Zhengzhou), Ministry of Agriculture and Rural Affairs, Zhengzhou, P.R. China
- Qimeng Yang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, No. 22, Xinong Road, Agricultural High-tech Industrial Demonstration Zone, Yangling, 712100, China
- Xinmei Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, No. 22, Xinong Road, Agricultural High-tech Industrial Demonstration Zone, Yangling, 712100, China
- Yayun Wu
- College of Veterinary Medicine, Henan Agricultural University, No. 15 Longzihu University Area, Zhengzhou New District, Zhengzhou, 450046, P.R. China
- International Joint Research Laboratory for Zoonotic Diseases of Henan, Zhengzhou, 450002, Henan Province, China
- Key Laboratory of Quality and Safety Control of Poultry Products (Zhengzhou), Ministry of Agriculture and Rural Affairs, Zhengzhou, P.R. China
- Xun Suo
- National Key Laboratory of Veterinary Public Health Security, Key Laboratory of Animal Epidemiology and Zoonosis of Ministry of Agriculture, National Animal Protozoa Laboratory, College of Veterinary Medicine, China Agricultural University, Beijing, 100193, China
- Yu Jiang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, No. 22, Xinong Road, Agricultural High-tech Industrial Demonstration Zone, Yangling, 712100, China.
- Longxian Zhang
- College of Veterinary Medicine, Henan Agricultural University, No. 15 Longzihu University Area, Zhengzhou New District, Zhengzhou, 450046, P.R. China.
- International Joint Research Laboratory for Zoonotic Diseases of Henan, Zhengzhou, 450002, Henan Province, China.
- Key Laboratory of Quality and Safety Control of Poultry Products (Zhengzhou), Ministry of Agriculture and Rural Affairs, Zhengzhou, P.R. China.
10
Arora S, Nagarkar P, D'Souza JS. Recombinant human FOXJ1 protein binds DNA, forms higher-order oligomers, has gel-shifting domains and contains intrinsically disordered regions. Protein Expr Purif 2025; 227:106622. [PMID: 39549898 DOI: 10.1016/j.pep.2024.106622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2024] [Revised: 11/06/2024] [Accepted: 11/10/2024] [Indexed: 11/18/2024]
Abstract
Forkhead box protein J1 (FOXJ1) is the key transcriptional regulator during the conversion of the mammalian primary cilium with a 9 + 0 architecture to the motile (9 + 2) one. The nucleotide sequences of the full-length open reading frame (ORF) and of its DNA-binding domain (DBD) were isolated and expressed in E. coli as 6xHis-tagged proteins. Upon induction, the DBD formed inclusion bodies that were solubilized with 8 M urea. No induction of 6xHis-FOXJ1 protein was seen despite sub-cloning into several expression vectors and E. coli host strains. To improve induction and solubility, the 6xHis tag was substituted with glutathione S-transferase (GST), and weak induction was seen in E. coli BL21(DE3). GST-FOXJ1 showed anomalous migration on denaturing gel electrophoresis (AM-DRE), migrating at approximately 83 kDa instead of its calculated molecular weight (Mr) of 72.4 kDa. It was also unstable and yielded degradation products. Codon optimization improved the induction, but the protein still showed AM-DRE and instability. It seemed that the recombinant protein was either toxic or posed a metabolic burden to the E. coli cells or, once produced, was prone to degradation, mainly owing to the lack of post-translational modification (PTM), which some eukaryotic proteins require after they are synthesized on the ribosome. Both purified recombinant proteins exhibited cysteine-induced oligomerization via the formation of disulphide bridges, since this was reduced by dithiothreitol (DTT). Both were equally functional, as each bound an oligonucleotide carrying a consensus DNA-binding sequence for FOX proteins. Further, the recombinant polypeptides corresponding to the C-terminus and N-terminus show anomalies, indicating that the highly acidic residues (known as polyacidic gel-shifting domains) in these polypeptides contribute to the AM-DRE.
We demonstrate for the first time that recombinant HsFOXJ1 and its DBD bind to DNA, that its polyacidic gel-shifting domains are the reason for the AM-DRE, and that the protein is unstable (leading to degradation products), exhibits cysteine-induced oligomerization and harbours intrinsically disordered regions.
Affiliation(s)
- Shashank Arora
- School of Biological Sciences, UM-DAE Center for Excellence in Basic Sciences, University of Mumbai, Kalina Campus, Santacruz (E), Mumbai, 400098, India
- Pawan Nagarkar
- School of Biological Sciences, UM-DAE Center for Excellence in Basic Sciences, University of Mumbai, Kalina Campus, Santacruz (E), Mumbai, 400098, India
- Jacinta S D'Souza
- School of Biological Sciences, UM-DAE Center for Excellence in Basic Sciences, University of Mumbai, Kalina Campus, Santacruz (E), Mumbai, 400098, India.
11
Piovesan D, Del Conte A, Mehdiabadi M, Aspromonte M, Blum M, Tesei G, von Bülow S, Lindorff-Larsen K, Tosatto SE. MobiDB in 2025: integrating ensemble properties and function annotations for intrinsically disordered proteins. Nucleic Acids Res 2025; 53:D495-D503. [PMID: 39470701 PMCID: PMC11701742 DOI: 10.1093/nar/gkae969] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/13/2024] [Revised: 10/07/2024] [Accepted: 10/11/2024] [Indexed: 10/30/2024] Open
Abstract
The MobiDB database (URL: https://mobidb.org/) aims to provide structural and functional information about intrinsic protein disorder, aggregating annotations from the literature, experimental data, and predictions for all known protein sequences. Here, we describe the improvements made to our resource to capture more information, simplify access to the aggregated data, and increase documentation of all MobiDB features. Compared to the previous release, all underlying pipeline modules were updated. The prediction module is ten times faster and can detect if a predicted disordered region is structurally extended or compact. The PDB component is now able to process large cryo-EM structures extending the number of processed entries. The entry page has been restyled to highlight functional aspects of disorder and all graphical modules have been completely reimplemented for better flexibility and faster rendering. The server has been improved to optimise bulk downloads. Annotation provenance has been standardised by adopting ECO terms. Finally, we propagated disorder function (IDPO and GO terms) from the DisProt database exploiting sequence similarity and protein embeddings. These improvements, along with the addition of comprehensive training material, offer a more intuitive interface and novel functional knowledge about intrinsic disorder.
Affiliation(s)
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padua 35131, Italy
- Alessio Del Conte
- Department of Biomedical Sciences, University of Padova, Padua 35131, Italy
- Mahta Mehdiabadi
- Department of Biomedical Sciences, University of Padova, Padua 35131, Italy
- Matthias Blum
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Giulio Tesei
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Sören von Bülow
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Kresten Lindorff-Larsen
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padova, Padua 35131, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), Bari, Italy
12
Blum M, Andreeva A, Florentino L, Chuguransky S, Grego T, Hobbs E, Pinto B, Orr A, Paysan-Lafosse T, Ponamareva I, Salazar G, Bordin N, Bork P, Bridge A, Colwell L, Gough J, Haft D, Letunic I, Llinares-López F, Marchler-Bauer A, Meng-Papaxanthos L, Mi H, Natale D, Orengo C, Pandurangan A, Piovesan D, Rivoire C, Sigrist CA, Thanki N, Thibaud-Nissen F, Thomas P, Tosatto SE, Wu C, Bateman A. InterPro: the protein sequence classification resource in 2025. Nucleic Acids Res 2025; 53:D444-D456. [PMID: 39565202 PMCID: PMC11701551 DOI: 10.1093/nar/gkae1082] [Citation(s) in RCA: 48] [Impact Index Per Article: 48.0] [Received: 09/13/2024] [Revised: 10/11/2024] [Accepted: 10/23/2024] [Indexed: 11/21/2024] Open
Abstract
InterPro (https://www.ebi.ac.uk/interpro) is a freely accessible resource for the classification of protein sequences into families. It integrates predictive models, known as signatures, from multiple member databases to classify sequences into families and predict the presence of domains and significant sites. The InterPro database provides annotations for over 200 million sequences, ensuring extensive coverage of UniProtKB, the standard repository of protein sequences, and includes mappings to several other major resources, such as Gene Ontology (GO), Protein Data Bank in Europe (PDBe) and the AlphaFold Protein Structure Database. In this publication, we report on the status of InterPro (version 101.0), detailing new developments in the database, associated web interface and software. Notable updates include the increased integration of structures predicted by AlphaFold and the enhanced description of protein families using artificial intelligence. Over the past two years, more than 5000 new InterPro entries have been created. The InterPro website now offers access to 85 000 protein families and domains from its member databases and serves as a long-term archive for retired databases. InterPro data, software and tools are freely available.
Affiliation(s)
- Matthias Blum
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Antonina Andreeva
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Laise Cavalcanti Florentino
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Sara Rocio Chuguransky
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Tiago Grego
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Emma Hobbs
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Beatriz Lazaro Pinto
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Ailsa Orr
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Typhaine Paysan-Lafosse
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Irina Ponamareva
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Gustavo A Salazar
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Nicola Bordin
- Department of Structural and Molecular Biology, University College London, Gower St, Bloomsbury, London WC1E 6BT, UK
- Peer Bork
- European Molecular Biology Laboratory, Structural and Computational Biology Unit, Meyerhofstraße 1, 69117 Heidelberg, Germany
- Alan Bridge
- Swiss-Prot Group, Swiss Institute of Bioinformatics, CMU, 1 rue Michel Servet, CH-1211, Geneva, Switzerland
- Julian Gough
- Medical Research Council Laboratory of Molecular Biology, Cambridge Biomedical Campus, Francis Crick Ave, Trumpington, Cambridge CB2 0QH, UK
- Daniel H Haft
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Ivica Letunic
- Biobyte Solutions GmbH, Bothestr 142, 69126 Heidelberg, Germany
- Aron Marchler-Bauer
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Huaiyu Mi
- Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90033, USA
- Darren A Natale
- Protein Information Resource, Georgetown University Medical Center, WA, DC 20007, USA
- Christine A Orengo
- Department of Structural and Molecular Biology, University College London, Gower St, Bloomsbury, London WC1E 6BT, UK
- Arun P Pandurangan
- Medical Research Council Laboratory of Molecular Biology, Cambridge Biomedical Campus, Francis Crick Ave, Trumpington, Cambridge CB2 0QH, UK
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
- Catherine Rivoire
- Swiss-Prot Group, Swiss Institute of Bioinformatics, CMU, 1 rue Michel Servet, CH-1211, Geneva, Switzerland
- Christian J A Sigrist
- Swiss-Prot Group, Swiss Institute of Bioinformatics, CMU, 1 rue Michel Servet, CH-1211, Geneva, Switzerland
- Narmada Thanki
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Paul D Thomas
- Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90033, USA
- Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), Bari 70126, Italy
- Cathy H Wu
- Protein Information Resource, Georgetown University Medical Center, WA, DC 20007, USA
- Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
13
Kazancev M, Merkulov P, Tiurin K, Demurin Y, Soloviev A, Kirov I. Comparative Analysis of Active LTR Retrotransposons in Sunflower (Helianthus annuus L.): From Extrachromosomal Circular DNA Detection to Protein Structure Prediction. Int J Mol Sci 2024; 25:13615. [PMID: 39769378 PMCID: PMC11728184 DOI: 10.3390/ijms252413615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2024] [Revised: 11/30/2024] [Accepted: 12/17/2024] [Indexed: 01/16/2025] Open
Abstract
Plant genomes possess numerous transposable element (TE) insertions that have occurred during evolution. Most TEs are silenced or diverged and have therefore lost the ability to encode proteins and to transpose in the genome. Knowledge of active plant TEs and TE-encoded proteins essential for transposition and evasion of plant cell transposon silencing mechanisms remains limited. This study investigated active long terminal repeat (LTR) retrotransposons (RTEs) in sunflowers (Helianthus annuus), revealing heterogeneous and phylogenetically distinct RTEs triggered by epigenetic changes and heat stress. Many of these RTEs belong to three distinct groups within the Tekay clade, showing significant variations in chromosomal insertion distribution. Through protein analysis of these active RTEs, it was found that Athila RTEs and Tekay group 2 elements possess additional open reading frames (aORFs). The aORF-encoded proteins feature a transposase domain, a transmembrane domain, and nuclear localization signals. The aORF proteins of the Tekay subgroup exhibited remarkable conservation among over 500 Tekay members, suggesting their functional importance in RTE mobility. The predicted 3D structure of the sunflower Tekay aORF protein showed significant homology with Tekay proteins in rice, maize, and sorghum. Additionally, the structural features of aORF proteins resemble those of plant DRBM-containing proteins, suggesting their potential role in RNA-silencing modulation. These findings offer insights into the diversity and activity of sunflower RTEs, emphasizing the conservation and structural characteristics of aORF-encoded proteins.
Affiliation(s)
- Mikhail Kazancev
- All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
- Pavel Merkulov
- All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
- Kirill Tiurin
- All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
- Yakov Demurin
- Pustovoit All-Russia Research Institute of Oilseed Crops, Filatova St. 17, 350038 Krasnodar, Russia
- Alexander Soloviev
- All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
- All-Russia Center for Plant Quarantine, 140150 Ramenski, Russia
- Ilya Kirov
- All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
14
Chen Y, Huang J, Qin H, Zhang K, Fu Y, Li J, Wang R, Chen K, Xiong J, Miao W, Wang G, Zhang L. Chromosome-level genome assembly of Cryptosporidium parvum by long-read sequencing of ten oocysts. Sci Data 2024; 11:1287. [PMID: 39592642 PMCID: PMC11599830 DOI: 10.1038/s41597-024-04150-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Accepted: 11/19/2024] [Indexed: 11/28/2024] Open
Abstract
Cryptosporidium parvum is a zoonotic parasite of the intestine and poses a threat to human and animal health. However, it is difficult to obtain a large number of oocysts for genome sequencing using in vitro culture. To address this challenge, we employed the strategy of whole-genome amplification of 10 oocysts followed by long-read sequencing and obtained a high-quality genome assembly of C. parvum IIdA19G1 subtype isolated from a pre-weaning calf with diarrhea. The assembled genome was 9.13 Mb long and encompassed eight chromosomes with six capped by telomeric sequences at one or both ends. In total, 3,915 protein-coding genes were predicted, exhibiting a high completeness with 98.2% single-copy BUSCO genes. To our current knowledge, this represents the first chromosome-level genome assembly of C. parvum achieved through the combined use of whole-genome amplification of 10 oocysts and long-read sequencing. This achievement not only advances our understanding of the genomic landscape of this zoonotic intestinal parasite, but also provides valuable resources for comparative genomics and evolutionary analyses within the Cryptosporidium clade.
Affiliation(s)
- Yuancai Chen
- College of Veterinary Medicine, Henan Agricultural University, Zhengzhou, 450046, P. R. China
- Jianying Huang
- College of Veterinary Medicine, Henan Agricultural University, Zhengzhou, 450046, P. R. China
- Huikai Qin
- College of Veterinary Medicine, Henan Agricultural University, Zhengzhou, 450046, P. R. China
- Kaihui Zhang
- College of Veterinary Medicine, Henan Agricultural University, Zhengzhou, 450046, P. R. China
- Yin Fu
- College of Veterinary Medicine, Henan Agricultural University, Zhengzhou, 450046, P. R. China
- Junqiang Li
- College of Veterinary Medicine, Henan Agricultural University, Zhengzhou, 450046, P. R. China
- Rongjun Wang
- College of Veterinary Medicine, Henan Agricultural University, Zhengzhou, 450046, P. R. China
- Kai Chen
- Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, 430072, China
- Jie Xiong
- Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, 430072, China
- Key Laboratory of Breeding Biotechnology and Sustainable Aquaculture, Chinese Academy of Sciences, Wuhan, 430072, China
- Wei Miao
- Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, 430072, China
- Key laboratory of Lake and Watershed Science for Water Security, Chinese Academy of Sciences, Nanjing, 210008, China
- Guangying Wang
- Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, 430072, China.
- Longxian Zhang
- College of Veterinary Medicine, Henan Agricultural University, Zhengzhou, 450046, P. R. China.
- International Joint Research Laboratory for Zoonotic Diseases of Henan, Zhengzhou, 450046, P. R. China.
- Key Laboratory of Quality and Safety Control of Poultry Products (Zhengzhou), Ministry of Agriculture and Rural Affairs, Zhengzhou, 450046, China.
15
Karavaeva V, Sousa FL. Navigating the archaeal frontier: insights and projections from bioinformatic pipelines. Front Microbiol 2024; 15:1433224. [PMID: 39380680 PMCID: PMC11459464 DOI: 10.3389/fmicb.2024.1433224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Accepted: 08/28/2024] [Indexed: 10/10/2024] Open
Abstract
Archaea continues to be one of the least investigated domains of life, and in recent years, the advent of metagenomics has led to the discovery of many new lineages at the phylum level. For the majority, only automatic genomic annotations can provide information regarding their metabolic potential and role in the environment. Here, genomic data from 2,978 archaeal genomes were used to perform automatic annotations using bioinformatics tools, alongside synteny analysis. These automatic classifications were done to assess how well these different tools perform on archaeal data. Our study revealed that even with lowered cutoffs, several functional models do not capture the recently discovered archaeal diversity. Moreover, our investigation revealed that a significant portion of archaeal genomes, approximately 42%, remains uncharacterized. In comparison, within 3,235 bacterial genomes, a diverse range of unclassified proteins is obtained, with well-studied organisms like Escherichia coli having a substantially lower proportion of uncharacterized regions, ranging from <5 to 25%, and less studied lineages being comparable to archaea, with 35-40% of regions unclassified. Leveraging this analysis, we were able to identify metabolic protein markers, thereby providing insights into the metabolism of the archaea in our dataset. Our findings underscore a substantial gap between automatic classification tools and the comprehensive mapping of archaeal metabolism. Despite advances in computational approaches, a significant portion of archaeal genomes remains unexplored, highlighting the need for extensive experimental validation in this domain, as well as more refined annotation methods. This study contributes to a better understanding of archaeal metabolism and underscores the importance of further research in elucidating the functional potential of archaeal genomes.
Affiliation(s)
- Val Karavaeva
- Genome Evolution and Ecology Group, Department of Functional and Evolutionary Ecology, University of Vienna, Vienna, Austria
- Vienna Doctoral School of Ecology and Evolution, University of Vienna, Vienna, Austria
- Filipa L. Sousa
- Genome Evolution and Ecology Group, Department of Functional and Evolutionary Ecology, University of Vienna, Vienna, Austria
16
De La Cruz N, Pradhan P, Veettil RT, Conti BA, Oppikofer M, Sabari BR. Disorder-mediated interactions target proteins to specific condensates. Mol Cell 2024; 84:3497-3512.e9. [PMID: 39232584 DOI: 10.1016/j.molcel.2024.08.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 05/03/2024] [Accepted: 08/10/2024] [Indexed: 09/06/2024]
Abstract
Selective compartmentalization of cellular contents is fundamental to the regulation of biochemistry. Although membrane-bound organelles control composition by using a semi-permeable barrier, biomolecular condensates rely on interactions among constituents to determine composition. Condensates are formed by dynamic multivalent interactions, often involving intrinsically disordered regions (IDRs) of proteins, yet whether distinct compositions can arise from these dynamic interactions is not known. Here, by comparative analysis of proteins differentially partitioned by two different condensates, we find that distinct compositions arise through specific IDR-mediated interactions. The IDRs of differentially partitioned proteins are necessary and sufficient for selective partitioning. Distinct sequence features are required for IDRs to partition, and swapping these sequence features changes the specificity of partitioning. Swapping whole IDRs retargets proteins and their biochemical activity to different condensates. Our results demonstrate that IDR-mediated interactions can target proteins to specific condensates, enabling the spatial regulation of biochemistry within the cell.
Affiliation(s)
- Nancy De La Cruz
- Laboratory of Nuclear Organization, Cecil H. and Ida Green Center for Reproductive Biology Sciences, Division of Basic Research, Department of Obstetrics and Gynecology, Department of Molecular Biology, Hamon Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Prashant Pradhan
- Laboratory of Nuclear Organization, Cecil H. and Ida Green Center for Reproductive Biology Sciences, Division of Basic Research, Department of Obstetrics and Gynecology, Department of Molecular Biology, Hamon Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Reshma T Veettil
- Laboratory of Nuclear Organization, Cecil H. and Ida Green Center for Reproductive Biology Sciences, Division of Basic Research, Department of Obstetrics and Gynecology, Department of Molecular Biology, Hamon Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Brooke A Conti
- Pfizer Centers for Therapeutic Innovation, Pfizer Inc., New York, NY 10016, USA
- Mariano Oppikofer
- Pfizer Centers for Therapeutic Innovation, Pfizer Inc., New York, NY 10016, USA
- Benjamin R Sabari
- Laboratory of Nuclear Organization, Cecil H. and Ida Green Center for Reproductive Biology Sciences, Division of Basic Research, Department of Obstetrics and Gynecology, Department of Molecular Biology, Hamon Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA.
17
Heredia-Torrejón M, Montañez R, González-Meneses A, Carcavilla A, Medina MA, Lechuga-Sancho AM. VUS next in rare diseases? Deciphering genetic determinants of biomolecular condensation. Orphanet J Rare Dis 2024; 19:327. [PMID: 39243101 PMCID: PMC11380411 DOI: 10.1186/s13023-024-03307-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Accepted: 08/06/2024] [Indexed: 09/09/2024] Open
Abstract
The diagnostic odysseys for rare disease patients are getting shorter as next-generation sequencing becomes more widespread. However, the complex genetic diversity and factors influencing expressivity continue to challenge accurate diagnosis, leaving more than 50% of genetic variants categorized as variants of uncertain significance. Genomic expression intricately hinges on localized interactions among its products. Conventional variant prioritization, biased towards known disease genes and the structure-function paradigm, overlooks the potential impact of variants that shape the composition, location, size, and properties of biomolecular condensates, genuine membraneless organelles that swiftly sense and respond to environmental changes and modulate expressivity. To address this complexity, we propose to focus on the nexus of genetic variants within the determinants of biomolecular condensates. Scrutinizing variant effects in these membraneless organelles could refine prioritization, enhance diagnostics, and unveil the molecular underpinnings of rare diseases. Integrating comprehensive genome sequencing, transcriptomics, and computational models can unravel variant pathogenicity and disease mechanisms, enabling precision medicine. This paper presents the rationale driving our proposal and describes a protocol to implement this approach. By fusing state-of-the-art knowledge and methodologies into clinical practice, we aim to redefine rare disease diagnosis, leveraging the power of scientific advancement for more informed medical decisions.
Affiliation(s)
- María Heredia-Torrejón
- Inflammation, Nutrition, Metabolism and Oxidative Stress Research Laboratory, Biomedical Research and Innovation Institute of Cadiz (INiBICA), Cadiz, Spain
- Mother and Child Health and Radiology Department. Area of Clinical Genetics, University of Cadiz. Faculty of Medicine, Cadiz, Spain
- Raúl Montañez
- Inflammation, Nutrition, Metabolism and Oxidative Stress Research Laboratory, Biomedical Research and Innovation Institute of Cadiz (INiBICA), Cadiz, Spain.
- Department of Molecular Biology and Biochemistry, University of Malaga, Andalucía Tech, E-29071, Málaga, Spain.
| | - Antonio González-Meneses
- Division of Dysmorphology, Department of Paediatrics, Virgen del Rocio University Hospital, Sevilla, Spain
- Department of Paediatrics, Medical School, University of Sevilla, Sevilla, Spain
| | - Atilano Carcavilla
- Pediatric Endocrinology Department, Hospital Universitario La Paz, 28046, Madrid, Spain
- Multidisciplinary Unit for RASopathies, Hospital Universitario La Paz, 28046, Madrid, Spain
| | - Miguel A Medina
- Department of Molecular Biology and Biochemistry, University of Malaga, Andalucía Tech, E-29071, Málaga, Spain.
- Biomedical Research Institute and nanomedicine platform of Málaga IBIMA-BIONAND, E-29071, Málaga, Spain.
- CIBER de Enfermedades Raras (CIBERER), Instituto de Salud Carlos III, E-28029, Madrid, Spain.
| | - Alfonso M Lechuga-Sancho
- Inflammation, Nutrition, Metabolism and Oxidative Stress Research Laboratory, Biomedical Research and Innovation Institute of Cadiz (INiBICA), Cadiz, Spain
- Division of Endocrinology, Department of Paediatrics, Puerta del Mar University Hospital, Cádiz, Spain
- Area of Paediatrics, Department of Child and Mother Health and Radiology, Medical School, University of Cadiz, Cadiz, Spain
| |
Collapse
|
18
|
Calia G, Cestaro A, Schuler H, Janik K, Donati C, Moser M, Bottini S. Definition of the effector landscape across 13 phytoplasma proteomes with LEAPH and EffectorComb. NAR Genom Bioinform 2024; 6:lqae087. [PMID: 39081684 PMCID: PMC11287381 DOI: 10.1093/nargab/lqae087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 06/24/2024] [Accepted: 07/16/2024] [Indexed: 08/02/2024] Open
Abstract
The 'Candidatus Phytoplasma' genus, a group of fastidious phloem-restricted bacteria, can infect a wide variety of both ornamental and agro-economically important plants. Phytoplasmas secrete effector proteins responsible for the symptoms associated with the disease. Identifying and characterizing these proteins is of prime importance for expanding our knowledge of the molecular bases of the disease. We addressed the challenge of identifying phytoplasma effectors by developing LEAPH, a machine learning ensemble predictor composed of four models. LEAPH was trained on 479 proteins from 53 phytoplasma species, described by 30 features. LEAPH achieved 97.49% accuracy, 95.26% precision and 98.37% recall, ensuring a low false-positive rate and outperforming available state-of-the-art methods. The application of LEAPH to 13 phytoplasma proteomes yields a comprehensive landscape of 2089 putative pathogenicity proteins. We identified three classes according to different secretion models: 'classical', 'classical-like' and 'non-classical'. Importantly, LEAPH identified 15 out of 17 known experimentally validated effectors belonging to the three classes. Furthermore, to help the selection of novel candidates for biological validation, we applied the Self-Organizing Maps algorithm and developed a Shiny app called EffectorComb. LEAPH and the EffectorComb app can be used to boost the characterization of putative effectors at both computational and experimental levels, and can be employed in other phytopathological models.
Affiliation(s)
- Giulia Calia
- Faculty of Agricultural, Environmental and Food Sciences, Free University of Bolzano, 39100 Bolzano, Italy
- Research and Innovation Centre, Fondazione Edmund Mach, 38010 San Michele all’Adige, Italy
- INRAE, Institut Sophia Agrobiotech, Université Côte d’Azur, CNRS, 06903 Sophia-Antipolis, France
- Alessandro Cestaro
- Research and Innovation Centre, Fondazione Edmund Mach, 38010 San Michele all’Adige, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies (IBIOM), National Research Council (CNR), 70126 Bari, Italy
- Hannes Schuler
- Faculty of Agricultural, Environmental and Food Sciences, Free University of Bolzano, 39100 Bolzano, Italy
- Competence Centre for Plant Health, Free University of Bolzano, 39100 Bolzano, Italy
- Katrin Janik
- Institute for Plant Health, Molecular Biology and Microbiology, Laimburg Research Centre, 47141 Pfatten-Vadena, Italy
- Claudio Donati
- Research and Innovation Centre, Fondazione Edmund Mach, 38010 San Michele all’Adige, Italy
- Mirko Moser
- Research and Innovation Centre, Fondazione Edmund Mach, 38010 San Michele all’Adige, Italy
- Silvia Bottini
- INRAE, Institut Sophia Agrobiotech, Université Côte d’Azur, CNRS, 06903 Sophia-Antipolis, France
19
Wassmer E, Koppány G, Hermes M, Diederichs S, Caudron-Herger M. Refining the pool of RNA-binding domains advances the classification and prediction of RNA-binding proteins. Nucleic Acids Res 2024; 52:7504-7522. [PMID: 38917322 PMCID: PMC11260472 DOI: 10.1093/nar/gkae536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Revised: 05/31/2024] [Accepted: 06/13/2024] [Indexed: 06/27/2024] Open
Abstract
From transcription to decay, RNA-binding proteins (RBPs) influence RNA metabolism. Using the RBP2GO database that combines proteome-wide RBP screens from 13 species, we investigated the RNA-binding features of 176 896 proteins. By compiling published lists of RNA-binding domains (RBDs) and RNA-related protein family (Rfam) IDs with lists from the InterPro database, we analyzed the distribution of the RBDs and Rfam IDs in RBPs and non-RBPs to select RBDs and Rfam IDs that were enriched in RBPs. We also explored proteins for their content of intrinsically disordered regions (IDRs) and low complexity regions (LCRs). We found a strong positive correlation between IDRs and RBDs and a co-occurrence of specific LCRs. Our bioinformatic analysis indicated that RBDs/Rfam IDs were strong indicators of the RNA-binding potential of proteins and helped predict new RBP candidates, especially in less investigated species. By further analyzing RBPs without RBDs, we predicted new RBDs that were validated by RNA-bound peptides. Finally, we created the RBP2GO composite score by combining the RBP2GO score with new quality factors linked to RBDs and Rfam IDs. Based on the RBP2GO composite score, we compiled a list of 2018 high-confidence human RBPs. The knowledge collected here was integrated into the RBP2GO database at https://RBP2GO-2-Beta.dkfz.de.
Affiliation(s)
- Elsa Wassmer
- Research Group “RNA-Protein Complexes & Cell Proliferation”, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Gergely Koppány
- Research Group “RNA-Protein Complexes & Cell Proliferation”, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Malte Hermes
- Research Group “RNA-Protein Complexes & Cell Proliferation”, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Sven Diederichs
- Division of Cancer Research, Department of Thoracic Surgery, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, and German Cancer Consortium (DKTK), partner site Freiburg, a partnership between DKFZ and University Medical Center Freiburg, 79106 Freiburg, Germany
- Maïwen Caudron-Herger
- Research Group “RNA-Protein Complexes & Cell Proliferation”, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
20
Wan CY, Davis J, Chauhan M, Gleeson J, Prawer YJ, De Paoli-Iseppi R, Wells C, Choi J, Clark M. IsoVis - a webserver for visualization and annotation of alternative RNA isoforms. Nucleic Acids Res 2024; 52:W341-W347. [PMID: 38709877 PMCID: PMC11223830 DOI: 10.1093/nar/gkae343] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Revised: 03/28/2024] [Accepted: 04/18/2024] [Indexed: 05/08/2024] Open
Abstract
Genes commonly express multiple RNA products (RNA isoforms), which differ in exonic content and can have different functions. Making sense of the plethora of known and novel RNA isoforms being identified by transcriptomic approaches requires a user-friendly way to visualize gene isoforms and how they differ in exonic content, expression levels and potential functions. Here we introduce IsoVis, a freely available webserver that accepts user-supplied transcriptomic data and visualizes the expressed isoforms in a clear, intuitive manner. IsoVis contains numerous features, including the ability to visualize all RNA isoforms of a gene and their expression levels; the annotation of known isoforms from external databases; mapping of protein domains and features to exons, allowing changes to protein sequence and function between isoforms to be established; and extensive species compatibility. Datasets visualised on IsoVis remain private to the user, allowing analysis of sensitive data. IsoVis visualisations can be downloaded to create publication-ready figures. The IsoVis webserver enables researchers to perform isoform analyses without requiring programming skills, is free to use, and available at https://isomix.org/isovis/.
Affiliation(s)
- Ching Yin Wan
- Department of Anatomy and Physiology, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Jack Davis
- Department of Anatomy and Physiology, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Manveer Chauhan
- Department of Anatomy and Physiology, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Josie Gleeson
- Department of Anatomy and Physiology, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Yair D J Prawer
- Department of Anatomy and Physiology, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Ricardo De Paoli-Iseppi
- Department of Anatomy and Physiology, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Christine A Wells
- Department of Anatomy and Physiology, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Jarny Choi
- Department of Anatomy and Physiology, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Michael B Clark
- Department of Anatomy and Physiology, The University of Melbourne, Parkville, Victoria, 3010, Australia
21
Jahn LR, Marquet C, Heinzinger M, Rost B. Protein embeddings predict binding residues in disordered regions. Sci Rep 2024; 14:13566. [PMID: 38866950 PMCID: PMC11169622 DOI: 10.1038/s41598-024-64211-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Accepted: 06/06/2024] [Indexed: 06/14/2024] Open
Abstract
The identification of protein binding residues helps to understand their biological processes as protein function is often defined through ligand binding, such as to other proteins, small molecules, ions, or nucleotides. Methods predicting binding residues often err for intrinsically disordered proteins or regions (IDPs/IDPRs), often also referred to as molecular recognition features (MoRFs). Here, we present a novel machine learning (ML) model trained to specifically predict binding regions in IDPRs. The proposed model, IDBindT5, leverages embeddings from the protein language model (pLM) ProtT5 to reach a balanced accuracy of 57.2 ± 3.6% (95% confidence interval). Assessed on the same data set, this did not differ at the 95% CI from the state-of-the-art (SOTA) methods ANCHOR2 and DeepDISOBind, which rely on expert-crafted features and evolutionary information from multiple sequence alignments (MSAs). Assessed on other data, methods such as SPOT-MoRF reached higher MCCs. IDBindT5's SOTA predictions are much faster than those of other methods, easily enabling full-proteome analyses. Our findings emphasize the potential of pLMs as a promising approach for exploring and predicting features of disordered proteins. The model and a comprehensive manual are publicly available at https://github.com/jahnl/binding_in_disorder.
Affiliation(s)
- Laura R Jahn
- School of Computation, Information, and Technology (CIT), Department of Informatics, Bioinformatics and Computational Biology, TUM (Technical University of Munich), 85748, Garching/Munich, Germany
- Céline Marquet
- School of Computation, Information, and Technology (CIT), Department of Informatics, Bioinformatics and Computational Biology, TUM (Technical University of Munich), 85748, Garching/Munich, Germany
- Michael Heinzinger
- School of Computation, Information, and Technology (CIT), Department of Informatics, Bioinformatics and Computational Biology, TUM (Technical University of Munich), 85748, Garching/Munich, Germany
- Burkhard Rost
- School of Computation, Information, and Technology (CIT), Department of Informatics, Bioinformatics and Computational Biology, TUM (Technical University of Munich), 85748, Garching/Munich, Germany
- Institute for Advanced Study (TUM-IAS), Lichtenbergstr. 2a, 85748, Garching/Munich, Germany
- TUM School of Life Sciences Weihenstephan (TUM-WZW), Alte Akademie 8, Freising, Germany
22
Xu S, Onoda A. Accurate and Fast Prediction of Intrinsically Disordered Protein by Multiple Protein Language Models and Ensemble Learning. J Chem Inf Model 2024; 64:2901-2911. [PMID: 37883249 DOI: 10.1021/acs.jcim.3c01202] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 10/28/2023]
Abstract
Intrinsically disordered proteins (IDPs) play a vital role in various biological processes and have attracted increasing attention in the past few decades. Predicting IDPs from the primary structures of proteins offers a rapid and facile means of protein analysis without necessitating crystal structures. In particular, machine learning methods have demonstrated their potential in this field. Recently, protein language models (PLMs) have emerged as a promising approach to extracting essential information from protein sequences and have been employed in protein modeling for their precision and efficiency. In this article, we developed a novel IDP prediction method named IDP-ELM to predict the intrinsically disordered regions (IDRs) as well as their functions, including disordered flexible linkers and disordered protein binding. This method utilizes high-dimensional representations extracted from several state-of-the-art PLMs and predicts IDRs by ensemble learning based on bidirectional recurrent neural networks. The performance of the method was evaluated on two independent test data sets from CAID (critical assessment of protein intrinsic disorder prediction) and CAID2, indicating notable improvements in terms of area under the receiver operating characteristic curve (AUC), Matthews correlation coefficient (MCC), and F1 score. Moreover, IDP-ELM requires solely protein sequences as inputs and does not entail a time-consuming process of protein profile generation, which is a prerequisite for most existing state-of-the-art methods, enabling an accurate, fast, and convenient tool for proteome-level analysis. The corresponding reproducible source code and model weights are available at https://github.com/xu-shi-jie/idp-elm.
Affiliation(s)
- Shijie Xu
- Graduate School of Environmental Science, Hokkaido University, Sapporo 060-0810, Japan
- Akira Onoda
- Graduate School of Environmental Science, Hokkaido University, Sapporo 060-0810, Japan
- Faculty of Environmental Earth Science, Hokkaido University, Sapporo 060-0810, Japan
23
Nishio S, Emori C, Wiseman B, Fahrenkamp D, Dioguardi E, Zamora-Caballero S, Bokhove M, Han L, Stsiapanava A, Algarra B, Lu Y, Kodani M, Bainbridge RE, Komondor KM, Carlson AE, Landreh M, de Sanctis D, Yasumasu S, Ikawa M, Jovine L. ZP2 cleavage blocks polyspermy by modulating the architecture of the egg coat. Cell 2024; 187:1440-1459.e24. [PMID: 38490181 PMCID: PMC10976854 DOI: 10.1016/j.cell.2024.02.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 11/07/2023] [Accepted: 02/09/2024] [Indexed: 03/17/2024]
Abstract
Following the fertilization of an egg by a single sperm, the egg coat or zona pellucida (ZP) hardens and polyspermy is irreversibly blocked. These events are associated with the cleavage of the N-terminal region (NTR) of glycoprotein ZP2, a major subunit of ZP filaments. ZP2 processing is thought to inactivate sperm binding to the ZP, but its molecular consequences and connection with ZP hardening are unknown. Biochemical and structural studies show that cleavage of ZP2 triggers its oligomerization. Moreover, the structure of a native vertebrate egg coat filament, combined with AlphaFold predictions of human ZP polymers, reveals that two protofilaments consisting of type I (ZP3) and type II (ZP1/ZP2/ZP4) components interlock into a left-handed double helix from which the NTRs of type II subunits protrude. Together, these data suggest that oligomerization of cleaved ZP2 NTRs extensively cross-links ZP filaments, rigidifying the egg coat and making it physically impenetrable to sperm.
Affiliation(s)
- Shunsuke Nishio
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
- Chihiro Emori
- Department of Experimental Genome Research, Research Institute for Microbial Diseases, Osaka University, Suita, Osaka, Japan; Immunology Frontier Research Center, Osaka University, Suita, Osaka, Japan
- Benjamin Wiseman
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
- Dirk Fahrenkamp
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
- Elisa Dioguardi
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
- Marcel Bokhove
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
- Ling Han
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
- Alena Stsiapanava
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
- Blanca Algarra
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
- Yonggang Lu
- Department of Experimental Genome Research, Research Institute for Microbial Diseases, Osaka University, Suita, Osaka, Japan; Immunology Frontier Research Center, Osaka University, Suita, Osaka, Japan
- Mayo Kodani
- Department of Experimental Genome Research, Research Institute for Microbial Diseases, Osaka University, Suita, Osaka, Japan; Graduate School of Pharmaceutical Sciences, Osaka University, Suita, Osaka, Japan
- Rachel E Bainbridge
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
- Kayla M Komondor
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
- Anne E Carlson
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
- Michael Landreh
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden; Department of Cell and Molecular Biology, Uppsala University, 75124 Uppsala, Sweden
- Shigeki Yasumasu
- Department of Materials and Life Sciences, Faculty of Science and Technology, Sophia University, Tokyo, Japan
- Masahito Ikawa
- Department of Experimental Genome Research, Research Institute for Microbial Diseases, Osaka University, Suita, Osaka, Japan; Immunology Frontier Research Center, Osaka University, Suita, Osaka, Japan; Graduate School of Pharmaceutical Sciences, Osaka University, Suita, Osaka, Japan; Center for Infectious Disease Education and Research (CiDER), Osaka University, Suita, Osaka, Japan
- Luca Jovine
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
24
Botkin JR, Farmer AD, Young ND, Curtin SJ. Genome assembly of Medicago truncatula accession SA27063 provides insight into spring black stem and leaf spot disease resistance. BMC Genomics 2024; 25:204. [PMID: 38395768 PMCID: PMC10885650 DOI: 10.1186/s12864-024-10112-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Accepted: 02/10/2024] [Indexed: 02/25/2024] Open
Abstract
Medicago truncatula, a model legume and alfalfa relative, has served as an essential resource for advancing our understanding of legume physiology, functional genetics, and crop improvement traits. The necrotrophic fungus Ascochyta medicaginicola is the causal agent of spring black stem (SBS) and leaf spot, a devastating foliar disease of alfalfa affecting stand survival, yield, and forage quality. Host resistance to SBS disease is poorly understood, and control methods rely on cultural practices. Resistance has been observed in M. truncatula accession SA27063 (HM078), with two recessively inherited quantitative-trait loci (QTL), rnpm1 and rnpm2, previously reported. To shed light on host resistance, we carried out a de novo genome assembly of HM078. The genome, referred to as MtHM078 v1.0, comprises 23 contigs totaling 481.19 Mbp. Notably, this assembly contains a substantial amount of novel centromere-related repeat sequences due to deep long-read sequencing. Genome annotation resulted in 98.4% of BUSCO Fabales proteins being complete. The assembly enabled sequence-level analysis of rnpm1 and rnpm2 for gene content, synteny, and structural variation between SBS-resistant accession SA27063 (HM078) and SBS-susceptible accession A17 (HM101). Fourteen candidate genes were identified, and some have been implicated in resistance to necrotrophic fungi. Especially interesting candidates include loss-of-function events in HM078 because they fit the inverse gene-for-gene model, where resistance is recessively inherited. In rnpm1, these include a loss-of-function in a disease resistance gene due to a premature stop codon, and a 10.85 kbp retrotransposon-like insertion disrupting a ubiquitin conjugating E2. In rnpm2, we identified a frameshift mutation causing a loss-of-function in a glycosidase, as well as a missense and frameshift mutation altering an F-box family protein. This study generated a high-quality genome of HM078 and has identified promising candidates that, once validated, could be further studied in alfalfa to enhance disease resistance.
Affiliation(s)
- Jacob R Botkin
- Department of Plant Pathology, University of Minnesota, St. Paul, MN, 55108, USA
- Andrew D Farmer
- National Center for Genome Resources, Santa Fe, NM, 87505, USA
- Nevin D Young
- Department of Plant Pathology, University of Minnesota, St. Paul, MN, 55108, USA
- Shaun J Curtin
- United States Department of Agriculture, Plant Science Research Unit, St Paul, MN, 55108, USA
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN, 55108, USA
- Center for Plant Precision Genomics, University of Minnesota, St. Paul, MN, 55108, USA
- Center for Genome Engineering, University of Minnesota, St. Paul, MN, 55108, USA
25
Minguet-Lobato M, Cervantes FV, Míguez N, Plou FJ, Fernández-Lobato M. Chitinous material bioconversion by three new chitinases from the yeast Mestchnikowia pulcherrima. Microb Cell Fact 2024; 23:31. [PMID: 38245740 PMCID: PMC10799394 DOI: 10.1186/s12934-024-02300-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Accepted: 01/09/2024] [Indexed: 01/22/2024] Open
Abstract
BACKGROUND: Chitinases are widely distributed enzymes that perform the biotransformation of chitin, one of the most abundant polysaccharides in the biosphere, into useful value-added chitooligosaccharides (COS) with a wide variety of biotechnological applications in food, health, and agricultural fields. One of the most important groups of enzymes involved in the degradation of chitin comprises the glycoside hydrolase family 18 (GH18), which harbours endo- and exo-enzymes that act synergistically to depolymerize chitin. The secretion of a chitinase activity from the ubiquitous yeast Mestchnikowia pulcherrima and its involvement in the post-harvest biological control of fungal pathogens was previously reported. RESULTS: Three new chitinases from M. pulcherrima, MpChit35, MpChit38 and MpChit41, were molecularly characterized and extracellularly expressed in Pichia pastoris to about 91, 90 and 71 mU ml-1, respectively. The three enzymes hydrolysed colloidal chitin with optimal activity at 45 °C and pH 4.0-4.5, doubled their activities in the presence of 1 mM Mn2+ and hydrolysed different types of commercial chitosan. The partial separation and characterization of the complex COS mixtures produced from the hydrolysis of chitin and chitosan were achieved by a new anionic chromatography HPAEC-PAD method and mass spectrometry assays. An overview of the predicted structures of these proteins and their catalytic modes of action is also presented. Despite their high sequence and structural homology, MpChit35 acted as an exo-chitinase producing di-acetyl-chitobiose from chitin, while MpChit38 and MpChit41 both acted as endo-chitinases producing tri-acetyl-chitotriose as the main final product. CONCLUSIONS: Three new chitinases from the yeast M. pulcherrima were molecularly characterized and their enzymatic and structural characteristics analysed. These enzymes transformed chitinous materials to fully and partially acetylated COS through different modes of splitting, which makes them interesting biocatalysts for deeper structure-function studies on the challenging enzymatic conversion of chitin.
Affiliation(s)
- Marina Minguet-Lobato
- Department of Molecular Biology, Centre for Molecular Biology Severo Ochoa (CBMSO, CSIC-UAM), University Autonomous from Madrid, C/ Nicolás Cabrera, 1. Cantoblanco, Madrid, 28049, Spain
- Institute of Catalysis and Petrochemistry, CSIC. C/ Marie Curie, 2. Cantoblanco, Madrid, 28049, Spain
- Fadia V Cervantes
- Institute of Catalysis and Petrochemistry, CSIC. C/ Marie Curie, 2. Cantoblanco, Madrid, 28049, Spain
- Noa Míguez
- Institute of Catalysis and Petrochemistry, CSIC. C/ Marie Curie, 2. Cantoblanco, Madrid, 28049, Spain
- Francisco J Plou
- Institute of Catalysis and Petrochemistry, CSIC. C/ Marie Curie, 2. Cantoblanco, Madrid, 28049, Spain
- María Fernández-Lobato
- Department of Molecular Biology, Centre for Molecular Biology Severo Ochoa (CBMSO, CSIC-UAM), University Autonomous from Madrid, C/ Nicolás Cabrera, 1. Cantoblanco, Madrid, 28049, Spain
26
Baltoumas FA, Karatzas E, Liu S, Ovchinnikov S, Sofianatos Y, Chen IM, Kyrpides N, Pavlopoulos G. NMPFamsDB: a database of novel protein families from microbial metagenomes and metatranscriptomes. Nucleic Acids Res 2024; 52:D502-D512. [PMID: 37811892 PMCID: PMC10767849 DOI: 10.1093/nar/gkad800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 09/19/2023] [Indexed: 10/10/2023] Open
Abstract
The Novel Metagenome Protein Families Database (NMPFamsDB) is a database of metagenome- and metatranscriptome-derived protein families, whose members have no hits to proteins of reference genomes or Pfam domains. Each protein family is accompanied by multiple sequence alignments, Hidden Markov Models, taxonomic information, ecosystem and geolocation metadata, sequence and structure predictions, as well as 3D structure models predicted with AlphaFold2. In its current version, NMPFamsDB hosts over 100 000 protein families, each with at least 100 members. The reported protein families significantly expand (more than double) the number of known protein sequence clusters from reference genomes and reveal new insights into their habitat distribution, origins, functions and taxonomy. We expect NMPFamsDB to be a valuable resource for microbial proteome-wide analyses and for further discovery and characterization of novel functions. NMPFamsDB is publicly available at http://www.nmpfamsdb.org/ or https://bib.fleming.gr/NMPFamsDB.
Affiliation(s)
- Fotis A Baltoumas
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, 16672, Greece
- Evangelos Karatzas
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, 16672, Greece
- Sirui Liu
- John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA 02138, USA
- Sergey Ovchinnikov
- John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA 02138, USA
- Yorgos Sofianatos
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, 16672, Greece
- I-Min Chen
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720-8150, USA
- Nikos C Kyrpides
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720-8150, USA
- Georgios A Pavlopoulos
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, 16672, Greece
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720-8150, USA
- Center for New Biotechnologies and Precision Medicine, School of Medicine, National and Kapodistrian University of Athens, 75 Mikras Asias Street, Athens 11527, Greece
27
Aspromonte MC, Nugnes MV, Quaglia F, Bouharoua A, Tosatto SCE, Piovesan D. DisProt in 2024: improving function annotation of intrinsically disordered proteins. Nucleic Acids Res 2024; 52:D434-D441. [PMID: 37904585 PMCID: PMC10767923 DOI: 10.1093/nar/gkad928] [Citation(s) in RCA: 44] [Impact Index Per Article: 44.0] [Received: 09/08/2023] [Revised: 10/05/2023] [Accepted: 10/10/2023] [Indexed: 11/01/2023] Open
Abstract
DisProt (URL: https://disprot.org) is the gold standard database for intrinsically disordered proteins and regions, providing valuable information about their functions. The latest version of DisProt brings significant advancements, including a broader representation of functions and an enhanced curation process. These improvements aim to increase both the quality of annotations and their coverage at the sequence level. Higher coverage has been achieved by adopting additional evidence codes. Quality of annotations has been improved by systematically applying Minimum Information About Disorder Experiments (MIADE) principles and reporting all the details of the experimental setup that could potentially influence the structural state of a protein. The DisProt database now includes new thematic datasets and has expanded the adoption of Gene Ontology terms, resulting in an extensive functional repertoire which is automatically propagated to UniProtKB. Finally, we show that DisProt's curated annotations strongly correlate with disorder predictions inferred from AlphaFold2 pLDDT (predicted Local Distance Difference Test) confidence scores. This comparison highlights the utility of DisProt in explaining apparent uncertainty of certain well-defined predicted structures, which often correspond to folding-upon-binding fragments. Overall, DisProt serves as a comprehensive resource, combining experimental evidence of disorder information to enhance our understanding of intrinsically disordered proteins and their functional implications.
Affiliation(s)
- Federica Quaglia
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
  - Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), Bari, Italy
- Adel Bouharoua
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
- Damiano Piovesan
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
28
Ghafouri H, Lazar T, Del Conte A, Tenorio Ku LG, Tompa P, Tosatto SCE, Monzon AM. PED in 2024: improving the community deposition of structural ensembles for intrinsically disordered proteins. Nucleic Acids Res 2024; 52:D536-D544. [PMID: 37904608 PMCID: PMC10767937 DOI: 10.1093/nar/gkad947] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 09/15/2023] [Revised: 10/10/2023] [Accepted: 10/13/2023] [Indexed: 11/01/2023] Open
Abstract
The Protein Ensemble Database (PED) (URL: https://proteinensemble.org) is the primary resource for depositing structural ensembles of intrinsically disordered proteins. This updated version of PED reflects advancements in the field, denoting a continual expansion with a total of 461 entries and 538 ensembles, including those generated without explicit experimental data through novel machine learning (ML) techniques. With this significant increment in the number of ensembles, a few yet-unprecedented new entries entered the database, including those also determined or refined by electron paramagnetic resonance or circular dichroism data. In addition, PED was enriched with several new features, including a novel deposition service, improved user interface, new database cross-referencing options and integration with the 3D-Beacons network, all representing efforts to improve the FAIRness of the database. Foreseeably, PED will keep growing in size and expanding with new types of ensembles generated by accurate and fast ML-based generative models and coarse-grained simulations. Therefore, among future efforts, priority will be given to further develop the database to be compatible with ensembles modeled at a coarse-grained level.
Affiliation(s)
- Tamas Lazar
  - VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnologie (VIB), Brussels, Belgium
  - Structural Biology Brussels, Department of Bioengineering, Vrije Universiteit Brussel (VUB), Brussels, Belgium
- Alessio Del Conte
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
- Peter Tompa
  - VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnologie (VIB), Brussels, Belgium
  - Structural Biology Brussels, Department of Bioengineering, Vrije Universiteit Brussel (VUB), Brussels, Belgium
  - Institute of Enzymology, Research Centre for Natural Sciences (RCNS), Budapest, Hungary
29
McConnell BS, Parker MW. Protein intrinsically disordered regions have a non-random, modular architecture. Bioinformatics 2023; 39:btad732. [PMID: 38039154 PMCID: PMC10719218 DOI: 10.1093/bioinformatics/btad732] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/29/2023] [Revised: 11/03/2023] [Accepted: 11/30/2023] [Indexed: 12/03/2023] Open
Abstract
MOTIVATION: Protein sequences can be broadly categorized into two classes: those which adopt stable secondary structure and fold into a domain (i.e. globular proteins), and those that do not. The sequences belonging to this latter class are conformationally heterogeneous and are described as being intrinsically disordered. Decades of investigation into the structure and function of globular proteins have resulted in a suite of computational tools that enable their sub-classification by domain type, an approach that has revolutionized how we understand and predict protein functionality. Conversely, it is unknown if sequences of disordered protein regions are subject to broadly generalizable organizational principles that would enable their sub-classification. RESULTS: Here, we report the development of a statistical approach that quantifies linear variance in amino acid composition across a sequence. With multiple examples, we provide evidence that intrinsically disordered regions are organized into statistically non-random modules of unique compositional bias. Modularity is observed for both low and high-complexity sequences and, in some cases, we find that modules are organized in repetitive patterns. These data demonstrate that disordered sequences are non-randomly organized into modular architectures and motivate future experiments to comprehensively classify module types and to determine the degree to which modules constitute functionally separable units analogous to the domains of globular proteins. AVAILABILITY AND IMPLEMENTATION: The source code, documentation, and data to reproduce all figures are freely available at https://github.com/MWPlabUTSW/Chi-Score-Analysis.git. The analysis is also available as a Google Colab Notebook (https://colab.research.google.com/github/MWPlabUTSW/Chi-Score-Analysis/blob/main/ChiScore_Analysis.ipynb).
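The idea of quantifying how amino acid composition varies along a sequence can be illustrated with a small sketch. This is not the authors' Chi-Score-Analysis code: it simply assumes a Pearson chi-square statistic computed on the residue counts of two adjacent windows, scanned across candidate boundaries, and all names and the window size are hypothetical choices for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Counts of the 20 standard residues in a sequence fragment."""
    counts = Counter(seq.upper())
    return [counts.get(aa, 0) for aa in AMINO_ACIDS]


def chi_square(seq_a, seq_b):
    """Pearson chi-square statistic for the 2 x 20 table of residue counts."""
    a, b = composition(seq_a), composition(seq_b)
    total_a, total_b = sum(a), sum(b)
    if total_a == 0 or total_b == 0:
        return 0.0  # nothing to compare
    grand = total_a + total_b
    stat = 0.0
    for count_a, count_b in zip(a, b):
        column = count_a + count_b
        if column == 0:
            continue  # residue absent from both fragments
        expected_a = total_a * column / grand
        expected_b = total_b * column / grand
        stat += (count_a - expected_a) ** 2 / expected_a
        stat += (count_b - expected_b) ** 2 / expected_b
    return stat


def boundary_scan(seq, window=5):
    """Score each position by comparing the windows to its left and right."""
    return [
        (i, chi_square(seq[i - window:i], seq[i:i + window]))
        for i in range(window, len(seq) - window + 1)
    ]


# Toy sequence with two compositionally distinct halves.
toy = "A" * 10 + "G" * 10
scores = boundary_scan(toy, window=5)
best_position = max(scores, key=lambda item: item[1])[0]
```

On the toy sequence the highest score falls at the junction between the alanine and glycine blocks, which is the kind of compositional boundary a module-finding analysis would look for. Assessing statistical significance is left out of this sketch.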
Affiliation(s)
- Brendan S McConnell
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX 75235, United States
- Matthew W Parker
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX 75235, United States
30
Gonzalez JP, Frandsen KEH, Kesten C. The role of intrinsic disorder in binding of plant microtubule-associated proteins to the cytoskeleton. Cytoskeleton (Hoboken) 2023; 80:404-436. [PMID: 37578201 DOI: 10.1002/cm.21773] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/02/2023] [Revised: 07/28/2023] [Accepted: 07/30/2023] [Indexed: 08/15/2023]
Abstract
Microtubules (MTs) represent one of the main components of the eukaryotic cytoskeleton and support numerous critical cellular functions. MTs are in principle tube-like structures that can grow and shrink in a highly dynamic manner, a process largely controlled by microtubule-associated proteins (MAPs). Plant MAPs are a phylogenetically diverse group of proteins that nonetheless share many common biophysical characteristics and often contain large stretches of intrinsic protein disorder. These intrinsically disordered regions are determinants of many MAP-MT interactions, in which structural flexibility enables low-affinity protein-protein interactions that allow fine-tuned regulation of MT cytoskeleton dynamics. Notably, intrinsic disorder is one of the major obstacles in functional and structural studies of MAPs and represents the principal present-day challenge to deciphering how MAPs interact with MTs. Here, we review plant MAPs from an intrinsic protein disorder perspective, by providing a complete and up-to-date summary of all currently known members, and address the current and future challenges in functional and structural characterization of MAPs.
Affiliation(s)
- Jordy Perez Gonzalez
  - Department for Plant and Environmental Sciences, University of Copenhagen, Frederiksberg C, Denmark
- Kristian E H Frandsen
  - Department for Plant and Environmental Sciences, University of Copenhagen, Frederiksberg C, Denmark
- Christopher Kesten
  - Department for Plant and Environmental Sciences, University of Copenhagen, Frederiksberg C, Denmark
31
Patil A, Strom AR, Paulo JA, Collings CK, Ruff KM, Shinn MK, Sankar A, Cervantes KS, Wauer T, St Laurent JD, Xu G, Becker LA, Gygi SP, Pappu RV, Brangwynne CP, Kadoch C. A disordered region controls cBAF activity via condensation and partner recruitment. Cell 2023; 186:4936-4955.e26. [PMID: 37788668 PMCID: PMC10792396 DOI: 10.1016/j.cell.2023.08.032] [Citation(s) in RCA: 73] [Impact Index Per Article: 36.5] [Received: 10/06/2022] [Revised: 07/16/2023] [Accepted: 08/24/2023] [Indexed: 10/05/2023]
Abstract
Intrinsically disordered regions (IDRs) represent a large percentage of overall nuclear protein content. The prevailing dogma is that IDRs engage in non-specific interactions because they are poorly constrained by evolutionary selection. Here, we demonstrate that condensate formation and heterotypic interactions are distinct and separable features of an IDR within the ARID1A/B subunits of the mSWI/SNF chromatin remodeler, cBAF, and establish distinct "sequence grammars" underlying each contribution. Condensation is driven by uniformly distributed tyrosine residues, and partner interactions are mediated by non-random blocks rich in alanine, glycine, and glutamine residues. These features concentrate a specific cBAF protein-protein interaction network and are essential for chromatin localization and activity. Importantly, human disease-associated perturbations in ARID1B IDR sequence grammars disrupt cBAF function in cells. Together, these data identify IDR contributions to chromatin remodeling and explain how phase separation provides a mechanism through which both genomic localization and functional partner recruitment are achieved.
Affiliation(s)
- Ajinkya Patil
  - Department of Pediatric Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Program in Virology, Harvard Medical School, Boston, MA 02115, USA
- Amy R Strom
  - Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544, USA
- Joao A Paulo
  - Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA
- Clayton K Collings
  - Department of Pediatric Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Kiersten M Ruff
  - Department of Biomedical Engineering and Center for Biomolecular Condensates, Washington University in St. Louis, St. Louis, MO 63130, USA
- Min Kyung Shinn
  - Department of Biomedical Engineering and Center for Biomolecular Condensates, Washington University in St. Louis, St. Louis, MO 63130, USA
- Akshay Sankar
  - Department of Pediatric Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Kasey S Cervantes
  - Department of Pediatric Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Tobias Wauer
  - Department of Pediatric Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02115, USA
- Jessica D St Laurent
  - Department of Pediatric Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02115, USA; Department of Obstetrics and Gynecology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Grace Xu
  - Department of Pediatric Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Lindsay A Becker
  - Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544, USA
- Steven P Gygi
  - Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA
- Rohit V Pappu
  - Department of Biomedical Engineering and Center for Biomolecular Condensates, Washington University in St. Louis, St. Louis, MO 63130, USA
- Clifford P Brangwynne
  - Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544, USA; Howard Hughes Medical Institute, Chevy Chase, MD 21044, USA; Omenn-Darling Bioengineering Institute, Princeton University, Princeton, NJ 08544, USA
- Cigall Kadoch
  - Department of Pediatric Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Howard Hughes Medical Institute, Chevy Chase, MD 21044, USA
32
Toledo PL, Vazquez DS, Gianotti AR, Abate MB, Wegbrod C, Torkko JM, Solimena M, Ermácora MR. Condensation of the β-cell secretory granule luminal cargoes pro/insulin and ICA512 RESP18 homology domain. Protein Sci 2023; 32:e4649. [PMID: 37159024 PMCID: PMC10201709 DOI: 10.1002/pro.4649] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/02/2023] [Revised: 04/28/2023] [Accepted: 04/29/2023] [Indexed: 05/10/2023]
Abstract
ICA512/PTPRN is a receptor tyrosine-like phosphatase implicated in the biogenesis and turnover of the insulin secretory granules (SGs) in pancreatic islet beta cells. Previously we found biophysical evidence that its luminal RESP18 homology domain (RESP18HD) forms a biomolecular condensate and interacts with insulin in vitro at close-to-neutral pH, that is, in conditions resembling those present in the early secretory pathway. Here we provide further evidence for the relevance of these findings by showing that at pH 6.8 RESP18HD interacts also with proinsulin, the physiological insulin precursor found in the early secretory pathway and the major luminal cargo of β-cell nascent SGs. Our light scattering analyses indicate that RESP18HD and proinsulin, but also insulin, populate nanocondensates ranging in size from 15 to 300 nm and 10e2 to 10e6 molecules. Co-condensation of RESP18HD with proinsulin/insulin transforms the initial nanocondensates into microcondensates (size >1 μm). The intrinsic tendency of proinsulin to self-condensate implies that, in the ER, a chaperoning mechanism must arrest its spontaneous intermolecular condensation to allow for proper intramolecular folding. These data further suggest that proinsulin is an early driver of insulin SG biogenesis, in a process in which its co-condensation with RESP18HD participates in their phase separation from other secretory proteins in transit through the same compartments but destined to other routes. Through the cytosolic tail of ICA512, proinsulin co-condensation with RESP18HD may further orchestrate the recruitment of cytosolic factors involved in membrane budding and fission of transport vesicles and nascent SGs.
Affiliation(s)
- Pamela L. Toledo
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Provincia de Buenos Aires, Argentina
  - Grupo de Biología Estructural y Biotecnología, IMBICE, CONICET, Universidad Nacional de Quilmes, Provincia de Buenos Aires, Argentina
- Diego S. Vazquez
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Provincia de Buenos Aires, Argentina
  - Grupo de Biología Estructural y Biotecnología, IMBICE, CONICET, Universidad Nacional de Quilmes, Provincia de Buenos Aires, Argentina
- Alejo R. Gianotti
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Provincia de Buenos Aires, Argentina
  - Grupo de Biología Estructural y Biotecnología, IMBICE, CONICET, Universidad Nacional de Quilmes, Provincia de Buenos Aires, Argentina
- Milagros B. Abate
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Provincia de Buenos Aires, Argentina
  - Grupo de Biología Estructural y Biotecnología, IMBICE, CONICET, Universidad Nacional de Quilmes, Provincia de Buenos Aires, Argentina
- Carolin Wegbrod
  - Department of Molecular Diabetology, University Hospital and Faculty of Medicine, TU Dresden, Dresden, Germany
  - Paul Langerhans Institute Dresden of Helmholtz Munich at the University Hospital and Faculty of Medicine, TU Dresden, Dresden, Germany
  - German Center for Diabetes Research (DZD e.V.), Neuherberg, Germany
- Juha M. Torkko
  - Department of Molecular Diabetology, University Hospital and Faculty of Medicine, TU Dresden, Dresden, Germany
  - Paul Langerhans Institute Dresden of Helmholtz Munich at the University Hospital and Faculty of Medicine, TU Dresden, Dresden, Germany
  - German Center for Diabetes Research (DZD e.V.), Neuherberg, Germany
- Michele Solimena
  - Department of Molecular Diabetology, University Hospital and Faculty of Medicine, TU Dresden, Dresden, Germany
  - Paul Langerhans Institute Dresden of Helmholtz Munich at the University Hospital and Faculty of Medicine, TU Dresden, Dresden, Germany
  - German Center for Diabetes Research (DZD e.V.), Neuherberg, Germany
- Mario R. Ermácora
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Provincia de Buenos Aires, Argentina
  - Grupo de Biología Estructural y Biotecnología, IMBICE, CONICET, Universidad Nacional de Quilmes, Provincia de Buenos Aires, Argentina
33
Ao K, Rohmann PFW, Huang S, Li L, Lipka V, Chen S, Wiermer M, Li X. Puncta-localized TRAF domain protein TC1b contributes to the autoimmunity of snc1. Plant J 2023; 114:591-612. [PMID: 36799433 DOI: 10.1111/tpj.16155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 02/07/2023] [Indexed: 05/04/2023]
Abstract
Immune receptors play important roles in the perception of pathogens and initiation of immune responses in both plants and animals. Intracellular nucleotide-binding domain leucine-rich repeat (NLR)-type receptors constitute a major class of receptors in vascular plants. In the Arabidopsis thaliana mutant suppressor of npr1-1, constitutive 1 (snc1), a gain-of-function mutation in the NLR gene SNC1 leads to SNC1 overaccumulation and constitutive activation of defense responses. From a CRISPR/Cas9-based reverse genetics screen in the snc1 autoimmune background, we identified that mutations in TRAF CANDIDATE 1b (TC1b), a gene encoding a protein with four tumor necrosis factor receptor-associated factor (TRAF) domains, can suppress snc1 phenotypes. TC1b does not appear to be a general immune regulator as it is not required for defense mediated by other tested immune receptors. TC1b also does not physically associate with SNC1, affect SNC1 accumulation, or affect signaling of the downstream helper NLRs represented by ACTIVATED DISEASE RESISTANCE PROTEIN 1-L2 (ADR1-L2), suggesting that TC1b impacts snc1 autoimmunity in a unique way. TC1b can form oligomers and localizes to punctate structures of unknown function. The puncta localization of TC1b strictly requires its coiled-coil (CC) domain, whereas the functionality of TC1b requires the four TRAF domains in addition to the CC. Overall, we uncovered the TRAF domain protein TC1b as a novel positive contributor to plant immunity.
Affiliation(s)
- Kevin Ao
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
  - Department of Botany, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
- Philipp F W Rohmann
  - Molecular Biology of Plant-Microbe Interactions Research Group, Albrecht-von-Haller-Institute for Plant Sciences, University of Goettingen, D-37077, Goettingen, Germany
  - Biochemistry of Plant-Microbe Interactions, Dahlem Centre of Plant Sciences, Institute of Biology, Freie Universität Berlin, 14195, Berlin, Germany
- Shuai Huang
  - Department of Molecular Genetics, College of Arts and Sciences, Ohio State University, Columbus, Ohio, 43210, USA
- Lin Li
  - National Institute of Biological Sciences, Beijing, 102206, China
- Volker Lipka
  - Department of Plant Cell Biology, Albrecht-von-Haller-Institute for Plant Sciences, University of Goettingen, D-37077, Goettingen, Germany
  - Central Microscopy Facility of the Faculty of Biology and Psychology, University of Goettingen, D-37077, Goettingen, Germany
- She Chen
  - National Institute of Biological Sciences, Beijing, 102206, China
- Marcel Wiermer
  - Molecular Biology of Plant-Microbe Interactions Research Group, Albrecht-von-Haller-Institute for Plant Sciences, University of Goettingen, D-37077, Goettingen, Germany
  - Biochemistry of Plant-Microbe Interactions, Dahlem Centre of Plant Sciences, Institute of Biology, Freie Universität Berlin, 14195, Berlin, Germany
- Xin Li
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
  - Department of Botany, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
34
Ray D, Laverty KU, Jolma A, Nie K, Samson R, Pour SE, Tam CL, von Krosigk N, Nabeel-Shah S, Albu M, Zheng H, Perron G, Lee H, Najafabadi H, Blencowe B, Greenblatt J, Morris Q, Hughes TR. RNA-binding proteins that lack canonical RNA-binding domains are rarely sequence-specific. Sci Rep 2023; 13:5238. [PMID: 37002329 PMCID: PMC10066285 DOI: 10.1038/s41598-023-32245-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 10/28/2022] [Accepted: 03/23/2023] [Indexed: 04/03/2023] Open
Abstract
Thousands of RNA-binding proteins (RBPs) crosslink to cellular mRNA. Among these are numerous unconventional RBPs (ucRBPs), proteins that associate with RNA but lack known RNA-binding domains (RBDs). The vast majority of ucRBPs have uncharacterized RNA-binding specificities. We analyzed 492 human ucRBPs for intrinsic RNA-binding in vitro and identified 23 that bind specific RNA sequences. Most (17/23), including 8 ribosomal proteins, were previously associated with RNA-related function. We identified the RBDs responsible for sequence-specific RNA-binding for several of these 23 ucRBPs and surveyed whether corresponding domains from homologous proteins also display RNA sequence specificity. CCHC-zf domains from seven human proteins recognized specific RNA motifs, indicating that this is a major class of RBD. For Nudix, HABP4, TPR, RanBP2-zf, and L7Ae domains, however, only isolated members or closely related homologs yielded motifs, consistent with RNA-binding as a derived function. The lack of sequence specificity for most ucRBPs is striking, and we suggest that many may function analogously to chromatin factors, which often crosslink efficiently to cellular DNA, presumably via indirect recruitment. Finally, we show that ucRBPs tend to be highly abundant proteins and suggest their identification in RNA interactome capture studies could also result from weak nonspecific interactions with RNA.
Affiliation(s)
- Debashish Ray
  - Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
- Kaitlin U Laverty
  - Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada
- Arttu Jolma
  - Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
- Kate Nie
  - Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada
- Reuben Samson
  - Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada
- Sara E Pour
  - Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada
- Cyrus L Tam
  - Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Tri-Institutional Training Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY, USA
- Niklas von Krosigk
  - Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada
- Syed Nabeel-Shah
  - Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada
- Mihai Albu
  - Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
- Hong Zheng
  - Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
- Gabrielle Perron
  - Department of Human Genetics, McGill University, Montréal, QC, H3A 0C7, Canada
  - McGill Genome Centre, Montréal, QC, H3A 0G1, Canada
- Hyunmin Lee
  - Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
- Hamed Najafabadi
  - Department of Human Genetics, McGill University, Montréal, QC, H3A 0C7, Canada
  - McGill Genome Centre, Montréal, QC, H3A 0G1, Canada
- Benjamin Blencowe
  - Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada
- Jack Greenblatt
  - Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada
- Quaid Morris
  - Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada
  - Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Tri-Institutional Training Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY, USA
- Timothy R Hughes
  - Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada
35
Han B, Ren C, Wang W, Li J, Gong X. Computational Prediction of Protein Intrinsically Disordered Region Related Interactions and Functions. Genes (Basel) 2023; 14:432. [PMID: 36833360 PMCID: PMC9956190 DOI: 10.3390/genes14020432] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/31/2022] [Revised: 02/02/2023] [Accepted: 02/05/2023] [Indexed: 02/11/2023] Open
Abstract
Intrinsically disordered proteins (IDPs) and regions (IDRs) are widespread. Although they lack well-defined structures, they participate in many important biological processes. They are also widely implicated in human diseases and have become potential targets in drug discovery. However, there is a large gap between the number of experimentally annotated IDPs/IDRs and their actual number. In recent decades, computational methods for IDPs/IDRs have developed rapidly, covering tasks such as predicting IDPs/IDRs themselves, their binding modes, their binding sites, and their molecular functions. In view of the correlation between these predictors, we have reviewed these prediction methods in a unified manner for the first time, summarized their computational methods and predictive performance, and discussed some problems and perspectives.
Affiliation(s)
- Bingqing Han
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Chongjiao Ren
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Wenda Wang
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Jiashan Li
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Xinqi Gong
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
  - Beijing Academy of Intelligence, Beijing 100083, China
36
Klaus L, de Almeida BP, Vlasova A, Nemčko F, Schleiffer A, Bergauer K, Hofbauer L, Rath M, Stark A. Systematic identification and characterization of repressive domains in Drosophila transcription factors. EMBO J 2023; 42:e112100. [PMID: 36545802 PMCID: PMC9890238 DOI: 10.15252/embj.2022112100] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/13/2022] [Revised: 11/21/2022] [Accepted: 12/01/2022] [Indexed: 12/24/2022] Open
Abstract
All multicellular life relies on differential gene expression, determined by regulatory DNA elements and DNA-binding transcription factors that mediate activation and repression via cofactor recruitment. While activators have been extensively characterized, repressors are less well studied: the identities and properties of their repressive domains (RDs) are typically unknown and the specific co-repressors (CoRs) they recruit have not been determined. Here, we develop a high-throughput, next-generation sequencing-based screening method, repressive-domain (RD)-seq, to systematically identify RDs in complex DNA-fragment libraries. Screening more than 200,000 fragments covering the coding sequences of all transcription-related proteins in Drosophila melanogaster, we identify 195 RDs in known repressors and in proteins not previously associated with repression. Many RDs contain recurrent short peptide motifs, which are conserved between fly and human and are required for RD function, as demonstrated by motif mutagenesis. Moreover, we show that RDs that contain one of five distinct repressive motifs interact with and depend on different CoRs, such as Groucho, CtBP, Sin3A, or Smrter. These findings advance our understanding of repressors, their sequences, and the functional impact of sequence-altering mutations and should provide a valuable resource for further studies.
Affiliation(s)
- Loni Klaus
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
  - Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, Austria
- Bernardo P de Almeida
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
  - Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, Austria
- Anna Vlasova
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
- Filip Nemčko
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
  - Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, Austria
- Alexander Schleiffer
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
  - Institute of Molecular Biotechnology (IMBA), Vienna BioCenter (VBC), Vienna, Austria
- Katharina Bergauer
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
- Lorena Hofbauer
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
  - Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, Austria
- Martina Rath
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
- Alexander Stark
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
  - Medical University of Vienna, Vienna BioCenter (VBC), Vienna, Austria
Collapse
|
37
|
Ohtsuka M, Imafuku J, Hori S, Kurosaki A, Nakamura A, Nakahara T, Yahata T, Bhat K, Papastefan ST, Nakagawa S, Quadros RM, Miura H, Gurumurthy CB. Delivering mRNAs to mouse tissues using the SEND system. bioRxiv 2023:2023.01.28.522652. [PMID: 36747769 PMCID: PMC9900891 DOI: 10.1101/2023.01.28.522652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2023]
Abstract
mRNAs produced in a cell are almost always translated within the same cell. Some mRNAs are transported to other cells of the organism through processes involving membrane nanotubes or extracellular vesicles. A recent report describes a surprising new phenomenon of encapsulating mRNAs inside virus-like particles (VLPs) to deliver them to other cells in a process that was named SEND (Selective Endogenous eNcapsidation for cellular Delivery). Although the seminal work demonstrates the SEND process in cultured cells, it is unknown whether this phenomenon occurs in vivo. Here, we demonstrate the SEND process in living organisms using specially designed genetically engineered mouse models. Our proof-of-principle study lays a foundation for the SEND-VLP system to potentially be used as a gene therapy tool to deliver therapeutically important mRNAs to tissues.
|
38
|
Liaisons dangereuses: Intrinsic Disorder in Cellular Proteins Recruited to Viral Infection-Related Biocondensates. Int J Mol Sci 2023; 24:ijms24032151. [PMID: 36768473 PMCID: PMC9917183 DOI: 10.3390/ijms24032151] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/02/2022] [Revised: 01/11/2023] [Accepted: 01/19/2023] [Indexed: 01/25/2023] Open
Abstract
Liquid-liquid phase separation (LLPS) is responsible for the formation of so-called membrane-less organelles (MLOs) that are essential for the spatio-temporal organization of the cell. Intrinsically disordered proteins (IDPs) or regions (IDRs), either alone or in conjunction with nucleic acids, are involved in the formation of these intracellular condensates. Notably, viruses exploit LLPS to their own benefit to form viral replication compartments. Beyond giving rise to biomolecular condensates, viral proteins are also known to partition into cellular MLOs, thus raising the question as to whether these cellular phase-separating proteins are drivers of LLPS or behave as clients/regulators. Here, we focus on a set of eukaryotic proteins that are either sequestered in viral factories or colocalize with viral proteins within cellular MLOs, with the primary goal of gathering organized, predicted, and experimental information on these proteins, which constitute promising targets for innovative antiviral strategies. Using various computational approaches, we thoroughly investigated their disorder content and inherent propensity to undergo LLPS, along with their biological functions and interactivity networks. Results show that these proteins are on average, though to varying degrees, enriched in disorder, with their propensity for phase separation being correlated, as expected, with their disorder content. A trend, which awaits further validation, emerges whereby the most disordered proteins serve as drivers, while more ordered cellular proteins tend instead to be clients of viral factories. In light of their high disorder content and their annotated LLPS behavior, most proteins in our data set are drivers or co-drivers of molecular condensation, foreshadowing a key role of these cellular proteins in the scaffolding of viral infection-related MLOs.
|
39
|
Sequence-Based Prediction of Protein Phase Separation: The Role of Beta-Pairing Propensity. Biomolecules 2022; 12:biom12121771. [PMID: 36551199 PMCID: PMC9775558 DOI: 10.3390/biom12121771] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/14/2022] [Revised: 11/10/2022] [Accepted: 11/24/2022] [Indexed: 11/29/2022] Open
Abstract
The formation of droplets of bio-molecular condensates through liquid-liquid phase separation (LLPS) of their component proteins is a key factor in the maintenance of cellular homeostasis. Different protein properties were shown to be important in LLPS onset, making it possible to develop predictors, which try to discriminate a positive set of proteins involved in LLPS against a negative set of proteins not involved in LLPS. On the other hand, the redundancy and multivalency of the interactions driving LLPS led to the suggestion that the large conformational entropy associated with nonspecific side-chain interactions is also a key factor in LLPS. In this work we build an LLPS predictor that combines the ability to form pi-pi interactions with an unrelated feature, the propensity to stabilize the β-pairing interaction mode. The cross-β structure is formed in the amyloid aggregates, which are involved in degenerative diseases and may be the final thermodynamically stable state of protein condensates. Our results show that the combination of pi-pi and β-pairing propensity yields an improved performance. They also suggest that protein sequences are more likely to be involved in phase separation if the main chain conformational entropy of the β-pairing maintained droplet state is increased. This would stabilize the droplet state against the more ordered amyloid state. Interestingly, the entropic stabilization of the droplet state appears to proceed according to different mechanisms, depending on the fraction of "droplet-driving" proteins present in the positive set.
|
40
|
Verdonk CJ, Marshall AC, Ramsay JP, Bond CS. Crystallographic and X-ray scattering study of RdfS, a recombination directionality factor from an integrative and conjugative element. Acta Crystallogr D Struct Biol 2022; 78:1210-1220. [PMID: 36189741 PMCID: PMC9527761 DOI: 10.1107/s2059798322008579] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2022] [Accepted: 08/25/2022] [Indexed: 11/24/2022] Open
Abstract
The recombination directionality factors from Mesorhizobium spp. (RdfS) are involved in regulating the excision and transfer of integrative and conjugative elements. Here, solution small-angle X-ray scattering, and crystallization and preliminary structure solution of RdfS from Mesorhizobium japonicum R7A are presented. RdfS crystallizes in space group P2₁2₁2₁, with evidence of eightfold rotational crystallographic/noncrystallographic symmetry. Initial structure determination by molecular replacement using ab initio models yielded a partial model (three molecules), which was completed after manual inspection revealed unmodelled electron density. The finalized crystal structure of RdfS reveals a head-to-tail polymer forming left-handed superhelices with large solvent channels. Additionally, RdfS has significant disorder in the C-terminal region of the protein, which is supported by the solution scattering data and the crystal structure. The steps taken to finalize structure determination, as well as the scattering and crystallographic characteristics of RdfS, are discussed.
Affiliation(s)
- Callum J. Verdonk
- School of Molecular Sciences, University of Western Australia, Perth, Western Australia 6009, Australia
- Curtin Health Innovation Research Institute and Curtin Medical School, Curtin University, Perth, Western Australia 6102, Australia
- Andrew C. Marshall
- School of Molecular Sciences, University of Western Australia, Perth, Western Australia 6009, Australia
- Joshua P. Ramsay
- Curtin Health Innovation Research Institute and Curtin Medical School, Curtin University, Perth, Western Australia 6102, Australia
- Charles S. Bond
- School of Molecular Sciences, University of Western Australia, Perth, Western Australia 6009, Australia
- Marshall Centre for Infectious Disease, Research and Training, School of Biomedical Sciences, University of Western Australia, Perth, Western Australia 6009, Australia
|
41
|
Accounting for small variations in the tracrRNA sequence improves sgRNA activity predictions for CRISPR screening. Nat Commun 2022; 13:5255. [PMID: 36068235 PMCID: PMC9448816 DOI: 10.1038/s41467-022-33024-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 05/21/2022] [Accepted: 08/30/2022] [Indexed: 12/17/2022] Open
Abstract
CRISPR technology is a powerful tool for studying genome function. To aid in picking sgRNAs that have maximal efficacy against a target of interest from many possible options, several groups have developed models that predict sgRNA on-target activity. Although multiple tracrRNA variants are commonly used for screening, no existing models account for this feature when nominating sgRNAs. Here we develop an on-target model, Rule Set 3, that makes optimal predictions for multiple tracrRNA variants. We validate Rule Set 3 on a new dataset of sgRNAs tiling essential and non-essential genes, demonstrating substantial improvement over prior prediction models. By analyzing the differences in sgRNA activity between tracrRNA variants, we show that Pol III transcription termination is a strong determinant of sgRNA activity. We expect these results to improve the performance of CRISPR screening and inform future research on tracrRNA engineering and sgRNA modeling.
|
42
|
Ramasamy P, Vandermarliere E, Vranken WF, Martens L. Panoramic Perspective on Human Phosphosites. J Proteome Res 2022; 21:1894-1915. [PMID: 35793420 DOI: 10.1021/acs.jproteome.2c00164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Protein phosphorylation is the most common reversible post-translational modification of proteins and is key in the regulation of many cellular processes. Due to this importance, phosphorylation is extensively studied, resulting in the availability of a large amount of mass spectrometry-based phospho-proteomics data. Here, we leverage the information in these large-scale phospho-proteomics data sets, as contained in Scop3P, to analyze and characterize proteome-wide protein phosphorylation sites (P-sites). First, we set out to differentiate correctly observed P-sites from false-positive sites using five complementary site properties. We then describe the context of these P-sites in terms of the protein structure, solvent accessibility, structural transitions and disorder, and biophysical properties. We also investigate the relative prevalence of disease-linked mutations on and around P-sites. Moreover, we assess the structural dynamics of P-sites in their phosphorylated and unphosphorylated states. As a result, we show how large-scale reprocessing of available proteomics experiments can enable a more reliable view on proteome-wide P-sites. Furthermore, adding the structural context of proteins around P-sites helps uncover possible conformational switches upon phosphorylation. Moreover, by placing sites in different biophysical contexts, we show the differential preference in protein dynamics at phosphorylated sites when compared to the nonphosphorylated counterparts.
Affiliation(s)
- Pathmanaban Ramasamy
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, 1050 Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, 1050 Brussels, Belgium
- Centre for Structural Biology, VIB, 1050 Brussels, Belgium
- Wim F Vranken
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, 1050 Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, 1050 Brussels, Belgium
- Centre for Structural Biology, VIB, 1050 Brussels, Belgium
- Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium
|
43
|
Fusto A, Cassandrini D, Fiorillo C, Codemo V, Astrea G, D’Amico A, Maggi L, Magri F, Pane M, Tasca G, Sabbatini D, Bello L, Battini R, Bernasconi P, Fattori F, Bertini ES, Comi G, Messina S, Mongini T, Moroni I, Panicucci C, Berardinelli A, Donati A, Nigro V, Pini A, Giannotta M, Dosi C, Ricci E, Mercuri E, Minervini G, Tosatto S, Santorelli F, Bruno C, Pegoraro E. Expanding the clinical-pathological and genetic spectrum of RYR1-related congenital myopathies with cores and minicores: an Italian population study. Acta Neuropathol Commun 2022; 10:54. [PMID: 35428369 PMCID: PMC9013059 DOI: 10.1186/s40478-022-01357-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/03/2022] [Accepted: 03/25/2022] [Indexed: 11/10/2022] Open
Abstract
Mutations in the RYR1 gene, encoding ryanodine receptor 1 (RyR1), are a well-known cause of Central Core Disease (CCD) and Multi-minicore Disease (MmD). We screened a cohort of 153 patients carrying a histopathological diagnosis of core myopathy (cores and minicores) for RYR1 mutations. At least one RYR1 mutation was identified in 69 of them and these patients were further studied. Clinical and histopathological features were collected. The clinical phenotype was highly heterogeneous, ranging from asymptomatic or paucisymptomatic hyperCKemia to severe muscle weakness and skeletal deformity with loss of ambulation. Sixty-eight RYR1 mutations, generally missense, were identified, of which 16 were novel. The combined analysis of the clinical presentation, disease progression and the structural bioinformatic analyses of RYR1 made it possible to associate some phenotypes with mutations in specific domains. In addition, this study highlighted the potential of structural bioinformatics in predicting the pathogenicity of RYR1 mutations. Further improvement in the comprehension of the genotype-phenotype relationship of core myopathies can be expected in the near future: the current lack of a human RyR1 crystal structure, paired with the presence of large intrinsically disordered regions in RyR1 and the frequent presence of more than one RYR1 mutation in core myopathy patients, requires designing novel investigation strategies to fully address the effect of RyR1 mutations.
|
44
|
Reith MEA, Kortagere S, Wiers CE, Sun H, Kurian MA, Galli A, Volkow ND, Lin Z. The dopamine transporter gene SLC6A3: multidisease risks. Mol Psychiatry 2022; 27:1031-1046. [PMID: 34650206 PMCID: PMC9008071 DOI: 10.1038/s41380-021-01341-5] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 05/17/2021] [Revised: 09/28/2021] [Accepted: 10/01/2021] [Indexed: 02/02/2023]
Abstract
The human dopamine transporter gene SLC6A3 has been consistently implicated in several neuropsychiatric diseases but the disease mechanism remains elusive. In this risk synthesis, we have concluded that SLC6A3 represents an increasingly recognized risk with a growing number of familial mutants associated with neuropsychiatric and neurological disorders. At least five loci were related to common and severe diseases including alcohol use disorder (high activity variant), attention-deficit/hyperactivity disorder (low activity variant), autism (familial proteins with mutated networking) and movement disorders (both regulatory variants and familial mutations). Association signals depended on genetic markers used as well as ethnicity examined. Strong haplotype selection and gene-wide epistases support multimarker assessment of functional variations and phenotype associations. Inclusion of its promoter region's functional markers such as DNPi (rs67175440) and 5'VNTR (rs70957367) may help delineate condensate-based risk action, testing a locus-pathway-phenotype hypothesis for one gene-multidisease etiology.
Affiliation(s)
- Maarten E A Reith
- Department of Psychiatry, New York University School of Medicine, New York City, NY, 10016, USA
- Sandhya Kortagere
- Department of Microbiology and Immunology, Drexel University College of Medicine, Philadelphia, PA, 19129, USA
- Corinde E Wiers
- Laboratory of Neuroimaging, National Institute on Alcohol Abuse and Alcoholism, Bethesda, MD, 20817, USA
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Hui Sun
- Laboratory of Neuroimaging, National Institute on Alcohol Abuse and Alcoholism, Bethesda, MD, 20817, USA
- Manju A Kurian
- Molecular Neurosciences, Developmental Neurosciences, Zayed Centre for Research into Rare Diseases in Children, UCL Great Ormond Street Institute of Child Health, and Department of Neurology, Great Ormond Street Hospital, London, WC1N 1EH, UK
- Aurelio Galli
- Department of Surgery, University of Alabama at Birmingham, Birmingham, AL, 35294, USA
- Nora D Volkow
- Laboratory of Neuroimaging, National Institute on Alcohol Abuse and Alcoholism, Bethesda, MD, 20817, USA
- National Institute on Drug Abuse, Bethesda, MD, 20817, USA
- Zhicheng Lin
- Laboratory of Psychiatric Neurogenomics, McLean Hospital, and Department of Psychiatry, Harvard Medical School, Belmont, MA, 02478, USA.
|
45
|
Piovesan D, Monzon AM, Quaglia F, Tosatto SCE. Databases for intrinsically disordered proteins. Acta Crystallogr D Struct Biol 2022; 78:144-151. [PMID: 35102880 PMCID: PMC8805306 DOI: 10.1107/s2059798321012109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2021] [Accepted: 11/12/2021] [Indexed: 11/28/2022] Open
Abstract
Intrinsically disordered regions (IDRs) lacking a fixed three-dimensional protein structure are widespread and play a central role in cell regulation. Only a small fraction of IDRs have been functionally characterized, with heterogeneous experimental evidence that is largely buried in the literature. Predictions of IDRs are still difficult to estimate and are poorly characterized. Here, an overview of the publicly available knowledge about IDRs is reported, including manually curated resources, deposition databases and prediction repositories. The types, scopes and availability of the various resources are analyzed, and their complementarity and overlap are highlighted. The volume of information included and the relevance to the field of structural biology are compared.
Affiliation(s)
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- Federica Quaglia
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR–IBIOM), Bari, Italy
|
46
|
Quaglia F, Mészáros B, Salladini E, Hatos A, Pancsa R, Chemes LB, Pajkos M, Lazar T, Peña-Díaz S, Santos J, Ács V, Farahi N, Fichó E, Aspromonte M, Bassot C, Chasapi A, Davey N, Davidović R, Dobson L, Elofsson A, Erdős G, Gaudet P, Giglio M, Glavina J, Iserte J, Iglesias V, Kálmán Z, Lambrughi M, Leonardi E, Longhi S, Macedo-Ribeiro S, Maiani E, Marchetti J, Marino-Buslje C, Mészáros A, Monzon A, Minervini G, Nadendla S, Nilsson JF, Novotný M, Ouzounis C, Palopoli N, Papaleo E, Pereira P, Pozzati G, Promponas V, Pujols J, Rocha AS, Salas M, Sawicki LR, Schad E, Shenoy A, Szaniszló T, Tsirigos K, Veljkovic N, Parisi G, Ventura S, Dosztányi Z, Tompa P, Tosatto SCE, Piovesan D. DisProt in 2022: improved quality and accessibility of protein intrinsic disorder annotation. Nucleic Acids Res 2022; 50:D480-D487. [PMID: 34850135 PMCID: PMC8728214 DOI: 10.1093/nar/gkab1082] [Citation(s) in RCA: 107] [Impact Index Per Article: 35.7] [Received: 09/15/2021] [Revised: 10/15/2021] [Accepted: 10/20/2021] [Indexed: 02/03/2023] Open
Abstract
The Database of Intrinsically Disordered Proteins (DisProt, URL: https://disprot.org) is the major repository of manually curated annotations of intrinsically disordered proteins and regions from the literature. We report here recent updates of DisProt version 9, including a restyled web interface, refactored Intrinsically Disordered Proteins Ontology (IDPO), improvements in the curation process and significant content growth of around 30%. Higher quality and consistency of annotations are provided by a newly implemented reviewing process and training of curators. The increased curation capacity is fostered by the integration of DisProt with APICURON, a dedicated resource for the proper attribution and recognition of biocuration efforts. Better interoperability is provided through the adoption of the Minimum Information About Disorder (MIADE) standard, an active collaboration with the Gene Ontology (GO) and Evidence and Conclusion Ontology (ECO) consortia and the support of the ELIXIR infrastructure.
Affiliation(s)
- Federica Quaglia
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), Bari, Italy
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- Bálint Mészáros
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
- Edoardo Salladini
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- András Hatos
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- Rita Pancsa
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
- Lucía B Chemes
- Instituto de Investigaciones Biotecnológicas (IIBiO-CONICET), Universidad Nacional de San Martín, Av. 25 de Mayo y Francia, CP1650 Buenos Aires, Argentina
- Mátyás Pajkos
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
- Tamas Lazar
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnology, Brussels, Belgium
- Structural Biology Brussels (SBB), Bioengineering Sciences Department, Vrije Universiteit Brussel (VUB), Brussels, Belgium
- Samuel Peña-Díaz
- Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Barcelona, Spain
- Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
- Jaime Santos
- Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Barcelona, Spain
- Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
- Veronika Ács
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
- Nazanin Farahi
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnology, Brussels, Belgium
- Structural Biology Brussels (SBB), Bioengineering Sciences Department, Vrije Universiteit Brussel (VUB), Brussels, Belgium
- Erzsébet Fichó
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
- Cytocast Kft., Vecsés, Hungary
- Maria Cristina Aspromonte
- Department of Woman and Child Health, University of Padova, Padova, Italy
- Pediatric Research Institute, Città della Speranza, Padova, Italy
- Claudio Bassot
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, 171 21 Solna, Sweden
- Anastasia Chasapi
- Biological Computation & Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thermi, Thessalonica 57001, Greece
- Norman E Davey
- Institute of Cancer Research, Chester Beatty Laboratories, 237 Fulham Rd, Chelsea, London, UK
- Radoslav Davidović
- Laboratory for Bioinformatics and Computational Chemistry, Vinča Institute of Nuclear Sciences, National Institute of the Republic of Serbia, University of Belgrade, 11000 Belgrade, Serbia
- Laszlo Dobson
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
- Arne Elofsson
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, 171 21 Solna, Sweden
- Gábor Erdős
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
- Pascale Gaudet
- Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
- Michelle Giglio
- Institute for Genome Sciences, University of Maryland School of Medicine, 670 W. Baltimore St., Baltimore, MD 21201, USA
- Juliana Glavina
- Instituto de Investigaciones Biotecnológicas (IIBiO-CONICET), Universidad Nacional de San Martín, Av. 25 de Mayo y Francia, CP1650 Buenos Aires, Argentina
- Javier Iserte
- Bioinformatics Unit, Fundación Instituto Leloir, Buenos Aires, C1405BWE, Argentina
- Valentín Iglesias
- Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Barcelona, Spain
- Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
- Zsófia Kálmán
- Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Práter u. 50/A, 1083 Budapest, Hungary
- Matteo Lambrughi
- Cancer Structural Biology, Danish Cancer Society Research Center, Strandboulevarden 49, 2100 Copenhagen, Denmark
- Emanuela Leonardi
- Department of Woman and Child Health, University of Padova, Padova, Italy
- Pediatric Research Institute, Città della Speranza, Padova, Italy
- Sonia Longhi
- Lab. Architecture et Fonction des Macromolécules Biologiques (AFMB), UMR 7257, Aix Marseille University and Centre National de la Recherche Scientifique (CNRS), 163 Avenue de Luminy, Case 932, 13288, Marseille, France
- Sandra Macedo-Ribeiro
- Instituto de Biologia Molecular e Celular (IBMC), Universidade do Porto, 4200-135 Porto, Portugal
- Instituto de Investigação e Inovação em Saúde (i3S), Universidade do Porto, 4200-135 Porto, Portugal
- Emiliano Maiani
- Cancer Structural Biology, Danish Cancer Society Research Center, Strandboulevarden 49, 2100 Copenhagen, Denmark
- Julia Marchetti
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
- Attila Mészáros
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnology, Brussels, Belgium
- Structural Biology Brussels (SBB), Bioengineering Sciences Department, Vrije Universiteit Brussel (VUB), Brussels, Belgium
- Suvarna Nadendla
- Institute for Genome Sciences, University of Maryland School of Medicine, 670 W. Baltimore St., Baltimore, MD 21201, USA
- Juliet F Nilsson
- Lab. Architecture et Fonction des Macromolécules Biologiques (AFMB), UMR 7257, Aix Marseille University and Centre National de la Recherche Scientifique (CNRS), 163 Avenue de Luminy, Case 932, 13288, Marseille, France
- Marian Novotný
- Dep. of Cell Biology, Faculty of Science, Vinicna 7, 128 43, Prague, Czech Republic
- Christos A Ouzounis
- Biological Computation & Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thermi, Thessalonica 57001, Greece
- Biological Computation & Computational Biology Group, Artificial Intelligence & Information Analysis Lab, Department of Computer Science, Aristotle University of Thessalonica, Thessalonica 54124, Greece
- Nicolás Palopoli
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
- Elena Papaleo
- Cancer Structural Biology, Danish Cancer Society Research Center, Strandboulevarden 49, 2100 Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
- Pedro José Barbosa Pereira
- Instituto de Biologia Molecular e Celular (IBMC), Universidade do Porto, 4200-135 Porto, Portugal
- Instituto de Investigação e Inovação em Saúde (i3S), Universidade do Porto, 4200-135 Porto, Portugal
- Gabriele Pozzati
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, 171 21 Solna, Sweden
- Vasilis J Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
- Jordi Pujols
- Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Barcelona, Spain
- Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
- Martin Salas
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
- Luciana Rodriguez Sawicki
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
- Eva Schad
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
- Aditi Shenoy
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, 171 21 Solna, Sweden
- Tamás Szaniszló
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
- Konstantinos D Tsirigos
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
- Nevena Veljkovic
- Laboratory for Bioinformatics and Computational Chemistry, Vinča Institute of Nuclear Sciences, National Institute of the Republic of Serbia, University of Belgrade, 11000 Belgrade, Serbia
- Gustavo Parisi
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
- Salvador Ventura
- Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Barcelona, Spain
- Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
- ICREA, Barcelona, Spain
- Zsuzsanna Dosztányi
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
- Peter Tompa
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnology, Brussels, Belgium
- Structural Biology Brussels (SBB), Bioengineering Sciences Department, Vrije Universiteit Brussel (VUB), Brussels, Belgium
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
|
47
|
Tamburrini KC, Pesce G, Nilsson J, Gondelaud F, Kajava AV, Berrin JG, Longhi S. Predicting Protein Conformational Disorder and Disordered Binding Sites. Methods Mol Biol 2022; 2449:95-147. [PMID: 35507260 DOI: 10.1007/978-1-0716-2095-3_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
In the last two decades it has become increasingly evident that a large number of proteins adopt either a fully or a partially disordered conformation. Intrinsically disordered proteins are ubiquitous proteins that fulfill essential biological functions while lacking a stable 3D structure. Their conformational heterogeneity is encoded by the amino acid sequence, thereby allowing intrinsically disordered proteins or regions to be recognized based on their sequence properties. The identification of disordered regions facilitates the functional annotation of proteins and is instrumental for delineating boundaries of protein domains amenable to crystallization. This chapter focuses on the methods currently employed for predicting protein disorder and identifying intrinsically disordered binding sites.
Affiliation(s)
- Ketty C Tamburrini
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
- INRAE, Aix Marseille Univ, Biodiversité et Biotechnologie Fongiques (BBF), UMR 1163, Marseille, France
- Giulia Pesce
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
- Juliet Nilsson
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
- Frank Gondelaud
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
- Andrey V Kajava
- Centre de Recherche en Biologie cellulaire de Montpellier, UMR 5237, CNRS, Université Montpellier, Montpellier, France
- Jean-Guy Berrin
- INRAE, Aix Marseille Univ, Biodiversité et Biotechnologie Fongiques (BBF), UMR 1163, Marseille, France
- Sonia Longhi
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France.
48
Manfredi M, Savojardo C, Martelli PL, Casadio R. DeepREx-WS: A web server for characterising protein-solvent interaction starting from sequence. Comput Struct Biotechnol J 2021; 19:5791-5799. [PMID: 34765094 PMCID: PMC8566768 DOI: 10.1016/j.csbj.2021.10.016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/05/2021] [Revised: 10/07/2021] [Accepted: 10/07/2021] [Indexed: 11/23/2022] [Open Access]
Abstract
Protein–solvent interaction provides important features for protein surface engineering when the structure is absent or partially solved. Presently, we can integrate the notion of solvent exposed/buried residues with that of their flexibility and intrinsic disorder to highlight regions where mutations may increase or decrease protein stability in order to modify proteins for biotechnological reasons, while preserving their functional integrity. Here we describe a web server, which provides the unique possibility of integrating knowledge of solvent and non-solvent exposure with that of residue conservation, flexibility and disorder of a protein sequence, for a better understanding of which regions are relevant for protein integrity. The core of the webserver is DeepREx, a novel deep learning-based tool that classifies each residue in the sequence as buried or exposed. DeepREx is trained on a high-quality, non-redundant dataset derived from the Protein Data Bank comprising 2332 monomeric protein chains and benchmarked on a blind test set including 200 protein sequences unrelated with the training set. Results show that DeepREx performs at the state-of-the-art in the field. In turn, the Web Server, DeepREx-WS, supplements the predictions of DeepREx with features that allow a better characterisation of exposed and buried regions: i) residue conservation derived from multiple sequence alignment; ii) local sequence hydrophobicity; iii) residue flexibility computed with MEDUSA; iv) a predictor of secondary structure; v) the presence of disordered regions as derived from MobiDB-Lite3.0. The web server allows browsing, selecting and intersecting the different features. We demonstrate a possible application of the DeepREx-WS for assisting the identification of residues to be variated in protein surface engineering processes.
Affiliation(s)
- Matteo Manfredi
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Castrense Savojardo
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Pier Luigi Martelli
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Corresponding author.
- Rita Casadio
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies (IBIOM), Italian National Research Council (CNR), Bari, Italy
49
Tamburrini KC, Terrapon N, Lombard V, Bissaro B, Longhi S, Berrin JG. Bioinformatic Analysis of Lytic Polysaccharide Monooxygenases Reveals the Pan-Families Occurrence of Intrinsically Disordered C-Terminal Extensions. Biomolecules 2021; 11:1632. [PMID: 34827630 PMCID: PMC8615602 DOI: 10.3390/biom11111632] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 09/29/2021] [Revised: 10/26/2021] [Accepted: 10/30/2021] [Indexed: 01/17/2023] [Open Access]
Abstract
Lytic polysaccharide monooxygenases (LPMOs) are monocopper enzymes secreted by many organisms and viruses. LPMOs catalyze the oxidative cleavage of different types of polysaccharides and are today divided into eight families (AA9-11, AA13-17) within the Auxiliary Activity enzyme class of the CAZy database. LPMOs minimal architecture encompasses a catalytic domain, to which can be appended a carbohydrate-binding module. Intriguingly, we observed that some LPMO sequences also display a C-terminal extension of varying length not associated with any known function or fold. Here, we analyzed 27,060 sequences from different LPMO families and show that 60% have a C-terminal extension predicted to be intrinsically disordered. Our analysis shows that these disordered C-terminal regions (dCTRs) are widespread in all LPMO families (except AA13) and differ in terms of sequence length and amino-acid composition. Noteworthily, these dCTRs have so far only been observed in LPMOs. LPMO-dCTRs share a common polyampholytic nature and an enrichment in serine and threonine residues, suggesting that they undergo post-translational modifications. Interestingly, dCTRs from AA11 and AA15 are enriched in redox-sensitive, conditionally disordered regions. The widespread occurrence of dCTRs in LPMOs from evolutionarily very divergent organisms, hints at a possible functional role and opens new prospects in the field of LPMOs.
Affiliation(s)
- Ketty C. Tamburrini
- Architecture et Fonction des Macromolécules Biologiques (AFMB), Centre National de la Recherche Scientifique (CNRS), Aix-Marseille Université (AMU), UMR 7257, 13288 Marseille, France
- Biodiversité et Biotechnologie Fongiques (BBF), French National Institute for Agriculture, Food, and Environment (INRAE), Aix-Marseille Université (AMU), UMR 1163, 13288 Marseille, France
- Nicolas Terrapon
- Architecture et Fonction des Macromolécules Biologiques (AFMB), Centre National de la Recherche Scientifique (CNRS), Aix-Marseille Université (AMU), UMR 7257, 13288 Marseille, France
- Architecture et Fonction des Macromolécules Biologiques (AFMB), French National Institute for Agriculture, Food, and Environment (INRAE), USC 1408, 13288 Marseille, France
- Vincent Lombard
- Architecture et Fonction des Macromolécules Biologiques (AFMB), Centre National de la Recherche Scientifique (CNRS), Aix-Marseille Université (AMU), UMR 7257, 13288 Marseille, France
- Architecture et Fonction des Macromolécules Biologiques (AFMB), French National Institute for Agriculture, Food, and Environment (INRAE), USC 1408, 13288 Marseille, France
- Bastien Bissaro
- Biodiversité et Biotechnologie Fongiques (BBF), French National Institute for Agriculture, Food, and Environment (INRAE), Aix-Marseille Université (AMU), UMR 1163, 13288 Marseille, France
- Sonia Longhi
- Architecture et Fonction des Macromolécules Biologiques (AFMB), Centre National de la Recherche Scientifique (CNRS), Aix-Marseille Université (AMU), UMR 7257, 13288 Marseille, France
- Jean-Guy Berrin
- Biodiversité et Biotechnologie Fongiques (BBF), French National Institute for Agriculture, Food, and Environment (INRAE), Aix-Marseille Université (AMU), UMR 1163, 13288 Marseille, France
50
Emenecker RJ, Griffith D, Holehouse AS. Metapredict: a fast, accurate, and easy-to-use predictor of consensus disorder and structure. Biophys J 2021; 120:4312-4319. [PMID: 34480923 PMCID: PMC8553642 DOI: 10.1016/j.bpj.2021.08.039] [Citation(s) in RCA: 128] [Impact Index Per Article: 32.0] [Received: 05/27/2021] [Revised: 08/08/2021] [Accepted: 08/30/2021] [Indexed: 01/02/2023] [Open Access]
Abstract
Intrinsically disordered proteins and protein regions make up a substantial fraction of many proteomes in which they play a wide variety of essential roles. A critical first step in understanding the role of disordered protein regions in biological function is to identify those disordered regions correctly. Computational methods for disorder prediction have emerged as a core set of tools to guide experiments, interpret results, and develop hypotheses. Given the multiple different predictors available, consensus scores have emerged as a popular approach to mitigate biases or limitations of any single method. Consensus scores integrate the outcome of multiple independent disorder predictors and provide a per-residue value that reflects the number of tools that predict a residue to be disordered. Although consensus scores help mitigate the inherent problems of using any single disorder predictor, they are computationally expensive to generate. They also necessitate the installation of multiple different software tools, which can be prohibitively difficult. To address this challenge, we developed a deep-learning-based predictor of consensus disorder scores. Our predictor, metapredict, utilizes a bidirectional recurrent neural network trained on the consensus disorder scores from 12 proteomes. By benchmarking metapredict using two orthogonal approaches, we found that metapredict is among the most accurate disorder predictors currently available. Metapredict is also remarkably fast, enabling proteome-scale disorder prediction in minutes. Importantly, metapredict is a fully open source and is distributed as a Python package, a collection of command-line tools, and a web server, maximizing the potential practical utility of the predictor. We believe metapredict offers a convenient, accessible, accurate, and high-performance predictor for single-proteins and proteomes alike.
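As a rough illustration of the consensus-score idea described in this abstract (a per-residue value reflecting how many tools call a residue disordered), the sketch below averages binary calls from several predictors. It is not metapredict's implementation or API. The function name `consensus_disorder`, the predictor labels, and the example calls are all hypothetical, and the score is assumed here to be the simple fraction of predictors that agree.

```python
from typing import Dict, List


def consensus_disorder(calls: Dict[str, List[bool]]) -> List[float]:
    """Return, per residue, the fraction of predictors that call it disordered.

    `calls` maps a predictor label to its per-residue binary calls
    (True = disordered). This is an assumed, simplified definition.
    """
    if not calls:
        raise ValueError("at least one predictor is required")
    lengths = {len(per_tool) for per_tool in calls.values()}
    if len(lengths) != 1:
        raise ValueError("all predictors must cover the same sequence length")
    n_predictors = len(calls)
    n_residues = lengths.pop()
    # For each position, count the predictors voting "disordered" and normalise.
    return [
        sum(per_tool[i] for per_tool in calls.values()) / n_predictors
        for i in range(n_residues)
    ]


# Hypothetical calls from three made-up predictors for a 5-residue sequence.
example_calls = {
    "predictor_a": [True, True, False, False, True],
    "predictor_b": [True, False, False, False, True],
    "predictor_c": [True, True, False, True, True],
}
scores = consensus_disorder(example_calls)
```

Producing such a score directly requires running every underlying predictor, which is the computational cost the abstract says metapredict avoids by learning to predict the consensus value from sequence.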
Affiliation(s)
- Ryan J Emenecker
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri; Center for Science and Engineering Living Systems (CSELS), St. Louis, Missouri; Center for Engineering Mechanobiology, Washington University, St. Louis, Missouri
- Daniel Griffith
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri; Center for Science and Engineering Living Systems (CSELS), St. Louis, Missouri
- Alex S Holehouse
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri; Center for Science and Engineering Living Systems (CSELS), St. Louis, Missouri.