1
Sivakumar P, Pandey S, Ramesha A, Davda JN, Singh A, Kumar C, Gala H, Subbiah V, Adicherla H, Dhawan J, Aravind L, Siddiqi I. Sporophyte-directed gametogenesis in Arabidopsis. Nature Plants 2025; 11:398-409. [PMID: 40087543 DOI: 10.1038/s41477-025-01932-y] [Received: 03/28/2024] [Accepted: 01/30/2025] [Indexed: 03/17/2025]
Abstract
Plants alternate between diploid sporophyte and haploid gametophyte generations [1]. In mosses, which retain features of ancestral land plants, the gametophyte is dominant and has an independent existence. However, in flowering plants the gametophyte has undergone evolutionary reduction to just a few cells enclosed within the sporophyte. The gametophyte is thought to retain genetic control of its development even after reduction [2]. Here we show that male gametophyte development in Arabidopsis, long considered to be autonomous, is also under genetic control of the sporophyte via a repressive mechanism that includes large-scale regulation of protein turnover. We identify an Arabidopsis gene SHUKR as an inhibitor of male gametic gene expression. SHUKR is unrelated to proteins of known function and acts sporophytically in meiosis to control gametophyte development by negatively regulating expression of a large set of genes specific to postmeiotic gametogenesis. This control emerged late in evolution as SHUKR homologues are found only in eudicots. We show that SHUKR is rapidly evolving under positive selection, suggesting that variation in control of protein turnover during male gametogenesis has played an important role in evolution within eudicots.
Affiliation(s)
- Prakash Sivakumar
  - Centre for Cellular and Molecular Biology, CSIR, Hyderabad, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Saurabh Pandey
  - Centre for Cellular and Molecular Biology, CSIR, Hyderabad, India
  - databaum GmbH, Hamburg, Germany
- A Ramesha
  - Centre for Cellular and Molecular Biology, CSIR, Hyderabad, India
  - Seri-Biotech Research Laboratory, Central Silk Board, Bangalore, India
- Aparna Singh
  - Centre for Cellular and Molecular Biology, CSIR, Hyderabad, India
  - Department of Botany, MMV, Banaras Hindu University, Varanasi, India
- Chandan Kumar
  - Centre for Cellular and Molecular Biology, CSIR, Hyderabad, India
  - University of Texas at Austin, Austin, TX, USA
- Hardik Gala
  - Centre for Cellular and Molecular Biology, CSIR, Hyderabad, India
- Jyotsna Dhawan
  - Centre for Cellular and Molecular Biology, CSIR, Hyderabad, India
- L Aravind
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Imran Siddiqi
  - Centre for Cellular and Molecular Biology, CSIR, Hyderabad, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
2
Mostafazade R, Tazik Z, Emami SA, Nesměrák K, Štícha M, Soheili V, Akaberi M. Isolation and Characterization of Fungal Endophytes From Helichrysum oocephalum, Evaluating Their Antimicrobial Activities, and Annotation of Their Metabolites. Chem Biodivers 2025:e202402236. [PMID: 40007502 DOI: 10.1002/cbdv.202402236] [Received: 09/08/2024] [Revised: 01/31/2025] [Accepted: 02/05/2025] [Indexed: 02/27/2025]
Abstract
Antibiotic resistance is one of the biggest threats to global health. Fungal endophytes are important sources of active natural products with antimicrobial potential. The purpose of this study was to characterize the endophytes coexisting with Helichrysum oocephalum, evaluate their antimicrobial activities, and annotate the endophytes' metabolites. Six fungal species, including Fusarium avenaceum and Fusarium tricinctum, were identified. Endophytes were cultured, and their metabolites were extracted. The antimicrobial effects of the extracts were tested against Staphylococcus aureus, Bacillus cereus, Staphylococcus epidermidis, Pseudomonas aeruginosa, Escherichia coli, and Candida albicans. In addition, anti-biofilm effects of the extracts were examined against P. aeruginosa and S. epidermidis. The metabolites in the most active extract were annotated on the basis of the LC-ESI-QToF-MS/MS data. In anti-biofilm studies, F. avenaceum extract was effective in destroying S. epidermidis biofilm and inhibiting its formation. LC-MS analysis showed that most of the identified compounds in the active extracts were enniatins (cyclic hexadepsipeptides). However, apicidin derivatives were also annotated. Our results revealed that these endophytes, especially Fusarium species, have antimicrobial activity against S. aureus, B. cereus, and C. albicans and anti-biofilm activity against S. epidermidis. According to the literature, the observed antimicrobial activity can be attributed to the enniatins. However, further phytochemical and pharmacological studies are necessary in this regard.
Affiliation(s)
- Reza Mostafazade
  - Department of Pharmacognosy, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad, Iran
  - Biotechnology Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran
- Zahra Tazik
  - Department of Pharmacognosy, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad, Iran
- Seyed Ahmad Emami
  - Department of Traditional Pharmacy, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad, Iran
- Karel Nesměrák
  - Department of Analytical Chemistry, Faculty of Science, Charles University, Prague, Czech Republic
- Martin Štícha
  - Mass Spectrometry Laboratory, Section of Chemistry, Faculty of Science, Charles University, Prague, Czech Republic
- Vahid Soheili
  - Department of Pharmaceutical Control, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad, Iran
- Maryam Akaberi
  - Department of Pharmacognosy, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad, Iran
3
Jagodzik P, Zietkiewicz E, Bukowy-Bieryllo Z. Conservation of OFD1 Protein Motifs: Implications for Discovery of Novel Interactors and the OFD1 Function. Int J Mol Sci 2025; 26:1167. [PMID: 39940934 PMCID: PMC11818881 DOI: 10.3390/ijms26031167] [Received: 12/16/2024] [Revised: 01/16/2025] [Accepted: 01/21/2025] [Indexed: 02/16/2025] [Open Access]
Abstract
OFD1 is a protein involved in many cellular processes, including cilia biogenesis, mitotic spindle assembly, translation, autophagy and the repair of double-strand DNA breaks. Despite many potential interactors identified in high-throughput studies, only a few have been directly confirmed with their binding sites identified. We performed an analysis of the evolutionary conservation of the OFD1 sequence in three clades (80 Tetrapoda, 144 Vertebrata and 26 Animalia species) and identified 59 protein-binding motifs localized in the OFD1 regions conserved in various clades. Our results indicate that OFD1 contains 14 potential post-translational modification (PTM) sites targeted by at least eight protein kinases, seven motifs bound by proteins recognizing phosphorylated amino acid residues and a binding site for phosphatase 2A. Moreover, OFD1 harbors both a motif that enables its phosphorylation by mitogen-activated protein kinases (MAPKs) and a specific docking site for these proteins. Generally, our results suggest that OFD1 forms a scaffold for interaction with many proteins and is tightly regulated by PTMs and ligands. Future research on OFD1 should focus on the regulation of OFD1 function and localization.
Affiliation(s)
- Zuzanna Bukowy-Bieryllo
  - Institute of Human Genetics Polish Academy of Sciences, Strzeszynska 32, 60-479 Poznan, Poland; (P.J.); (E.Z.)
4
Ziemska-Legiecka J, Jarnot P, Szymańska S, Błaszczyk D, Staśczak A, Langer-Macioł H, Lucińska K, Widzisz K, Janas A, Słowik H, Śliwińska W, Gruca A, Grynberg M. LCRAnnotationsDB: a database of low complexity regions functional and structural annotations. BMC Genomics 2024; 25:1251. [PMID: 39731018 DOI: 10.1186/s12864-024-10960-5] [Received: 05/29/2024] [Accepted: 10/25/2024] [Indexed: 12/29/2024] [Open Access]
Abstract
Low Complexity Regions (LCRs) are segments of proteins with a low diversity of amino acid composition. These regions play important roles in proteins. However, annotations describing these functions are dispersed across databases and scientific literature. LCRAnnotationsDB aims to consolidate knowledge about LCRs and store relevant annotations in a single place. To unify redundant annotations, we assigned them categories based on similarity in function, protein structure, and biological process. Categories are organized hierarchically by linking them to Gene Ontology terms. The LCRAnnotationsDB database can be accessed at https://lcrannotdb.lcr-lab.org/ .
Affiliation(s)
- Joanna Ziemska-Legiecka
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warsaw, 02-106, Poland
- Patryk Jarnot
  - Department of Computer Networks and Systems, Silesian University of Technology, Gliwice, 44-100, Poland
- Sylwia Szymańska
  - Department of Computer Networks and Systems, Silesian University of Technology, Gliwice, 44-100, Poland
- Dagmara Błaszczyk
  - Malopolska Centre of Biotechnology, Jagiellonian University, Kraków, 30-387, Poland
- Alicja Staśczak
  - Biotechnology Center, Silesian University of Technology, Gliwice, 44-100, Poland
  - Department of Systems Biology and Engineering, Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, 44-100, Poland
- Hanna Langer-Macioł
  - Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, 44-100, Poland
  - Department of Clinical and Molecular Genetics, Maria Sklodowska-Curie National Research Institute of Oncology, Gliwice Branch, Gliwice, 44-100, Poland
- Kinga Lucińska
  - Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, 44-100, Poland
- Karolina Widzisz
  - Department of Graphics, Computer Vision and Digital Systems, Silesian University of Technology, Gliwice, 44-100, Poland
- Aleksandra Janas
  - Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, 44-100, Poland
- Hanna Słowik
  - Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, 44-100, Poland
- Wiktoria Śliwińska
  - Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, 44-100, Poland
- Aleksandra Gruca
  - Department of Computer Networks and Systems, Silesian University of Technology, Gliwice, 44-100, Poland
- Marcin Grynberg
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warsaw, 02-106, Poland
5
Chou RT, Ouattara A, Takala-Harrison S, Cummings MP. Plasmodium vivax antigen candidate prediction improves with the addition of Plasmodium falciparum data. NPJ Syst Biol Appl 2024; 10:133. [PMID: 39537634 PMCID: PMC11561111 DOI: 10.1038/s41540-024-00465-y] [Received: 12/03/2023] [Accepted: 10/25/2024] [Indexed: 11/16/2024] [Open Access]
Abstract
Intensive malaria control and elimination efforts have led to substantial reductions in malaria incidence over the past two decades. However, the reduction in Plasmodium falciparum malaria cases has led to a species shift in some geographic areas, with P. vivax predominating in many areas outside of Africa. Despite its wide geographic distribution, P. vivax vaccine development has lagged far behind that for P. falciparum, in part due to the inability to cultivate P. vivax in vitro, hindering traditional approaches for antigen identification. In a prior study, we have used a positive-unlabeled random forest (PURF) machine learning approach to identify P. falciparum antigens based on features of known antigens for consideration in vaccine development efforts. Here we integrate systems data from P. falciparum (the better-studied species) to improve PURF models to predict potential P. vivax vaccine antigen candidates. We further show that inclusion of known antigens from the other species is critical for model performance, but the inclusion of only the unlabeled proteins from the other species can result in misdirection of the model toward predictors of species classification, rather than antigen identification. Beyond malaria, incorporating antigens from a closely related species may aid in vaccine development for emerging pathogens having few or no known antigens.
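To make the positive-unlabeled idea in this abstract concrete, here is a minimal Python sketch of PU bagging. It is not the PURF implementation the authors describe: it swaps the random forest for a simple nearest-centroid rule so that it runs with the standard library alone. The function names, the `n_rounds` and `seed` parameters, and the toy feature vectors are all illustrative assumptions. In each round a random bag of unlabeled items is treated as provisional negatives, and every unlabeled item left out of the bag is scored by whether it lies closer to the known positives than to that bag.

```python
import random


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pu_bagging_scores(positives, unlabeled, n_rounds=200, seed=0):
    """Fraction of out-of-bag rounds in which each unlabeled item looks positive.

    Hypothetical helper: a nearest-centroid stand-in for the random forest
    base learner used in PURF.
    """
    rng = random.Random(seed)
    votes = [0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    # Bag size matches the positive set but always leaves one item out-of-bag.
    k = min(len(positives), len(unlabeled) - 1)
    pos_c = centroid(positives)
    for _ in range(n_rounds):
        bag = set(rng.sample(range(len(unlabeled)), k))
        neg_c = centroid([unlabeled[i] for i in bag])  # provisional negatives
        for i, x in enumerate(unlabeled):
            if i in bag:
                continue  # only score items not used as negatives this round
            counts[i] += 1
            if sq_dist(x, pos_c) < sq_dist(x, neg_c):
                votes[i] += 1
    return [v / c if c else 0.0 for v, c in zip(votes, counts)]


# Toy data (made up): the first unlabeled item resembles the known positives.
positives = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1]]
unlabeled = [[1.1, 1.0], [0.0, 0.0], [0.1, -0.1],
             [-0.1, 0.1], [0.0, 0.2], [0.2, 0.0]]
scores = pu_bagging_scores(positives, unlabeled)
print(scores)
```

Ranking the unlabeled items by this score gives a candidate list. The abstract's caution applies to any such scheme: if the unlabeled pool differs from the positives mainly by species, the scores may end up tracking species rather than antigenicity.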
Affiliation(s)
- Renee Ti Chou
  - Center for Bioinformatics and Computational Biology, University of Maryland, College Park, College Park, MD, USA
- Amed Ouattara
  - Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
- Shannon Takala-Harrison
  - Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
- Michael P Cummings
  - Center for Bioinformatics and Computational Biology, University of Maryland, College Park, College Park, MD, USA
6
Shimizu K, Negishi L, Kurumizaka H, Suzuki M. Diversification of von Willebrand Factor A and Chitin-Binding Domains in Pif/BMSPs Among Mollusks. J Mol Evol 2024; 92:415-431. [PMID: 38864871 PMCID: PMC11291548 DOI: 10.1007/s00239-024-10180-1] [Received: 10/27/2023] [Accepted: 05/23/2024] [Indexed: 06/13/2024]
Abstract
Pif is a shell matrix protein (SMP) identified in the nacreous layer of Pinctada fucata (Pfu) and comprises two proteins, Pif97 and Pif80. Pif97 contains von Willebrand factor A (VWA) and chitin-binding domains, whereas Pif80 can bind calcium carbonate crystals. The VWA domain is conserved in the SMPs of various mollusk species; however, their phylogenetic relationship remains obscure. Furthermore, although the VWA domain participates in protein-protein interactions, its role in shell formation has not been established. Accordingly, in the current study, we investigate the phylogenetic relationship between PfuPif and other VWA domain-containing proteins in major mollusk species. The shell-related proteins containing VWA domains formed a large clade (the Pif/BMSP family) and were classified into eight subfamilies with unique sequential features, expression patterns, and taxa diversity. Furthermore, a pull-down assay using recombinant proteins containing the VWA domain of PfuPif97 revealed that the VWA domain interacts with five nacreous layer-related SMPs of P. fucata, including Pif80 and nacrein. Collectively, these results suggest that the VWA domain is important in the formation of organic complexes and participates in shell mineralisation.
Affiliation(s)
- Keisuke Shimizu
  - Research Institute for Global Change, Japan Agency for Marine-Earth Science and Technology, 2-15 Natsushima-Cho, Yokosuka, Kanagawa, 237-0061, Japan
  - Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo, Tokyo, 113-8657, Japan
- Lumi Negishi
  - Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo, Tokyo, 113-8657, Japan
- Hitoshi Kurumizaka
  - Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo, Tokyo, 113-8657, Japan
- Michio Suzuki
  - Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo, Tokyo, 113-8657, Japan
7
Chou RT, Ouattara A, Adams M, Berry AA, Takala-Harrison S, Cummings MP. Positive-unlabeled learning identifies vaccine candidate antigens in the malaria parasite Plasmodium falciparum. NPJ Syst Biol Appl 2024; 10:44. [PMID: 38678051 PMCID: PMC11055854 DOI: 10.1038/s41540-024-00365-1] [Received: 06/14/2023] [Accepted: 03/29/2024] [Indexed: 04/29/2024] [Open Access]
Abstract
Malaria vaccine development is hampered by extensive antigenic variation and complex life stages of Plasmodium species. Vaccine development has focused on a small number of antigens, many of which were identified without utilizing systematic genome-level approaches. In this study, we implement a machine learning-based reverse vaccinology approach to predict potential new malaria vaccine candidate antigens. We assemble and analyze P. falciparum proteomic, structural, functional, immunological, genomic, and transcriptomic data, and use positive-unlabeled learning to predict potential antigens based on the properties of known antigens and remaining proteins. We prioritize candidate antigens based on model performance on reference antigens with different genetic diversity and quantify the protein properties that contribute most to identifying top candidates. Candidate antigens are characterized by gene essentiality, gene ontology, and gene expression in different life stages to inform future vaccine development. This approach provides a framework for identifying and prioritizing candidate vaccine antigens for a broad range of pathogens.
Affiliation(s)
- Renee Ti Chou
  - Center for Bioinformatics and Computational Biology, University of Maryland, College Park, College Park, MD, USA
- Amed Ouattara
  - Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
- Matthew Adams
  - Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
- Andrea A Berry
  - Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
- Shannon Takala-Harrison
  - Center for Vaccine Development and Global Health, University of Maryland School of Medicine, Baltimore, MD, USA
- Michael P Cummings
  - Center for Bioinformatics and Computational Biology, University of Maryland, College Park, College Park, MD, USA
8
Chen D, Shi C, Xu W, Rong Q, Wu Q. Regulation of phase separation and antiviral activity of Cactin by glycolytic enzyme PGK via phosphorylation in Drosophila. mBio 2024; 15:e0137823. [PMID: 38446061 PMCID: PMC11005415 DOI: 10.1128/mbio.01378-23] [Received: 05/31/2023] [Accepted: 02/12/2024] [Indexed: 03/07/2024] [Open Access]
Abstract
Liquid-liquid phase separation (LLPS) plays a crucial role in various biological processes in eukaryotic organisms, including immune responses in mammals. However, the specific function of LLPS in immune responses in Drosophila melanogaster remains poorly understood. Cactin, a highly conserved protein in eukaryotes, is involved in a non-canonical signaling pathway associated with Nuclear factor-κB (NF-κB)-related pathways in Drosophila. In this study, we investigated the role of Cactin in LLPS and its implications for immune response modulation. We discovered that Cactin undergoes LLPS, forming droplet-like particles, primarily mediated by its intrinsically disordered region (IDR). Utilizing immunoprecipitation and mass spectrometry analysis, we identified two phosphorylation sites at serine residues 99 and 104 within the IDR1 domain of Cactin. Co-immunoprecipitation and mass spectrometry further revealed phosphoglycerate kinase (PGK) as a Cactin-interacting protein responsible for regulating its phosphorylation. Phosphorylation of Cactin by PGK induced a transition from stable aggregates to dynamic liquid droplets, enhancing its ability to interact with other components in the cellular environment. Overexpression of PGK inhibited Drosophila C virus (DCV) replication, while PGK knockdown increased replication. DCV infection also increased Cactin phosphorylation. We also found that phosphorylation enhances the antiviral ability of Cactin by promoting liquid-phase droplet formation. These findings demonstrate the role of Cactin-phase separation in regulating DCV replication and highlight the modulation of its antiviral function through phosphorylation, providing insights into the interplay between LLPS and antiviral defense mechanisms.
Importance
Liquid-liquid phase separation (LLPS) plays an integral role in various biological processes in eukaryotic organisms. Although several studies have highlighted its crucial role in modulating immune responses in mammals, its function in immune responses in Drosophila melanogaster remains poorly understood. Our study investigated the role of Cactin in LLPS and its implications for immune response modulation. We identified that phosphoglycerate kinase (PGK), an essential enzyme in the glycolytic pathway, phosphorylates Cactin, facilitating its transition from a relatively stable aggregated state to a more dynamic liquid droplet phase during the phase separation process. This transformation allows Cactin to rapidly interact with other cellular components, enhancing its antiviral properties and ultimately inhibiting virus replication. These findings expand our understanding of the role of LLPS in the antiviral defense mechanism, shedding light on the intricate mechanisms underlying immune responses in D. melanogaster.
Affiliation(s)
- Dongchao Chen
  - Department of Pharmacy, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
  - Division of Molecular Medicine, CAS Key Laboratory of Innate Immunity and Chronic Disease, University of Science and Technology of China, Hefei, Anhui, China
- Chang Shi
  - Department of Pharmacy, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
  - Division of Molecular Medicine, CAS Key Laboratory of Innate Immunity and Chronic Disease, University of Science and Technology of China, Hefei, Anhui, China
- Wen Xu
  - Department of Pharmacy, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
  - Division of Molecular Medicine, CAS Key Laboratory of Innate Immunity and Chronic Disease, University of Science and Technology of China, Hefei, Anhui, China
- Qiqi Rong
  - Department of Pharmacy, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
  - Division of Molecular Medicine, CAS Key Laboratory of Innate Immunity and Chronic Disease, University of Science and Technology of China, Hefei, Anhui, China
- Qingfa Wu
  - Department of Pharmacy, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
  - Division of Molecular Medicine, CAS Key Laboratory of Innate Immunity and Chronic Disease, University of Science and Technology of China, Hefei, Anhui, China
  - Anhui Provincial Key Laboratory of Precision Pharmaceutical Preparations and Clinical Pharmacy, Hefei, Anhui, China
9
Mahapatra A, Newberry RW. Liquid-liquid phase separation of α-synuclein is highly sensitive to sequence complexity. Protein Sci 2024; 33:e4951. [PMID: 38511533 PMCID: PMC10955625 DOI: 10.1002/pro.4951] [Received: 08/03/2023] [Revised: 02/06/2024] [Accepted: 02/19/2024] [Indexed: 03/22/2024]
Abstract
The Parkinson's-associated protein α-synuclein (α-syn) can undergo liquid-liquid phase separation (LLPS), which typically leads to the formation of amyloid fibrils. The coincidence of LLPS and amyloid formation has complicated the identification of the molecular determinants unique to LLPS of α-syn. Moreover, the lack of strategies to selectively perturb LLPS makes it difficult to dissect the biological roles specific to α-syn LLPS, independent of fibrillation. Herein, using a combination of subtle missense mutations, we show that LLPS of α-syn is highly sensitive to its sequence complexity. In fact, we find that even a highly conservative mutation (V16I) that increases sequence complexity without perturbing physicochemical and structural properties, is sufficient to reduce LLPS by 75%; this effect can be reversed by an adjacent V-to-I mutation (V15I) that restores the original sequence complexity. A18T, a complexity-enhancing PD-associated mutation, was likewise found to reduce LLPS, implicating sequence complexity in α-syn pathogenicity. Furthermore, leveraging the differences in LLPS propensities among different α-syn variants, we demonstrate that fibrillation of α-syn does not necessarily correlate with its LLPS. In fact, we identify mutations that selectively perturb LLPS or fibrillation of α-syn, unlike previously studied mutations. The variants and design principles reported herein should therefore empower future studies to disentangle these two phenomena and distinguish their (patho)biological roles.
10
Makarova KS, Zhang C, Wolf YI, Karamycheva S, Whitaker RJ, Koonin EV. Computational analysis of genes with lethal knockout phenotype and prediction of essential genes in archaea. mBio 2024; 15:e0309223. [PMID: 38189270 PMCID: PMC10865827 DOI: 10.1128/mbio.03092-23] [Received: 11/15/2023] [Accepted: 11/27/2023] [Indexed: 01/09/2024] [Open Access]
Abstract
The identification of microbial genes essential for survival as those with lethal knockout phenotype (LKP) is a common strategy for functional interrogation of genomes. However, interpretation of the LKP is complicated because a substantial fraction of the genes with this phenotype remains poorly functionally characterized. Furthermore, many genes can exhibit LKP not because their products perform essential cellular functions but because their knockout activates the toxicity of other genes (conditionally essential genes). We analyzed the sets of LKP genes for two archaea, Methanococcus maripaludis and Sulfolobus islandicus, using a variety of computational approaches aiming to differentiate between essential and conditionally essential genes and to predict at least a general function for as many of the proteins encoded by these genes as possible. This analysis allowed us to predict the functions of several LKP genes, including previously uncharacterized subunits of the GINS protein complex, which has an essential function in genome replication, and of the KEOPS complex, which is responsible for an essential tRNA modification, as well as GRP protease, which is implicated in protein quality control. Additionally, several novel antitoxins (conditionally essential genes) were predicted, and this prediction was experimentally validated by showing that the deletion of these genes together with the adjacent genes apparently encoding the cognate toxins caused no growth defect. We applied principal component analysis based on sequence and comparative genomic features, showing that this approach can separate essential genes from conditionally essential ones, and used it to predict essential genes in other archaeal genomes.
Importance
Only a relatively small fraction of the genes in any bacterium or archaeon is essential for survival as demonstrated by the lethal effect of their disruption. The identification of essential genes and their functions is crucial for understanding fundamental cell biology. However, many of the genes with a lethal knockout phenotype remain poorly functionally characterized, and furthermore, many genes can exhibit this phenotype not because their products perform essential cellular functions but because their knockout activates the toxicity of other genes. We applied state-of-the-art computational methods to predict the functions of a number of uncharacterized genes with the lethal knockout phenotype in two archaeal species and developed a computational approach to predict genes involved in essential functions. These findings advance the current understanding of key functionalities of archaeal cells.
Affiliation(s)
- Kira S. Makarova
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
- Changyi Zhang
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Yuri I. Wolf
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
- Svetlana Karamycheva
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
- Rachel J. Whitaker
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Eugene V. Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
11
Harrison PM. Optimizing strategy for the discovery of compositionally-biased or low-complexity regions in proteins. Sci Rep 2024; 14:680. [PMID: 38182699 PMCID: PMC10770407 DOI: 10.1038/s41598-023-50991-8] [Received: 08/25/2023] [Accepted: 12/28/2023] [Indexed: 01/07/2024] [Open Access]
Abstract
Proteins can contain tracts that are dominated by a subset of amino acids and have functional significance. These are often termed 'low-complexity regions' (LCRs) or 'compositionally-biased regions' (CBRs). However, a wide spectrum of compositional bias is possible, and program parameters used to annotate these regions are often arbitrarily chosen. Also, investigators are sometimes interested in longer regions, or sometimes very short ones. Here, two programs for annotating LCRs/CBRs, namely SEG and fLPS, are investigated in detail across the whole expanse of their parameter spaces. In doing so, boundary behaviours are resolved that are used to derive an optimized systematic strategy for annotating LCRs/CBRs. Sets of parameters that progressively annotate or 'cover' more of protein sequence space and are optimized for a given target length have been derived. This progressive annotation can be applied to discern the biological relevance of CBRs, e.g., in parsing domains for experimental constructs and in generating hypotheses. It is also useful for picking out candidate regions of interest of a given target length and bias signature, and for assessing the parameter dependence of annotations. This latter application is demonstrated for a set of human intrinsically-disordered proteins associated with cancer.
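As a rough illustration of why annotations depend on parameter choice, the sketch below flags low-complexity stretches with a sliding-window Shannon entropy. It is neither the SEG nor the fLPS algorithm: it is a deliberately simplified stand-in, and the function names, the default `window` and `max_entropy` values, and the toy sequence are assumptions made for this example only.

```python
import math
from collections import Counter


def shannon_entropy(segment: str) -> float:
    """Shannon entropy (in bits) of the residue composition of a segment."""
    n = len(segment)
    counts = Counter(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_windows(seq: str, window: int = 12, max_entropy: float = 1.0):
    """Return merged half-open (start, end) intervals covered by windows
    whose composition entropy is at most max_entropy.

    Both defaults are arbitrary illustrative choices, not recommended values.
    """
    regions = []
    for i in range(len(seq) - window + 1):
        if shannon_entropy(seq[i:i + window]) <= max_entropy:
            if regions and i <= regions[-1][1]:
                # Overlaps or touches the previous region: extend it.
                regions[-1] = (regions[-1][0], i + window)
            else:
                regions.append((i, i + window))
    return regions


# Toy sequence (made up): a 15-residue glutamine tract between mixed flanks.
seq = "MKTAYIAKER" + "Q" * 15 + "LEDSGHWVNP"
regions = low_complexity_windows(seq)
for start, end in regions:
    print(start, end, seq[start:end])
```

Raising `max_entropy` or shortening `window` changes which stretches are reported, which is the kind of parameter dependence the abstract sets out to map systematically for SEG and fLPS.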
Affiliation(s)
- Paul M Harrison
- Department of Biology, McGill University, Montreal, QC, Canada.
|
12
|
Rich KD, Srivastava S, Muthye VR, Wasmuth JD. Identification of potential molecular mimicry in pathogen-host interactions. PeerJ 2023; 11:e16339. [PMID: 37953771 PMCID: PMC10637249 DOI: 10.7717/peerj.16339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 10/02/2023] [Indexed: 11/14/2023] Open
Abstract
Pathogens have evolved sophisticated strategies to manipulate host signaling pathways, including the phenomenon of molecular mimicry, where pathogen-derived biomolecules imitate host biomolecules. In this study, we resurrected, updated, and optimized a sequence-based bioinformatics pipeline to identify potential molecular mimicry candidates between humans and 32 pathogenic species whose proteomes' 3D structure predictions were available at the start of this study. We observed considerable variation in the number of mimicry candidates across pathogenic species, with pathogenic bacteria exhibiting fewer candidates compared to fungi and protozoans. Further analysis revealed that the candidate mimicry regions were enriched in solvent-accessible regions, highlighting their potential functional relevance. We identified a total of 1,878 mimicked regions in 1,439 human proteins, and clustering analysis indicated diverse target proteins across pathogen species. The human proteins containing mimicked regions revealed significant associations between these proteins and various biological processes, with an emphasis on host extracellular matrix organization and cytoskeletal processes. However, immune-related proteins were underrepresented as targets of mimicry. Our findings provide insights into the broad range of host-pathogen interactions mediated by molecular mimicry and highlight potential targets for further investigation. This comprehensive analysis contributes to our understanding of the complex mechanisms employed by pathogens to subvert host defenses and we provide a resource to assist researchers in the development of novel therapeutic strategies.
Affiliation(s)
- Kaylee D. Rich
- Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
- Host-Parasite Interactions Research Training Network, University of Calgary, Calgary, Alberta, Canada
| | - Shruti Srivastava
- Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
- Host-Parasite Interactions Research Training Network, University of Calgary, Calgary, Alberta, Canada
| | - Viraj R. Muthye
- Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
- Host-Parasite Interactions Research Training Network, University of Calgary, Calgary, Alberta, Canada
| | - James D. Wasmuth
- Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
- Host-Parasite Interactions Research Training Network, University of Calgary, Calgary, Alberta, Canada
|
13
|
Orlov YL, Orlova NG. Bioinformatics tools for the sequence complexity estimates. Biophys Rev 2023; 15:1367-1378. [PMID: 37974990 PMCID: PMC10643780 DOI: 10.1007/s12551-023-01140-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/15/2023] [Accepted: 09/01/2023] [Indexed: 11/19/2023] Open
Abstract
We review current methods and bioinformatics tools for the text complexity estimates (information and entropy measures). The search DNA regions with extreme statistical characteristics such as low complexity regions are important for biophysical models of chromosome function and gene transcription regulation in genome scale. We discuss the complexity profiling for segmentation and delineation of genome sequences, search for genome repeats and transposable elements, and applications to next-generation sequencing reads. We review the complexity methods and new applications fields: analysis of mutation hotspots loci, analysis of short sequencing reads with quality control, and alignment-free genome comparisons. The algorithms implementing various numerical measures of text complexity estimates including combinatorial and linguistic measures have been developed before genome sequencing era. The series of tools to estimate sequence complexity use compression approaches, mainly by modification of Lempel-Ziv compression. Most of the tools are available online providing large-scale service for whole genome analysis. Novel machine learning applications for classification of complete genome sequences also include sequence compression and complexity algorithms. We present comparison of the complexity methods on the different sequence sets, the applications for gene transcription regulatory regions analysis. Furthermore, we discuss approaches and application of sequence complexity for proteins. The complexity measures for amino acid sequences could be calculated by the same entropy and compression-based algorithms. But the functional and evolutionary roles of low complexity regions in protein have specific features differing from DNA. The tools for protein sequence complexity aimed for protein structural constraints. It was shown that low complexity regions in protein sequences are conservative in evolution and have important biological and structural functions. 
Finally, we summarize recent findings in large scale genome complexity comparison and applications for coronavirus genome analysis.
Affiliation(s)
- Yuriy L. Orlov
- The Digital Health Institute, I.M. Sechenov First Moscow State Medical University of the Russian Ministry of Health (Sechenov University), Moscow, 119991 Russia
- Institute of Cytology and Genetics SB RAS, 630090 Novosibirsk, Russia
- Agrarian and Technological Institute, Peoples’ Friendship University of Russia, 117198 Moscow, Russia
| | - Nina G. Orlova
- Department of Mathematics, Financial University under the Government of the Russian Federation, Moscow, 125167 Russia
|
14
|
Tripathi S, Shirnekhi HK, Gorman SD, Chandra B, Baggett DW, Park CG, Somjee R, Lang B, Hosseini SMH, Pioso BJ, Li Y, Iacobucci I, Gao Q, Edmonson MN, Rice SV, Zhou X, Bollinger J, Mitrea DM, White MR, McGrail DJ, Jarosz DF, Yi SS, Babu MM, Mullighan CG, Zhang J, Sahni N, Kriwacki RW. Defining the condensate landscape of fusion oncoproteins. Nat Commun 2023; 14:6008. [PMID: 37770423 PMCID: PMC10539325 DOI: 10.1038/s41467-023-41655-2] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 11/10/2022] [Accepted: 09/13/2023] [Indexed: 09/30/2023] Open
Abstract
Fusion oncoproteins (FOs) arise from chromosomal translocations in ~17% of cancers and are often oncogenic drivers. Although some FOs can promote oncogenesis by undergoing liquid-liquid phase separation (LLPS) to form aberrant biomolecular condensates, the generality of this phenomenon is unknown. We explored this question by testing 166 FOs in HeLa cells and found that 58% formed condensates. The condensate-forming FOs displayed physicochemical features distinct from those of condensate-negative FOs and segregated into distinct feature-based groups that aligned with their sub-cellular localization and biological function. Using Machine Learning, we developed a predictor of FO condensation behavior, and discovered that 67% of ~3000 additional FOs likely form condensates, with 35% of those predicted to function by altering gene expression. 47% of the predicted condensate-negative FOs were associated with cell signaling functions, suggesting a functional dichotomy between condensate-positive and -negative FOs. Our Datasets and reagents are rich resources to interrogate FO condensation in the future.
Affiliation(s)
- Swarnendu Tripathi
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Hazheen K Shirnekhi
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Scott D Gorman
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Arrakis Therapeutics, 830 Winter St, Waltham, MA, 02451, USA
| | - Bappaditya Chandra
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - David W Baggett
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Cheon-Gil Park
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Ramiz Somjee
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Rhodes College, Memphis, TN, USA
- Washington University School of Medicine, 660 South Euclid Avenue, St. Louis, MO, 63110, USA
| | - Benjamin Lang
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Center of Excellence for Data-Driven Discovery, Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Seyed Mohammad Hadi Hosseini
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Center of Excellence for Data-Driven Discovery, Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Brittany J Pioso
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Yongsheng Li
- Livestrong Cancer Institutes, Department of Oncology, Dell Medical School, The University of Texas at Austin, Austin, TX, 78712, USA
| | - Ilaria Iacobucci
- Department of Pathology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Qingsong Gao
- Department of Pathology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Michael N Edmonson
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Stephen V Rice
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Xin Zhou
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - John Bollinger
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Diana M Mitrea
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Dewpoint Therapeutics, 451 D Street, Suite 104, Boston, MA, 02210, USA
| | - Michael R White
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- IDEXX Laboratories, Inc., One IDEXX Drive, Westbrook, ME, 04092, USA
| | - Daniel J McGrail
- Center for Immunotherapy and Precision Immuno-Oncology, Cleveland Clinic, Cleveland, OH, USA
- Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
| | - Daniel F Jarosz
- Department of Chemical and Systems Biology, Stanford University School of Medicine, Stanford, CA, USA
- Department of Developmental Biology, Stanford University School of Medicine, Stanford, CA, USA
| | - S Stephen Yi
- Livestrong Cancer Institutes, Department of Oncology, Dell Medical School, The University of Texas at Austin, Austin, TX, 78712, USA
- Department of Biomedical Engineering, and Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX, USA
| | - M Madan Babu
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Center of Excellence for Data-Driven Discovery, Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Charles G Mullighan
- Department of Pathology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Jinghui Zhang
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Nidhi Sahni
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, TX, USA
| | - Richard W Kriwacki
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA.
- Department of Microbiology, Immunology and Biochemistry, University of Tennessee Health Sciences Center, Memphis, TN, USA.
|
15
|
Sutanto K, Turcotte M. Assessing Global-Local Secondary Structure Fingerprints to Classify RNA Sequences With Deep Learning. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:2736-2747. [PMID: 34633933 DOI: 10.1109/tcbb.2021.3118358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
RNA elements that are transcribed but not translated into proteins are called non-coding RNAs (ncRNAs). They play wide-ranging roles in biological processes and disorders. Just like proteins, their structure is often intimately linked to their function. Many examples have been documented where structure is conserved across taxa despite sequence divergence. Thus, structure is often used to identify function. Specifically, the secondary structure is predicted and ncRNAs with similar structures are assumed to have same or similar functions. However, a strand of RNA can fold into multiple possible structures, and some strands even fold differently in vivo and in vitro. Furthermore, ncRNAs often function as RNA-protein complexes, which can affect structure. Because of these, we hypothesized using one structure per sequence may discard information, possibly resulting in poorer classification accuracy. Therefore, we propose using secondary structure fingerprints, comprising two categories: a higher-level representation derived from RNA-As-Graphs (RAG), and free energy fingerprints based on a curated repertoire of small structural motifs. The fingerprints take into account the difference between global and local structural matches. We also evaluated our deep learning architecture with k-mers. By combining our global-local fingerprints with 6-mer, we achieved an accuracy, precision, and recall of 91.04%, 91.10%, and 91.00%.
|
16
|
Ribolla LM, Sala K, Tonoli D, Ramella M, Bracaglia L, Bonomo I, Gonnelli L, Lamarca A, Brindisi M, Pierattelli R, Provenzani A, de Curtis I. Interfering with the ERC1-LL5β interaction disrupts plasma membrane-Associated platforms and affects tumor cell motility. PLoS One 2023; 18:e0287670. [PMID: 37437062 DOI: 10.1371/journal.pone.0287670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Accepted: 06/10/2023] [Indexed: 07/14/2023] Open
Abstract
Cell migration requires a complex array of molecular events to promote protrusion at the front of motile cells. The scaffold protein LL5β interacts with the scaffold ERC1, and recruits it at plasma membrane-associated platforms that form at the front of migrating tumor cells. LL5 and ERC1 proteins support protrusion during migration as shown by the finding that depletion of either endogenous protein impairs tumor cell motility and invasion. In this study we have tested the hypothesis that interfering with the interaction between LL5β and ERC1 may be used to interfere with the function of the endogenous proteins to inhibit tumor cell migration. For this, we identified ERC1(270-370) and LL5β(381-510) as minimal fragments required for the direct interaction between the two proteins. The biochemical characterization demonstrated that the specific regions of the two proteins, including predicted intrinsically disordered regions, are implicated in a reversible, high affinity direct heterotypic interaction. NMR spectroscopy further confirmed the disordered nature of the two fragments and also support the occurrence of interaction between them. We tested if the LL5β protein fragment interferes with the formation of the complex between the two full-length proteins. Coimmunoprecipitation experiments showed that LL5β(381-510) hampers the formation of the complex in cells. Moreover, expression of either fragment is able to specifically delocalize endogenous ERC1 from the edge of migrating MDA-MB-231 tumor cells. Coimmunoprecipitation experiments show that the ERC1-binding fragment of LL5β interacts with endogenous ERC1 and interferes with the binding of endogenous ERC1 to full length LL5β. Expression of LL5β(381-510) affects tumor cell motility with a reduction in the density of invadopodia and inhibits transwell invasion. 
These results provide a proof of principle that interfering with heterotypic intermolecular interactions between components of plasma membrane-associated platforms forming at the front of tumor cells may represent a new approach to inhibit cell invasion.
Affiliation(s)
- Lucrezia Maria Ribolla
- Vita-Salute San Raffaele University and San Raffaele Scientific Institute, Milano, Italy
| | - Kristyna Sala
- Vita-Salute San Raffaele University and San Raffaele Scientific Institute, Milano, Italy
| | - Diletta Tonoli
- Vita-Salute San Raffaele University and San Raffaele Scientific Institute, Milano, Italy
| | - Martina Ramella
- Vita-Salute San Raffaele University and San Raffaele Scientific Institute, Milano, Italy
| | - Lorenzo Bracaglia
- Department of Chemistry "Ugo Schiff" and Magnetic Resonance Center, University of Florence, Sesto Fiorentino (Florence), Italy
| | - Isabelle Bonomo
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | - Leonardo Gonnelli
- Department of Chemistry "Ugo Schiff" and Magnetic Resonance Center, University of Florence, Sesto Fiorentino (Florence), Italy
| | - Andrea Lamarca
- Vita-Salute San Raffaele University and San Raffaele Scientific Institute, Milano, Italy
| | - Matteo Brindisi
- Vita-Salute San Raffaele University and San Raffaele Scientific Institute, Milano, Italy
| | - Roberta Pierattelli
- Department of Chemistry "Ugo Schiff" and Magnetic Resonance Center, University of Florence, Sesto Fiorentino (Florence), Italy
| | - Alessandro Provenzani
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | - Ivan de Curtis
- Vita-Salute San Raffaele University and San Raffaele Scientific Institute, Milano, Italy
|
17
|
Tayeb-Fligelman E, Bowler JT, Tai CE, Sawaya MR, Jiang YX, Garcia G, Griner SL, Cheng X, Salwinski L, Lutter L, Seidler PM, Lu J, Rosenberg GM, Hou K, Abskharon R, Pan H, Zee CT, Boyer DR, Li Y, Anderson DH, Murray KA, Falcon G, Cascio D, Saelices L, Damoiseaux R, Arumugaswami V, Guo F, Eisenberg DS. Low complexity domains of the nucleocapsid protein of SARS-CoV-2 form amyloid fibrils. Nat Commun 2023; 14:2379. [PMID: 37185252 PMCID: PMC10127185 DOI: 10.1038/s41467-023-37865-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 07/29/2022] [Accepted: 04/03/2023] [Indexed: 05/17/2023] Open
Abstract
The self-assembly of the Nucleocapsid protein (NCAP) of SARS-CoV-2 is crucial for its function. Computational analysis of the amino acid sequence of NCAP reveals low-complexity domains (LCDs) akin to LCDs in other proteins known to self-assemble as phase separation droplets and amyloid fibrils. Previous reports have described NCAP's propensity to phase-separate. Here we show that the central LCD of NCAP is capable of both, phase separation and amyloid formation. Within this central LCD we identified three adhesive segments and determined the atomic structure of the fibrils formed by each. Those structures guided the design of G12, a peptide that interferes with the self-assembly of NCAP and demonstrates antiviral activity in SARS-CoV-2 infected cells. Our work, therefore, demonstrates the amyloid form of the central LCD of NCAP and suggests that amyloidogenic segments of NCAP could be targeted for drug development.
Affiliation(s)
- Einav Tayeb-Fligelman
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Howard Hughes Medical Institute, Los Angeles, CA, 90095, USA
| | - Jeannette T Bowler
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Howard Hughes Medical Institute, Los Angeles, CA, 90095, USA
| | - Christen E Tai
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
| | - Michael R Sawaya
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Howard Hughes Medical Institute, Los Angeles, CA, 90095, USA
- UCLA-DOE Institute of Genomics and Proteomics, UCLA, Los Angeles, CA, 90095, USA
| | - Yi Xiao Jiang
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Howard Hughes Medical Institute, Los Angeles, CA, 90095, USA
| | - Gustavo Garcia
- Department of Molecular and Medical Pharmacology, UCLA, Los Angeles, CA, 90095, USA
| | - Sarah L Griner
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Howard Hughes Medical Institute, Los Angeles, CA, 90095, USA
| | - Xinyi Cheng
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Howard Hughes Medical Institute, Los Angeles, CA, 90095, USA
| | - Lukasz Salwinski
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- UCLA-DOE Institute of Genomics and Proteomics, UCLA, Los Angeles, CA, 90095, USA
| | - Liisa Lutter
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Howard Hughes Medical Institute, Los Angeles, CA, 90095, USA
| | - Paul M Seidler
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Department of Pharmacology and Pharmaceutical Sciences, University of Southern California School of Pharmacy, Los Angeles, CA, 90089-9121, USA
| | - Jiahui Lu
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Howard Hughes Medical Institute, Los Angeles, CA, 90095, USA
| | - Gregory M Rosenberg
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Howard Hughes Medical Institute, Los Angeles, CA, 90095, USA
| | - Ke Hou
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Howard Hughes Medical Institute, Los Angeles, CA, 90095, USA
| | - Romany Abskharon
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Howard Hughes Medical Institute, Los Angeles, CA, 90095, USA
| | - Hope Pan
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Howard Hughes Medical Institute, Los Angeles, CA, 90095, USA
| | - Chih-Te Zee
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
| | - David R Boyer
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Howard Hughes Medical Institute, Los Angeles, CA, 90095, USA
| | - Yan Li
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
| | - Daniel H Anderson
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Howard Hughes Medical Institute, Los Angeles, CA, 90095, USA
| | - Kevin A Murray
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA
- Howard Hughes Medical Institute, Los Angeles, CA, 90095, USA
| | - Genesis Falcon
- UCLA-DOE Institute of Genomics and Proteomics, UCLA, Los Angeles, CA, 90095, USA
| | - Duilio Cascio
- UCLA-DOE Institute of Genomics and Proteomics, UCLA, Los Angeles, CA, 90095, USA
| | - Lorena Saelices
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Center for Alzheimer's and Neurodegenerative Diseases, Department of Biophysics, Peter O'Donnell Jr. Brain Institute, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
| | - Robert Damoiseaux
- Department of Molecular and Medical Pharmacology, UCLA, Los Angeles, CA, 90095, USA
- Department of Bioengineering, UCLA, Los Angeles, CA, 90095, USA
- California NanoSystems Institute, UCLA, Los Angeles, CA, 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, UCLA, Los Angeles, CA, 90095, USA
- Jonsson Comprehensive Cancer Center, UCLA, Los Angeles, CA, 90095, USA
| | - Vaithilingaraja Arumugaswami
- Department of Molecular and Medical Pharmacology, UCLA, Los Angeles, CA, 90095, USA
- California NanoSystems Institute, UCLA, Los Angeles, CA, 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, UCLA, Los Angeles, CA, 90095, USA
| | - Feng Guo
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA
- Jonsson Comprehensive Cancer Center, UCLA, Los Angeles, CA, 90095, USA
| | - David S Eisenberg
- Department of Biological Chemistry, UCLA, Los Angeles, CA, 90095, USA.
- Molecular Biology Institute, UCLA, Los Angeles, CA, 90095, USA.
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA, 90095, USA.
- Howard Hughes Medical Institute, Los Angeles, CA, 90095, USA.
- UCLA-DOE Institute of Genomics and Proteomics, UCLA, Los Angeles, CA, 90095, USA.
- California NanoSystems Institute, UCLA, Los Angeles, CA, 90095, USA.
|
18
|
Roux S, Camargo AP, Coutinho FH, Dabdoub SM, Dutilh BE, Nayfach S, Tritt A. iPHoP: An integrated machine learning framework to maximize host prediction for metagenome-derived viruses of archaea and bacteria. PLoS Biol 2023; 21:e3002083. [PMID: 37083735 PMCID: PMC10155999 DOI: 10.1371/journal.pbio.3002083] [Citation(s) in RCA: 117] [Impact Index Per Article: 58.5] [Received: 08/30/2022] [Revised: 05/03/2023] [Accepted: 03/15/2023] [Indexed: 04/22/2023] Open
Abstract
The extraordinary diversity of viruses infecting bacteria and archaea is now primarily studied through metagenomics. While metagenomes enable high-throughput exploration of the viral sequence space, metagenome-derived sequences lack key information compared to isolated viruses, in particular host association. Different computational approaches are available to predict the host(s) of uncultivated viruses based on their genome sequences, but thus far individual approaches are limited either in precision or in recall, i.e., for a number of viruses they yield erroneous predictions or no prediction at all. Here, we describe iPHoP, a two-step framework that integrates multiple methods to reliably predict host taxonomy at the genus rank for a broad range of viruses infecting bacteria and archaea, while retaining a low false discovery rate. Based on a large dataset of metagenome-derived virus genomes from the IMG/VR database, we illustrate how iPHoP can provide extensive host prediction and guide further characterization of uncultivated viruses.
Affiliation(s)
- Simon Roux
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | - Antonio Pedro Camargo
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | | | - Shareef M Dabdoub
- Division of Biostatistics and Computational Biology, University of Iowa College of Dentistry, Iowa City, Iowa, United States of America
| | - Bas E Dutilh
- Institute of Biodiversity, Faculty of Biological Sciences, Cluster of Excellence Balance of the Microverse, Friedrich Schiller University, Jena, Germany
- Theoretical Biology and Bioinformatics, Science for Life, Utrecht University, Utrecht, the Netherlands
| | - Stephen Nayfach
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | - Andrew Tritt
- Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
|
19
|
Oh C, Buckley PM, Choi J, Hierro A, DiMaio D. Sequence independent activity of a predicted long disordered segment of the human papillomavirus L2 capsid protein during virus entry. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.21.533711. [PMID: 36993745 PMCID: PMC10055320 DOI: 10.1101/2023.03.21.533711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/31/2023]
Abstract
The papillomavirus L2 capsid protein protrudes through the endosome membrane into the cytoplasm during virus entry to bind cellular factors required for intracellular virus trafficking. Cytoplasmic protrusion of HPV16 L2, virus trafficking, and infectivity are inhibited by large deletions in an ∼110 amino acid segment of L2 that is predicted to be disordered. The activity of these mutants can be restored by inserting protein segments with diverse compositions and chemical properties into this region, including scrambled sequences, a tandem array of a short sequence, and the intrinsically disordered region of a cellular protein. The infectivity of mutants with small in-frame insertions and deletions in this segment directly correlates with the size of the segment. These results indicate that the length of the disordered segment, not its sequence or its composition, determines its activity during virus entry. Sequence independent but length dependent activity has important implications for protein function and evolution.
|
20
|
Sun YH, Cui H, Song C, Shen JT, Zhuo X, Wang RH, Yu X, Ndamba R, Mu Q, Gu H, Wang D, Murthy GG, Li P, Liang F, Liu L, Tao Q, Wang Y, Orlowski S, Xu Q, Zhou H, Jagne J, Gokcumen O, Anthony N, Zhao X, Li XZ. Amniotes co-opt intrinsic genetic instability to protect germ-line genome integrity. Nat Commun 2023; 14:812. [PMID: 36781861 PMCID: PMC9925758 DOI: 10.1038/s41467-023-36354-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/14/2022] [Accepted: 01/27/2023] [Indexed: 02/15/2023] Open
Abstract
Unlike PIWI-interacting RNA (piRNA) in other species that mostly target transposable elements (TEs), >80% of piRNAs in adult mammalian testes lack obvious targets. However, mammalian piRNA sequences and piRNA-producing loci evolve more rapidly than the rest of the genome for unknown reasons. Here, through comparative studies of chickens, ducks, mice, and humans, as well as long-read nanopore sequencing on diverse chicken breeds, we find that piRNA loci across amniotes experience: (1) a high local mutation rate of structural variations (SVs, mutations ≥ 50 bp in size); (2) positive selection to suppress young and actively mobilizing TEs commencing at the pachytene stage of meiosis during germ cell development; and (3) negative selection to purge deleterious SV hotspots. Our results indicate that genetic instability at pachytene piRNA loci, while producing certain pathogenic SVs, also protects genome integrity against TE mobilization by driving the formation of rapid-evolving piRNA sequences.
Affiliation(s)
- Yu H Sun
- Center for RNA Biology: From Genome to Therapeutics, Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, 14642, USA
| | - Hongxiao Cui
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, 712100, China
| | - Chi Song
- College of Public Health, Division of Biostatistics, The Ohio State University, Columbus, OH, 43210, USA
| | - Jiafei Teng Shen
- International Institutes of Medicine, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu, Zhejiang, 322000, China
| | - Xiaoyu Zhuo
- Department of Genetics, The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, 63110, USA
| | - Ruoqiao Huiyi Wang
- Center for RNA Biology: From Genome to Therapeutics, Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, 14642, USA
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, 712100, China
| | - Xiaohui Yu
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, 712100, China
| | - Rudo Ndamba
- Center for RNA Biology: From Genome to Therapeutics, Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, 14642, USA
| | - Qian Mu
- Center for RNA Biology: From Genome to Therapeutics, Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, 14642, USA
| | - Hanwen Gu
- Center for RNA Biology: From Genome to Therapeutics, Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, 14642, USA
| | - Duolin Wang
- Center for RNA Biology: From Genome to Therapeutics, Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, 14642, USA
| | - Gayathri Guru Murthy
- Center for RNA Biology: From Genome to Therapeutics, Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, 14642, USA
| | - Pidong Li
- Grandomics Biosciences Co., Ltd, Beijing, 102206, China
| | - Fan Liang
- Grandomics Biosciences Co., Ltd, Beijing, 102206, China
| | - Lei Liu
- Grandomics Biosciences Co., Ltd, Beijing, 102206, China
| | - Qing Tao
- Grandomics Biosciences Co., Ltd, Beijing, 102206, China
| | - Ying Wang
- Department of Animal Science, University of California, Davis, CA, 95616, USA
| | - Sara Orlowski
- Department of Poultry Science, University of Arkansas, Fayetteville, AR, 72701, USA
| | - Qi Xu
- Department of Animal Science, McGill University, Quebec, H9X 3V9, Canada
| | - Huaijun Zhou
- Department of Animal Science, University of California, Davis, CA, 95616, USA
| | - Jarra Jagne
- Animal Health Diagnostic Center, Cornell University College of Veterinary Medicine, Ithaca, NY, 14850, USA
| | - Omer Gokcumen
- Department of Biological Sciences, University at Buffalo, State University of New York, Buffalo, NY, 14260, USA
| | - Nick Anthony
- Department of Poultry Science, University of Arkansas, Fayetteville, AR, 72701, USA
| | - Xin Zhao
- Department of Animal Science, McGill University, Quebec, H9X 3V9, Canada
| | - Xin Zhiguo Li
- Center for RNA Biology: From Genome to Therapeutics, Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, 14642, USA
| |
|
21
|
Brownsword MJ, Locker N. A little less aggregation a little more replication: Viral manipulation of stress granules. WILEY INTERDISCIPLINARY REVIEWS. RNA 2023; 14:e1741. [PMID: 35709333 PMCID: PMC10078398 DOI: 10.1002/wrna.1741] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 03/02/2022] [Revised: 04/29/2022] [Accepted: 05/05/2022] [Indexed: 01/31/2023]
Abstract
Recent exciting studies have uncovered how membrane-less organelles, also known as biocondensates, are providing cells with rapid response pathways, allowing them to re-organize their cellular contents and adapt to stressful conditions. Their assembly is driven by the phase separation of their RNAs and intrinsically disordered protein components into condensed foci. Among these, stress granules (SGs) are dynamic cytoplasmic biocondensates that form in response to many stresses, including activation of the integrated stress response or viral infections. SGs sit at the crossroads between antiviral signaling and translation because they concentrate signaling proteins and components of the innate immune response, in addition to translation machinery and stalled mRNAs. Consequently, they have been proposed to contribute to antiviral activities, and therefore are targeted by viral countermeasures. Equally, SG components can be commandeered by viruses for their own efficient replication. Phase separation processes are an important component of the viral life cycle, for example, driving the assembly of replication factories or inclusion bodies. Therefore, in this review, we will outline the recent understanding of this complex interplay and tug of war between viruses, SGs, and their components. This article is categorized under: RNA in Disease and Development > RNA in Disease; Translation > Regulation; RNA Interactions with Proteins and Other Molecules > RNA-Protein Complexes.
Affiliation(s)
- Matthew J. Brownsword
- Faculty of Health and Medical Sciences, School of Biosciences and Medicine, University of Surrey, Guildford, Surrey, UK
| | - Nicolas Locker
- Faculty of Health and Medical Sciences, School of Biosciences and Medicine, University of Surrey, Guildford, Surrey, UK
| |
|
22
|
Shimizu K, Negishi L, Ito T, Touma S, Matsumoto T, Awaji M, Kurumizaka H, Yoshitake K, Kinoshita S, Asakawa S, Suzuki M. Evolution of nacre- and prisms-related shell matrix proteins in the pen shell, Atrina pectinata. COMPARATIVE BIOCHEMISTRY AND PHYSIOLOGY. PART D, GENOMICS & PROTEOMICS 2022; 44:101025. [PMID: 36075178 DOI: 10.1016/j.cbd.2022.101025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2022] [Revised: 08/24/2022] [Accepted: 08/24/2022] [Indexed: 01/27/2023]
Abstract
The molluscan shell is a good model for understanding the mechanisms underlying biomineralization. It is composed of calcium carbonate crystals and many types of organic molecules, such as matrix proteins, polysaccharides, and lipids. The pen shell Atrina pectinata (Pterioida, Pinnidae) has two shell microstructures: an outer prismatic layer and an inner nacreous layer. Similar microstructures are well known in pearl oysters (Pteriidae), such as Pinctada fucata, and many kinds of shell matrix proteins (SMPs) have been identified from their shells. However, the SMPs that make up the nacreous and prismatic layers of Pinnidae bivalves remain unclear. In this study, we identified 114 SMPs in the nacreous and prismatic layers of A. pectinata, of which only seven were found in both microstructures. Of these, 54 were found to bind calcium carbonate. Comparative analysis of nine molluscan shell proteomes showed that 69 of the 114 SMPs of A. pectinata have sequence similarity with at least one SMP of another molluscan species. For instance, nacrein, tyrosinase, Pif/BMSP-like, chitinase (CN), chitin-binding proteins, CD109, and Kunitz-type serine proteinase inhibitors are widely shared among bivalves and gastropods. Our results provide new insights for understanding the complex evolution of SMPs related to nacreous and prismatic layer formation in the pteriomorph bivalves.
Affiliation(s)
- Keisuke Shimizu
- Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo, Tokyo 113-8657, Japan
| | - Lumi Negishi
- Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo, Tokyo 113-8657, Japan
| | - Takumi Ito
- Department of Aquatic Bioscience, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo, Tokyo 113-8657, Japan
| | - Shogo Touma
- Department of Aquatic Bioscience, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo, Tokyo 113-8657, Japan
| | - Toshie Matsumoto
- National Research Institute of Aquaculture, Japan Fisheries Research and Education Agency, 422-1 Nakatsuhama, Minami-Ise, Watarai, Mie 516-0193, Japan
| | - Masahiko Awaji
- National Research Institute of Aquaculture, Japan Fisheries Research and Education Agency, 422-1 Nakatsuhama, Minami-Ise, Watarai, Mie 516-0193, Japan
| | - Hitoshi Kurumizaka
- Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo, Tokyo 113-8657, Japan
| | - Kazutoshi Yoshitake
- Department of Aquatic Bioscience, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo, Tokyo 113-8657, Japan
| | - Shigeharu Kinoshita
- Department of Aquatic Bioscience, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo, Tokyo 113-8657, Japan
| | - Shuichi Asakawa
- Department of Aquatic Bioscience, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo, Tokyo 113-8657, Japan
| | - Michio Suzuki
- Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo, Tokyo 113-8657, Japan
| |
|
23
|
Mier P, Elena-Real CA, Cortés J, Bernadó P, Andrade-Navarro MA. The sequence context in poly-alanine regions: structure, function and conservation. Bioinformatics 2022; 38:4851-4858. [PMID: 36106994 PMCID: PMC9620824 DOI: 10.1093/bioinformatics/btac610] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/05/2022] [Revised: 07/07/2022] [Accepted: 09/05/2022] [Indexed: 11/24/2022] Open
Abstract
MOTIVATION Poly-alanine (polyA) regions are protein stretches mostly composed of alanines. Despite their abundance in eukaryotic proteomes and their association to nine inherited human diseases, the structural and functional roles exerted by polyA stretches remain poorly understood. In this work we study how the amino acid context in which polyA regions are settled in proteins influences their structure and function. RESULTS We identified glycine and proline as the most abundant amino acids within polyA and in the flanking regions of polyA tracts, in human proteins as well as in 17 additional eukaryotic species. Our analyses indicate that the non-structuring nature of these two amino acids influences the α-helical conformations predicted for polyA, suggesting a relevant role in reducing the inherent aggregation propensity of long polyA. Then, we show how polyA position in protein N-termini relates with their function as transit peptides. PolyA placed just after the initial methionine is often predicted as part of mitochondrial transit peptides, whereas when placed in downstream positions, polyA are part of signal peptides. A few examples from known structures suggest that short polyA can emerge by alanine substitutions in α-helices; but evolution by insertion is observed for longer polyA. Our results showcase the importance of studying the sequence context of homorepeats as a mechanism to shape their structure-function relationships. AVAILABILITY AND IMPLEMENTATION The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Pablo Mier
- Faculty of Biology, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, 55128 Mainz, Germany
| | - Carlos A Elena-Real
- Centre de Biologie Structurale (CBS), Université de Montpellier, INSERM, CNRS, 34090 Montpellier, France
| | - Juan Cortés
- LAAS-CNRS, Université de Toulouse, CNRS, Toulouse, France
| | - Pau Bernadó
- Centre de Biologie Structurale (CBS), Université de Montpellier, INSERM, CNRS, 34090 Montpellier, France
| | - Miguel A Andrade-Navarro
- Faculty of Biology, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, 55128 Mainz, Germany
| |
|
24
|
Wang X, Simon SM, Coffino P. Single molecule microscopy reveals diverse actions of substrate sequences that impair ClpX AAA+ ATPase function. J Biol Chem 2022; 298:102457. [PMID: 36064000 PMCID: PMC9531181 DOI: 10.1016/j.jbc.2022.102457] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/28/2022] [Revised: 08/25/2022] [Accepted: 08/26/2022] [Indexed: 10/28/2022] Open
Abstract
AAA+ (ATPases Associated with diverse cellular Activities) proteases unfold substrate proteins by pulling the substrate polypeptide through a narrow pore. To overcome the barrier to unfolding, substrates may require extended association with the ATPase. Failed unfolding attempts can lead to a slip of grip, which may result in substrate dissociation, but how substrate sequence affects slippage is unresolved. Here, we measured single molecule dwell time using total internal reflection fluorescence microscopy, scoring time-dependent dissociation of engaged substrates from bacterial AAA+ ATPase unfoldase/translocase ClpX. Substrates comprising a stable domain resistant to unfolding and a C-terminal unstructured tail, tagged with a degron for initiating translocase insertion, were used to determine dwell time in relation to tail length and composition. We found greater tail length promoted substrate retention during futile unfolding. Additionally, we tested two tail compositions known to frustrate unfolding. A poly-glycine tract (polyG) promoted release, but only when adjacent to the folded domain, whereas glycine-alanine repeats (GAr) did not promote release. A high complexity motif containing polar and charged residues also promoted release. We further investigated the impact of these and related motifs on substrate degradation rates and ATP consumption, using the unfoldase-protease complex ClpXP. Here, substrate domain stability modulates the effects of substrate tail sequences. polyG and GAr are both inhibitory for unfolding, but act in different ways. GAr motifs only negatively affected degradation of highly stable substrates, which is accompanied by reduced ClpXP ATPase activity. Together, our results specify substrate characteristics that affect unfolding and degradation by ClpXP.
Affiliation(s)
- Xiao Wang
- Laboratory of Cellular Biophysics, The Rockefeller University, New York, New York, USA
| | - Sanford M Simon
- Laboratory of Cellular Biophysics, The Rockefeller University, New York, New York, USA
| | - Philip Coffino
- Laboratory of Cellular Biophysics, The Rockefeller University, New York, New York, USA
| |
|
25
|
Karamycheva S, Wolf YI, Persi E, Koonin EV, Makarova KS. Analysis of lineage-specific protein family variability in prokaryotes combined with evolutionary reconstructions. Biol Direct 2022; 17:22. [PMID: 36042479 PMCID: PMC9425974 DOI: 10.1186/s13062-022-00337-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/18/2022] [Accepted: 08/13/2022] [Indexed: 12/24/2022] Open
Abstract
Background Evolutionary rate is a key characteristic of gene families that is linked to the functional importance of the respective genes as well as specific biological functions of the proteins they encode. Accurate estimation of evolutionary rates is a challenging task that requires precise phylogenetic analysis. Here we present an easy to estimate protein family level measure of sequence variability based on alignment column homogeneity in multiple alignments of protein sequences from Clade-Specific Clusters of Orthologous Genes (csCOGs). Results We report genome-wide estimates of variability for 8 diverse groups of bacteria and archaea and investigate the connection between variability and various genomic and biological features. The variability estimates are based on homogeneity distributions across amino acid sequence alignments and can be obtained for multiple groups of genomes at minimal computational expense. About half of the variance in variability values can be explained by the analyzed features, with the greatest contribution coming from the extent of gene paralogy in the given csCOG. The correlation between variability and paralogy appears to originate, primarily, not from gene duplication, but from acquisition of distant paralogs and xenologs, introducing sequence variants that are more divergent than those that could have evolved in situ during the lifetime of the given group of organisms. Both high-variability and low-variability csCOGs were identified in all functional categories, but as expected, proteins encoded by integrated mobile elements as well as proteins involved in defense functions and cell motility are, on average, more variable than proteins with housekeeping functions. 
Additionally, using linear discriminant analysis, we found that variability and fraction of genomes carrying a given gene are the two variables that provide the best prediction of gene essentiality as compared to the results of transposon mutagenesis in Sulfolobus islandicus. Conclusions Variability, a measure of sequence diversity within an alignment relative to the overall diversity within a group of organisms, offers a convenient proxy for evolutionary rate estimates and is informative with respect to prediction of functional properties of proteins. In particular, variability is a strong predictor of gene essentiality for the respective organisms and indicative of sub- or neofunctionalization of paralogs. Supplementary Information The online version contains supplementary material available at 10.1186/s13062-022-00337-7.
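The idea of scoring a protein family by how homogeneous its alignment columns are can be illustrated with a small sketch. This is not the authors' measure: the abstract does not give the formula, so the definitions below (column homogeneity as the share of the most common residue, family variability as one minus the mean homogeneity) and the function names `column_homogeneity` and `alignment_variability` are assumptions chosen only to show the general shape of such a calculation.

```python
from collections import Counter
from typing import List, Optional


def column_homogeneity(column: str, gap: str = "-") -> Optional[float]:
    """Share of the most common residue among non-gap residues (assumed definition)."""
    residues = [c for c in column if c != gap]
    if not residues:
        return None  # a gap-only column carries no information
    top_count = Counter(residues).most_common(1)[0][1]
    return top_count / len(residues)


def alignment_variability(alignment: List[str]) -> float:
    """One minus the mean column homogeneity over all informative columns."""
    if not alignment:
        raise ValueError("alignment is empty")
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        score = column_homogeneity("".join(column))
        if score is not None:
            scores.append(score)
    if not scores:
        raise ValueError("alignment has no informative columns")
    return 1.0 - sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy alignment: three columns are invariant, the last one is 2/3 homogeneous.
    print(alignment_variability(["ACDE", "ACDE", "ACDF"]))
```

A per-family score of this kind needs only one pass over each alignment, which is in the spirit of the abstract's remark that the estimates come at minimal computational expense; the published measure additionally normalizes against the overall diversity of the group of organisms, which this sketch does not attempt.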
Affiliation(s)
- Svetlana Karamycheva
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, 20894, USA
| | - Yuri I Wolf
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, 20894, USA
| | - Erez Persi
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, 20894, USA
| | - Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, 20894, USA
| | - Kira S Makarova
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, 20894, USA
| |
|
26
|
González-Tortuero E, Anthon C, Havgaard JH, Geissler AS, Breüner A, Hjort C, Gorodkin J, Seemann SE. The Bacillaceae-1 RNA motif comprises two distinct classes. Gene 2022; 841:146756. [PMID: 35905857 DOI: 10.1016/j.gene.2022.146756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Revised: 06/10/2022] [Accepted: 07/24/2022] [Indexed: 11/04/2022]
Abstract
Non-coding RNAs are key regulatory players in bacteria. Many computationally predicted non-coding RNAs, however, lack functional associations. An example is the Bacillaceae-1 RNA motif, whose Rfam model consists of two hairpin loops. We find the motif conserved in nine of 13 non-pathogenic strains of the genus Bacillus but only in one pathogenic strain. To elucidate functional characteristics, we studied 118 hits of the Rfam model in 11 Bacillus spp. and found two distinct classes based on the ensemble diversity of their RNA secondary structure and the genomic context concerning the ribosomal RNA (rRNA) cluster. Forty hits are associated with the rRNA cluster, of which all 19 hits upstream flanking of 16S rRNA have a reverse complementary structure of low structural diversity. Fifty-two hits have large ensemble diversity, of which 38 are located between two coding genes. For eight hits in Bacillus subtilis, we investigated public expression data under various conditions and observed either the forward or the reverse complementary motif expressed. Five hits are associated with the rRNA cluster. Four of them are located upstream of the 16S rRNA and are not transcriptionally active, but instead, their reverse complements with low structural diversity are expressed together with the rRNA cluster. The three other hits are located between two coding genes in non-conserved genomic loci. Two of them are independently expressed from their surrounding genes and are structurally diverse. In summary, we found that Bacillaceae-1 RNA motifs upstream flanking of ribosomal RNA clusters tend to have one stable structure with the reverse complementary motif expressed in B. subtilis. In contrast, a subgroup of intergenic motifs has the thermodynamic potential for structural switches.
Affiliation(s)
- Enrique González-Tortuero
- Center for non-coding RNA in Technology and Health (RTH), Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
| | - Christian Anthon
- Center for non-coding RNA in Technology and Health (RTH), Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
| | - Jakob H Havgaard
- Center for non-coding RNA in Technology and Health (RTH), Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
| | - Adrian S Geissler
- Center for non-coding RNA in Technology and Health (RTH), Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
| | - Jan Gorodkin
- Center for non-coding RNA in Technology and Health (RTH), Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
| | - Stefan E Seemann
- Center for non-coding RNA in Technology and Health (RTH), Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
| |
|
27
|
Shimizu K, Takeuchi T, Negishi L, Kurumizaka H, Kuriyama I, Endo K, Suzuki M. Evolution of EGF-like and Zona pellucida domains containing shell matrix proteins in mollusks. Mol Biol Evol 2022; 39:6633355. [PMID: 35796746 PMCID: PMC9290575 DOI: 10.1093/molbev/msac148] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022] Open
Abstract
Several types of shell matrix proteins (SMPs) have been identified in molluskan shells. Their diversity is the consequence of various molecular processes, including domain shuffling and gene duplication. However, the evolutionary origin of most SMPs remains unclear. In this study, we investigated the evolutionary process of SMPs containing EGF-like and zona pellucida (ZP) domains. Two types of these proteins (EGF-like protein (EGFL) and EGF-like and ZP domains containing protein (EGFZP)) were found in the pearl oyster, Pinctada fucata. In contrast, only EGFZP was identified in the gastropods. Phylogenetic analysis and genomic arrangement studies showed that EGFL and EGFZP formed a clade in bivalves, and their encoding genes were localized in tandem repeats on the same scaffold. In P. fucata, EGFL genes were expressed in the outer part of the mantle epithelial cells, which is related to calcitic shell formation. However, in both P. fucata and the limpet Nipponacmea fuscoviridis, EGFZP genes were expressed in the inner part of the mantle epithelial cells, which is related to aragonitic shell formation. Furthermore, our analysis showed that in P. fucata, the ZP domain interacts with eight SMPs that have various functions in nacreous shell mineralization. The data suggest that the ZP domain can interact with other SMPs, and EGFL evolution in pteriomorph bivalves represents an example of neo-functionalization that involves the acquisition of a novel protein through gene duplication.
Affiliation(s)
- Keisuke Shimizu
- Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo, Tokyo, 113-8657, Japan
| | - Takeshi Takeuchi
- Marine Genomics Unit, Okinawa Institute of Science and Technology Graduate University, Onna, Okinawa, Japan
| | - Lumi Negishi
- Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo, Tokyo, 113-8657, Japan
| | - Hitoshi Kurumizaka
- Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo, Tokyo, 113-8657, Japan
| | - Isao Kuriyama
- Mie Prefecture Fisheries Research Institute, 3564-3 Hamajima, Hamajima-cho, Shima-city, Mie 517-0404, Japan
| | - Kazuyoshi Endo
- Department of Earth and Planetary Science, The University of Tokyo, 7-3-1 Hongo, Tokyo 113-0033, Japan
| | - Michio Suzuki
- Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo, Tokyo, 113-8657, Japan
| |
|
28
|
Chamanrokh P, Colwell RR, Huq A. Loop-Mediated Isothermal Amplification (LAMP) Assay for Rapid Detection of viable but non-culturable Vibrio cholerae O1. Can J Microbiol 2021; 68:103-110. [PMID: 34793252 DOI: 10.1139/cjm-2021-0142] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022]
Abstract
Vibrio cholerae, an important waterborne pathogen, is a rod-shaped bacterium that naturally exists in aquatic environments. When conditions are unfavorable for growth, the bacterium can undergo morphological and physiological changes to assume a coccoid morphology. This stage in its life cycle is referred to as viable but non-culturable (VBNC) since VBNC cells do not grow on conventional bacteriological culture media. The current study compared polymerase chain reaction (PCR) and loop-mediated isothermal amplification (LAMP) to detect and identify VBNC V. cholerae. Because it is difficult to detect and identify VBNC V. cholerae, the results of the current study are useful in showing LAMP to be more sensitive and rapid than PCR in detecting and identifying non-culturable, coccoid forms of V. cholerae. Furthermore, the LAMP method is effective in detecting and identifying very low numbers of coccoid VBNC V. cholerae in environmental water samples, with the added benefit of being inexpensive to perform.
Affiliation(s)
- Parastoo Chamanrokh
- University of Maryland College Park, Maryland Pathogen Research Institute, College Park, Maryland, United States
| | - Rita R Colwell
- University of Maryland at College Park, 1068, Maryland Pathogen Research Institute, College Park, Maryland, United States
- University of Maryland at College Park, 1068, Maryland Institute of Applied Environmental Health, College Park, Maryland, United States
- University of Maryland at College Park, 1068, CBCB. UMIACS, College Park, Maryland, United States
- Johns Hopkins University Bloomberg School of Public Health, 25802, Baltimore, Maryland, United States
| | - Anwar Huq
- University of Maryland at College Park, 1068, Maryland Pathogen Research Institute, College Park, Maryland, United States
| |
|
29
|
Harrison PM. fLPS 2.0: rapid annotation of compositionally-biased regions in biological sequences. PeerJ 2021; 9:e12363. [PMID: 34760378 PMCID: PMC8557692 DOI: 10.7717/peerj.12363] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/23/2021] [Accepted: 09/30/2021] [Indexed: 12/12/2022] Open
Abstract
Compositionally-biased (CB) regions in biological sequences are enriched for a subset of sequence residue types. These can be shorter regions with a concentrated bias (i.e., those termed ‘low-complexity’), or longer regions that have a compositional skew. These regions comprise a prominent class of the uncharacterized ‘dark matter’ of the protein universe. Here, I report the latest version of the fLPS package for the annotation of CB regions, which includes added consideration of DNA sequences, to label the eight possible biased regions of DNA. In this version, the user is now able to restrict analysis to a specified subset of residue types, and also to filter for previously annotated domains to enable detection of discontinuous CB regions. A ‘thorough’ option has been added which enables the labelling of subtler biases, typically made from a skew for several residue types. In the output, protein CB regions are now labelled with bias classes reflecting the physico-chemical character of the biasing residues. The fLPS 2.0 package is available from: https://github.com/pmharrison/flps2 or in a Supplemental File of this paper.
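To make the notion of a compositionally-biased region concrete, the following sketch flags fixed-size windows in which one residue type is over-represented relative to a background frequency, using a binomial tail probability. It is a simplified illustration and not the fLPS algorithm or its command-line interface; the function names, the fixed window size, the background frequency, and the threshold are all assumptions made for this example.

```python
from math import comb
from typing import List, Tuple


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def biased_windows(
    seq: str,
    residue: str,
    background: float,
    window: int = 20,
    threshold: float = 1e-6,
) -> List[Tuple[int, int, int, float]]:
    """Return (start, end, count, p_value) for windows enriched in `residue`.

    `background` is the assumed frequency of `residue` in unbiased sequence;
    a window is reported when the chance of seeing at least `count`
    occurrences under that background is at or below `threshold`.
    """
    hits = []
    for start in range(0, len(seq) - window + 1):
        count = seq[start:start + window].count(residue)
        p_value = binomial_tail(count, window, background)
        if p_value <= threshold:
            hits.append((start, start + window, count, p_value))
    return hits


if __name__ == "__main__":
    # Toy protein with a 20-residue glutamine run (hypothetical sequence).
    toy = "MKT" + "Q" * 20 + "LVAG"
    for start, end, count, p_value in biased_windows(toy, "Q", 0.05, window=10, threshold=1e-8):
        print(start, end, count, f"{p_value:.2e}")
```

Overlapping hits from a scan like this would normally be merged into a single region, and a real annotator considers several residue types and window lengths at once; the sketch leaves both steps out to stay short.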
Affiliation(s)
- Paul M Harrison
- Department of Biology, McGill University, Montreal, QC, Canada
| |
|
30
|
Fesenko I, Shabalina SA, Mamaeva A, Knyazev A, Glushkevich A, Lyapina I, Ziganshin R, Kovalchuk S, Kharlampieva D, Lazarev V, Taliansky M, Koonin EV. A vast pool of lineage-specific microproteins encoded by long non-coding RNAs in plants. Nucleic Acids Res 2021; 49:10328-10346. [PMID: 34570232 DOI: 10.1093/nar/gkab816] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 06/03/2021] [Revised: 08/17/2021] [Accepted: 09/17/2021] [Indexed: 12/17/2022] Open
Abstract
Pervasive transcription of eukaryotic genomes results in expression of long non-coding RNAs (lncRNAs), most of which are poorly conserved in evolution and appear to be non-functional. However, some lncRNAs have been shown to perform specific functions, in particular, transcription regulation. Thousands of small open reading frames (smORFs, <100 codons) located on lncRNAs might be translated into peptides or microproteins. We report a comprehensive analysis of the conservation and evolutionary trajectories of lncRNAs-smORFs from the moss Physcomitrium patens across transcriptomes of 479 plant species. Although thousands of smORFs are subject to substantial purifying selection, the majority of the smORFs appear to be evolutionarily young and could represent a major pool for functional innovation. Using nanopore RNA sequencing, we show that, on average, the transcriptional level of conserved smORFs is higher than that of non-conserved smORFs. Proteomic analysis confirmed translation of 82 novel species-specific smORFs. Numerous conserved smORFs containing low complexity regions (LCRs) or transmembrane domains were identified, and the biological functions of a selected LCR-smORF were demonstrated experimentally. Thus, microproteins encoded by smORFs are a major, functionally diverse component of the plant proteome.
Affiliation(s)
- Igor Fesenko
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russian Federation
| | - Svetlana A Shabalina
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Anna Mamaeva
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russian Federation
| | - Andrey Knyazev
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russian Federation
| | - Anna Glushkevich
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russian Federation
| | - Irina Lyapina
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russian Federation
| | - Rustam Ziganshin
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russian Federation
| | - Sergey Kovalchuk
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russian Federation
| | - Daria Kharlampieva
- Department of Cell Biology, Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, Moscow 119435, Russian Federation
| | - Vassili Lazarev
- Department of Cell Biology, Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, Moscow 119435, Russian Federation.,Moscow Institute of Physics and Technology (National Research University), Dolgoprudny, Moscow region, 141701, Russian Federation
| | - Michael Taliansky
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russian Federation.,The James Hutton Institute, Invergowrie, Dundee DD2 5DA, UK
| | - Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| |
|
31
|
Cappannini A, Forcelloni S, Giansanti A. Evolutionary pressures and codon bias in low complexity regions of plasmodia. Genetica 2021; 149:217-237. [PMID: 34254217 DOI: 10.1007/s10709-021-00126-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2020] [Accepted: 06/30/2021] [Indexed: 11/25/2022]
Abstract
The biological meaning of low complexity regions in the proteins of Plasmodium species is a topic of discussion in evolutionary biology. There is a debate between selectionists and neutralists, who respectively do or do not attribute to low-complexity regions an effect on the fitness of these parasites. In this work, we comparatively study 22 Plasmodium species to understand whether their low complexity regions undergo a neutral or, rather, a selective and species-dependent evolution. The focus is on the connection between the codon repertoire of the genetic coding sequences and the occurrence of low complexity regions in the corresponding proteins. The first part of the work concerns the correlation between the length of plasmodial proteins and their propensity to embed low complexity regions. Relative synonymous codon usage, entropy, and other indicators reveal that the incidence of low complexity regions and their codon bias are species-specific and subject to selective evolutionary pressure. We also observed that protein length, a relaxed selective pressure, and a broad repertoire of codons in proteins are strongly correlated with the occurrence of low complexity regions. Overall, it seems plausible that the codon bias of low-complexity regions contributes to functional innovation and codon bias enhancement of proteins on which Plasmodium species rest as successful evolutionary parasites.
Affiliation(s)
- Andrea Cappannini
- Department of Physics, Sapienza, University of Rome, P.le A. Moro 5, 00185, Roma, Italy.
| | - Sergio Forcelloni
- Max Planck Institute of Biochemistry, 82152, Martinsried, Germany.,Department of Chemistry, Technical University of Munich, 85748, Garching, Germany
| | - Andrea Giansanti
- Department of Physics, Sapienza, University of Rome, P.le A. Moro 5, 00185, Roma, Italy.,Istituto Nazionale di Fisica Nucleare, INFN, Roma1 section. 00185, Roma, Italy
| |
|
32
|
Survey of Drought-Associated TaWRKY2-D1 Gene Diversity in Bread Wheat and Wheat Relatives. Mol Biotechnol 2021; 63:953-962. [PMID: 34131856 DOI: 10.1007/s12033-021-00350-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2021] [Accepted: 06/03/2021] [Indexed: 10/21/2022]
Abstract
Recent advances in plant genomics revealed numerous factors related to drought tolerance, including a family of WRKY transcription factors. The aim of this study was to evaluate polymorphism of the TaWRKY2-D1 gene across a range of bread wheat cultivars, interspecific hybrids, and wild wheat relatives within the Triticum genus as a potential molecular target for marker-assisted selection. The initial sequencing of the TaWRKY2-D1 gene in six Ukrainian commercial cultivars detected some sequence variations along the ~1.8 kb gene promoter and the following coding region, which is composed of four exons and three introns. Based on the sequence information obtained, five sets of primers covering different gene regions were designed to annotate the TaWRKY2-D1 genetic diversity in 202 wheat cultivars, including 77 accessions from the CIMMYT collection, 72 commercial varieties cultivated in Ukraine, and 53 hybrids and wild wheat species. The combination of developed DNA markers enabled effective and reproducible annotation of cultivar genetic diversity. The primer set targeting the introns adjacent to the gene's exon 3 turned out to be the most informative for screening heterogeneity of TaWRKY2-D1. The developed molecular markers represent an effective, informative means of selecting drought-tolerant germplasm donors to promote wheat breeding programs.
|
33
|
Coates HW, Capell-Hattam IM, Brown AJ. The mammalian cholesterol synthesis enzyme squalene monooxygenase is proteasomally truncated to a constitutively active form. J Biol Chem 2021; 296:100731. [PMID: 33933449 PMCID: PMC8166775 DOI: 10.1016/j.jbc.2021.100731] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 10/16/2020] [Revised: 04/24/2021] [Accepted: 04/28/2021] [Indexed: 02/06/2023]
Abstract
Squalene monooxygenase (SM, also known as squalene epoxidase) is a rate-limiting enzyme of cholesterol synthesis that converts squalene to monooxidosqualene and is oncogenic in numerous cancer types. SM is subject to feedback regulation via cholesterol-induced proteasomal degradation, which depends on its lipid-sensing N-terminal regulatory domain. We previously identified an endogenous truncated form of SM with a similar abundance to full-length SM, but whether this truncated form is functional or subject to the same regulatory mechanisms as full-length SM is not known. Here, we show that truncated SM differs from full-length SM in two major ways: it is cholesterol resistant and adopts a peripheral rather than integral association with the endoplasmic reticulum membrane. However, truncated SM retains full SM activity and is therefore constitutively active. Truncation of SM occurs during its endoplasmic reticulum–associated degradation and requires the proteasome, which partially degrades the SM N-terminus and disrupts cholesterol-sensing elements within the regulatory domain. Furthermore, truncation relies on a ubiquitin signal that is distinct from that required for cholesterol-induced degradation. Using mutagenesis, we demonstrate that partial proteasomal degradation of SM depends on both an intrinsically disordered region near the truncation site and the stability of the adjacent catalytic domain, which escapes degradation. These findings uncover an additional layer of complexity in the post-translational regulation of cholesterol synthesis and establish SM as the first eukaryotic enzyme found to undergo proteasomal truncation.
Affiliation(s)
- Hudson W Coates
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, NSW, Australia
| | | | - Andrew J Brown
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, NSW, Australia.
| |
|
34
|
Tayeb-Fligelman E, Cheng X, Tai C, Bowler JT, Griner S, Sawaya MR, Seidler PM, Jiang YX, Lu J, Rosenberg GM, Salwinski L, Abskharon R, Zee CT, Hou K, Li Y, Boyer DR, Murray KA, Falcon G, Anderson DH, Cascio D, Saelices L, Damoiseaux R, Guo F, Eisenberg DS. Inhibition of amyloid formation of the Nucleoprotein of SARS-CoV-2. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2021:2021.03.05.434000. [PMID: 33688654 PMCID: PMC7941625 DOI: 10.1101/2021.03.05.434000] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 12/13/2022]
Abstract
The SARS-CoV-2 Nucleoprotein (NCAP) functions in RNA packaging during viral replication and assembly. Computational analysis of its amino acid sequence reveals a central low-complexity domain (LCD) having sequence features akin to LCDs in other proteins known to function in liquid-liquid phase separation. Here we show that in the presence of viral RNA, NCAP, and also its LCD segment alone, form amyloid-like fibrils when undergoing liquid-liquid phase separation. Within the LCD we identified three 6-residue segments that drive amyloid fibril formation. We determined atomic structures for fibrils formed by each of the three identified segments. These structures informed our design of peptide inhibitors of NCAP fibril formation and liquid-liquid phase separation, suggesting a therapeutic route for Covid-19. One-sentence summary: Atomic structures of amyloid-driving peptide segments from SARS-CoV-2 Nucleoprotein inform the development of Covid-19 therapeutics.
|
35
|
Lallemand T, Leduc M, Landès C, Rizzon C, Lerat E. An Overview of Duplicated Gene Detection Methods: Why the Duplication Mechanism Has to Be Accounted for in Their Choice. Genes (Basel) 2020; 11:E1046. [PMID: 32899740 PMCID: PMC7565063 DOI: 10.3390/genes11091046] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Received: 07/30/2020] [Revised: 09/01/2020] [Accepted: 09/02/2020] [Indexed: 12/11/2022]
Abstract
Gene duplication is an important evolutionary mechanism that provides new genetic material and thus opportunities for an organism to acquire new gene functions, with major implications such as speciation events. Various processes are known to allow a gene to be duplicated, and different models explain how duplicated genes can be maintained in genomes. Because of their particular importance, identifying duplicated genes is essential when studying genome evolution, but it can still be a challenge because of the various fates duplicated genes can encounter. In this review, we first describe the evolutionary processes that allow the formation of duplicated genes and then describe the various bioinformatic approaches that can be used to identify them in genome sequences. Indeed, these bioinformatic approaches differ according to the underlying duplication mechanism. Hence, understanding the specificity of the duplicated genes of interest is a great asset for tool selection and should be taken into account when exploring a biological question.
Affiliation(s)
- Tanguy Lallemand
- IRHS, Agrocampus-Ouest, INRAE, Université d’Angers, SFR 4207 QuaSaV, 49071 Beaucouzé, France; (T.L.); (M.L.); (C.L.)
| | - Martin Leduc
- IRHS, Agrocampus-Ouest, INRAE, Université d’Angers, SFR 4207 QuaSaV, 49071 Beaucouzé, France; (T.L.); (M.L.); (C.L.)
| | - Claudine Landès
- IRHS, Agrocampus-Ouest, INRAE, Université d’Angers, SFR 4207 QuaSaV, 49071 Beaucouzé, France; (T.L.); (M.L.); (C.L.)
| | - Carène Rizzon
- Laboratoire de Mathématiques et Modélisation d’Evry (LaMME), Université d’Evry Val d’Essonne, Université Paris-Saclay, UMR CNRS 8071, ENSIIE, USC INRAE, 23 bvd de France, CEDEX, 91037 Evry Paris, France;
| | - Emmanuelle Lerat
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR 5558, F-69622 Villeurbanne, France
| |
|
36
|
Maurer-Stroh S, Krutz NL, Kern PS, Gunalan V, Nguyen MN, Limviphuvadh V, Eisenhaber F, Gerberick GF. AllerCatPro-prediction of protein allergenicity potential from the protein sequence. Bioinformatics 2020; 35:3020-3027. [PMID: 30657872 PMCID: PMC6736023 DOI: 10.1093/bioinformatics/btz029] [Citation(s) in RCA: 98] [Impact Index Per Article: 19.6] [Received: 08/30/2018] [Revised: 12/18/2018] [Accepted: 01/14/2019] [Indexed: 12/22/2022]
Abstract
Motivation: Due to the risk of inducing an immediate Type I (IgE-mediated) allergic response, proteins intended for use in consumer products must be investigated for their allergenic potential before introduction into the marketplace. The FAO/WHO guidelines for computational assessment of allergenic potential of proteins based on short peptide hits and linear sequence window identity thresholds misclassify many proteins as allergens. Results: We developed AllerCatPro, which predicts the allergenic potential of proteins based on similarity of their 3D protein structure as well as their amino acid sequence compared with a data set of known protein allergens comprising 4180 unique allergenic protein sequences derived from the union of the major databases Food Allergy Research and Resource Program, Comprehensive Protein Allergen Resource, WHO/International Union of Immunological Societies, UniProtKB and Allergome. We extended the hexamer hit rule by removing peptides with high probability of random occurrence measured by sequence entropy as well as requiring 3 or more hexamer hits consistent with natural linear epitope patterns in known allergens. This is complemented with a Gluten-like repeat pattern detection. We also switched from a linear sequence window similarity to a B-cell epitope-like 3D surface similarity window, which became possible through extensive 3D structure modeling covering the majority (74%) of allergens. In case no structure similarity is found, the decision workflow reverts to the old linear sequence window rule. The overall accuracy of AllerCatPro is 84% compared with other current methods, which range from 51 to 73%. Both the FAO/WHO rules and AllerCatPro achieve highest sensitivity but AllerCatPro provides a 37-fold increase in specificity. Availability and implementation: https://allercatpro.bii.a-star.edu.sg/ Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sebastian Maurer-Stroh
- Biomolecular Function Discovery Division, Bioinformatics Institute, Agency for Science, Technology and Research, Singapore.,Department of Biological Sciences, National University of Singapore, Singapore
| | - Nora L Krutz
- The Procter & Gamble Services Company, Strombeek-Bever, Belgium
| | - Petra S Kern
- The Procter & Gamble Services Company, Strombeek-Bever, Belgium
| | - Vithiagaran Gunalan
- Biomolecular Function Discovery Division, Bioinformatics Institute, Agency for Science, Technology and Research, Singapore
| | - Minh N Nguyen
- Biomolecular Function Discovery Division, Bioinformatics Institute, Agency for Science, Technology and Research, Singapore
| | - Vachiranee Limviphuvadh
- Biomolecular Function Discovery Division, Bioinformatics Institute, Agency for Science, Technology and Research, Singapore
| | - Frank Eisenhaber
- Biomolecular Function Discovery Division, Bioinformatics Institute, Agency for Science, Technology and Research, Singapore.,Department of Biological Sciences, National University of Singapore, Singapore
| | | |
|
37
|
Lau Y, Oamen HP, Caudron F. Protein Phase Separation during Stress Adaptation and Cellular Memory. Cells 2020; 9:cells9051302. [PMID: 32456195 PMCID: PMC7291175 DOI: 10.3390/cells9051302] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 04/21/2020] [Revised: 05/14/2020] [Accepted: 05/21/2020] [Indexed: 12/13/2022]
Abstract
Cells need to organise and regulate their biochemical processes both in space and time in order to adapt to their surrounding environment. Spatial organisation of cellular components is facilitated by a complex network of membrane-bound organelles. Both the membrane composition and the intra-organellar content of these organelles can be specifically and temporally controlled by imposing gates, much like bouncers controlling entry into night-clubs. In addition, a new level of compartmentalisation has recently emerged as a fundamental principle of cellular organisation: the formation of membrane-less organelles. Many of these structures are dynamic, rapidly condensing or dissolving, and are therefore ideally suited to be involved in emergency cellular adaptation to stresses. Remarkably, the same proteins also have the propensity to adopt self-perpetuating assemblies whose properties fit the needs of encoding cellular memory. Here, we review some of the principles of phase separation and the function of membrane-less organelles, focusing particularly on their roles during stress response and cellular memory.
|
38
|
Cloning, expression and purification of the low-complexity region of RanBP9 protein. Protein Expr Purif 2020; 172:105630. [PMID: 32217127 DOI: 10.1016/j.pep.2020.105630] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/10/2020] [Revised: 03/19/2020] [Accepted: 03/19/2020] [Indexed: 11/22/2022]
Abstract
Recombinant expression and purification of proteins is key for biochemical and biophysical investigations. Although this has become a routine and standard procedure for many proteins, intrinsically disordered ones and those with low complexity sequences pose difficulties. Proteins containing low complexity regions (LCRs) are increasingly becoming significant for their roles in both normal and pathological processes. Here, we report cloning, expression and purification of N-terminal LCR of RanBP9 protein (Nt-RanBP9). RanBP9 is a scaffolding protein present in both cytoplasm and nucleus that is implicated in many cellular processes. Nt-RanBP9 is a poorly understood region of the protein perhaps due to difficulties posed by the LCR. Indeed, conventional methods presented difficulties in Nt-RanBP9 cloning due to its high GC content resulting in insignificant protein expression. These led us to use a different approach of cloning by expressing the protein as a fusion construct containing mCherry or mEGFP using in vivo DNA recombination methods. Our results indicate that expression of mEGFP-tagged Nt-RanBP9 followed by thrombin cleavage of the tag was the most effective method to obtain the protein with >90% purity and good yields. We report and discuss the challenges in obtaining the N-terminal region of RanBP9, a protein with functional implications in multiple biological processes and neurodegenerative diseases.
|
39
|
Atypical structural tendencies among low-complexity domains in the Protein Data Bank proteome. PLoS Comput Biol 2020; 16:e1007487. [PMID: 31986130 PMCID: PMC7004392 DOI: 10.1371/journal.pcbi.1007487] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 10/10/2019] [Revised: 02/06/2020] [Accepted: 12/23/2019] [Indexed: 11/29/2022]
Abstract
A variety of studies have suggested that low-complexity domains (LCDs) tend to be intrinsically disordered and are relatively rare within structured proteins in the Protein Data Bank (PDB). Although LCDs are often treated as a single class, we previously found that LCDs enriched in different amino acids can exhibit substantial differences in protein metabolism and function. Therefore, we wondered whether the structural conformations of LCDs are likewise dependent on which specific amino acids are enriched within each LCD. Here, we directly examined relationships between enrichment of individual amino acids and secondary structure tendencies across the entire PDB proteome. Secondary structure tendencies varied as a function of the identity of the amino acid enriched and its degree of enrichment. Furthermore, divergence in secondary structure profiles often occurred for LCDs enriched in physicochemically similar amino acids (e.g. valine vs. leucine), indicating that LCDs composed of related amino acids can have distinct secondary structure tendencies. Comparison of LCD secondary structure tendencies with numerous pre-existing secondary structure propensity scales resulted in relatively poor correlations for certain types of LCDs, indicating that these scales may not capture secondary structure tendencies as sequence complexity decreases. Collectively, these observations provide a highly resolved view of structural tendencies among LCDs parsed by the nature and magnitude of single amino acid enrichment.
The structures that proteins adopt are directly related to their amino acid sequences. Low-complexity domains (LCDs) in protein sequences are unusual regions made up of only a few different types of amino acids. Although this is the key feature that classifies sequences as LCDs, the physical properties of LCDs will differ based on the types of amino acids that are found in each domain. For example, the sequences “AAAAAAAAAA”, “EEEEEEEEEE”, and “EEKRKEEEKE” will have very different properties, even though they would all be classified as LCDs by traditional methods. In a previous study, we developed a new method to further divide LCDs into categories that more closely reflect the differences in their physical properties. In this study, we apply that approach to examine the structures of LCDs when sorted into different categories based on their amino acids. This allowed us to define relationships between the types of amino acids in the LCDs and their corresponding structures. Since protein structure is closely related to protein function, this has important implications for understanding the basic functions and properties of LCDs in a variety of proteins.
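The point about the three example sequences can be illustrated with a minimal sketch. It assumes that "traditional methods" can be approximated by a Shannon entropy score over residue composition, and that a category can be approximated by the most frequent residue and its fraction. This is not the authors' method; the function names `shannon_entropy` and `dominant_residue` are hypothetical and introduced only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Shannon entropy (bits per residue) of the residue composition of seq."""
    n = len(seq)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(seq).values())


def dominant_residue(seq: str) -> tuple[str, float]:
    """Most frequent residue in seq and the fraction of positions it occupies."""
    residue, count = Counter(seq).most_common(1)[0]
    return residue, count / len(seq)


if __name__ == "__main__":
    # The three example sequences quoted in the abstract.
    for seq in ("AAAAAAAAAA", "EEEEEEEEEE", "EEKRKEEEKE"):
        residue, fraction = dominant_residue(seq)
        print(f"{seq}  entropy={shannon_entropy(seq):.2f} bits  "
              f"dominant={residue} ({fraction:.0%})")
```

An entropy score alone cannot tell the alanine homopolymer from the glutamate homopolymer, since both score zero, whereas the dominant residue separates them. That is the kind of distinction the abstract argues a composition-aware categorisation should capture.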
|
40
|
Urbanek A, Elena-Real CA, Popovic M, Morató A, Fournet A, Allemand F, Delbecq S, Sibille N, Bernadó P. Site-Specific Isotopic Labeling (SSIL): Access to High-Resolution Structural and Dynamic Information in Low-Complexity Proteins. Chembiochem 2019; 21:769-775. [PMID: 31697025 DOI: 10.1002/cbic.201900583] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 09/18/2019] [Revised: 11/05/2019] [Indexed: 12/17/2022]
Abstract
Remarkable technical progress in the area of structural biology has paved the way to study previously inaccessible targets. For example, large protein complexes can now be easily investigated by cryo-electron microscopy, and modern high-field NMR magnets have challenged the limits of high-resolution characterization of proteins in solution. However, the structural and dynamic characteristics of certain proteins with important functions still cannot be probed by conventional methods. These proteins in question contain low-complexity regions (LCRs), compositionally biased sequences where only a limited number of amino acids is repeated multiple times, which hamper their characterization. This Concept article describes a site-specific isotopic labeling (SSIL) strategy, which combines nonsense suppression and cell-free protein synthesis to overcome these limitations. An overview on how poly-glutamine tracts were made amenable to high-resolution structural studies is used to illustrate the usefulness of SSIL. Furthermore, we discuss the potential of this methodology to give further insights into the roles of LCRs in human pathologies and liquid-liquid phase separation, as well as the challenges that must be addressed in the future for the popularization of SSIL.
Affiliation(s)
- Annika Urbanek
- Centre de Biochimie Structurale (CBS), INSERM, CNRS, Université de Montpellier, 29, rue de Navacelles, 34090, Montpellier, France
| | - Carlos A Elena-Real
- Centre de Biochimie Structurale (CBS), INSERM, CNRS, Université de Montpellier, 29, rue de Navacelles, 34090, Montpellier, France
| | - Matija Popovic
- Centre de Biochimie Structurale (CBS), INSERM, CNRS, Université de Montpellier, 29, rue de Navacelles, 34090, Montpellier, France
| | - Anna Morató
- Centre de Biochimie Structurale (CBS), INSERM, CNRS, Université de Montpellier, 29, rue de Navacelles, 34090, Montpellier, France
| | - Aurélie Fournet
- Centre de Biochimie Structurale (CBS), INSERM, CNRS, Université de Montpellier, 29, rue de Navacelles, 34090, Montpellier, France
| | - Frédéric Allemand
- Centre de Biochimie Structurale (CBS), INSERM, CNRS, Université de Montpellier, 29, rue de Navacelles, 34090, Montpellier, France
| | - Stephane Delbecq
- Laboratoire de Biologie Cellulaire et Moléculaire, (LBCM-EA4558 Vaccination Antiparasitaire), UFR Pharmacie, Université de Montpellier, 15, Av. Charles Flahault, BP 14491, 34000, Montpellier, France
| | - Nathalie Sibille
- Centre de Biochimie Structurale (CBS), INSERM, CNRS, Université de Montpellier, 29, rue de Navacelles, 34090, Montpellier, France
| | - Pau Bernadó
- Centre de Biochimie Structurale (CBS), INSERM, CNRS, Université de Montpellier, 29, rue de Navacelles, 34090, Montpellier, France
| |
|
41
|
Ntountoumi C, Vlastaridis P, Mossialos D, Stathopoulos C, Iliopoulos I, Promponas V, Oliver SG, Amoutzias GD. Low complexity regions in the proteins of prokaryotes perform important functional roles and are highly conserved. Nucleic Acids Res 2019; 47:9998-10009. [PMID: 31504783 PMCID: PMC6821194 DOI: 10.1093/nar/gkz730] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 05/15/2019] [Revised: 07/16/2019] [Accepted: 08/15/2019] [Indexed: 01/27/2023]
Abstract
We provide the first high-throughput analysis of the properties and functional role of Low Complexity Regions (LCRs) in more than 1500 prokaryotic and phage proteomes. We observe that, contrary to a widespread belief based on older and sparse data, LCRs actually have a significant, persistent and highly conserved presence and role in many and diverse prokaryotes. Their specific amino acid content is linked to proteins with certain molecular functions, such as the binding of RNA, DNA, metal-ions and polysaccharides. In addition, LCRs have been repeatedly identified in very ancient, and usually highly expressed, proteins of the translation machinery. Finally, based on the amino acid content enriched in certain categories, we have developed a neural network web server to identify LCRs and accurately predict whether they can bind nucleic acids or metal-ions, or are involved in chaperone functions. An evaluation of the tool showed that it is highly accurate for eukaryotic proteins as well.
Affiliation(s)
- Chrysa Ntountoumi
- Bioinformatics Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, 41500, Greece
| | - Panayotis Vlastaridis
- Bioinformatics Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, 41500, Greece
| | - Dimitris Mossialos
- Microbial Biotechnology-Molecular Bacteriology-Virology Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, 41500, Greece
| | | | | | - Vasilios Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, New Campus, University of Cyprus, PO Box 20537, CY-1678 Nicosia, Cyprus
| | - Stephen G Oliver
- Cambridge Systems Biology Centre & Department of Biochemistry, University of Cambridge, CB2 1GA, UK
| | - Grigoris D Amoutzias
- Bioinformatics Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, 41500, Greece
| |
|
43
|
Wood DE, Lu J, Langmead B. Improved metagenomic analysis with Kraken 2. Genome Biol 2019; 20:257. [PMID: 31779668 PMCID: PMC6883579 DOI: 10.1186/s13059-019-1891-0] [Citation(s) in RCA: 3179] [Impact Index Per Article: 529.8] [Received: 09/20/2019] [Accepted: 11/18/2019] [Indexed: 02/06/2023]
Abstract
Although Kraken’s k-mer-based approach provides a fast taxonomic classification of metagenomic sequence data, its large memory requirements can be limiting for some applications. Kraken 2 improves upon Kraken 1 by reducing memory usage by 85%, allowing greater amounts of reference genomic data to be used, while maintaining high accuracy and increasing speed fivefold. Kraken 2 also introduces a translated search mode, providing increased sensitivity in viral metagenomics analysis.
Affiliation(s)
- Derrick E Wood
- Department of Computer Science, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
| | - Jennifer Lu
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Ben Langmead
- Department of Computer Science, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
| |
|
44
|
Tomii K, Santos HJ, Nozaki T. Genome-Wide Analysis of Known and Potential Tetraspanins in Entamoeba histolytica. Genes (Basel) 2019; 10:genes10110885. [PMID: 31684194 PMCID: PMC6895871 DOI: 10.3390/genes10110885] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 08/29/2019] [Revised: 10/25/2019] [Accepted: 10/31/2019] [Indexed: 12/12/2022] Open
Abstract
Tetraspanins are membrane proteins involved in intra- and/or intercellular signaling, and membrane protein complex formation. In some organisms, their role is associated with virulence and pathogenesis. Here, we investigate known and potential tetraspanins in the human intestinal protozoan parasite Entamoeba histolytica. We conducted sequence similarity searches against the proteome data of E. histolytica and newly identified nine uncharacterized proteins as potential tetraspanins in E. histolytica. We found three subgroups within known and potential tetraspanins, as well as subgroup-associated features in both their amino acid and nucleotide sequences. We also examined the subcellular localization of a few representative tetraspanins that might be potentially related to pathogenicity. The results in this study could be useful resources for further understanding and downstream analyses of tetraspanins in Entamoeba.
Affiliation(s)
- Kentaro Tomii
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan.
| | - Herbert J Santos
- Department of Biomedical Chemistry, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan.
| | - Tomoyoshi Nozaki
- Department of Biomedical Chemistry, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan.
| |
|
45
|
miRWoods: Enhanced precursor detection and stacked random forests for the sensitive detection of microRNAs. PLoS Comput Biol 2019; 15:e1007309. [PMID: 31596843 PMCID: PMC6785219 DOI: 10.1371/journal.pcbi.1007309] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/31/2019] [Accepted: 08/05/2019] [Indexed: 12/29/2022] Open
Abstract
MicroRNAs are conserved, endogenous small RNAs with critical post-transcriptional regulatory functions throughout eukaryota, including prominent roles in development and disease. Despite much effort, microRNA annotations still contain errors and are incomplete due especially to challenges related to identifying valid miRs that have small numbers of reads, to properly locating hairpin precursors and to balancing precision and recall. Here, we present miRWoods, which solves these challenges using a duplex-focused precursor detection method and stacked random forests with specialized layers to detect mature and precursor microRNAs, and has been tuned to optimize the harmonic mean of precision and recall. We trained and tuned our discovery pipeline on data sets from the well-annotated human genome, and evaluated its performance on data from mouse. Compared to existing approaches, miRWoods better identifies precursor spans, and can balance sensitivity and specificity for an overall greater prediction accuracy, recalling an average of 10% more annotated microRNAs, and correctly predicts substantially more microRNAs with only one read. We apply this method to the under-annotated genomes of Felis catus (domestic cat) and Bos taurus (cow). We identified hundreds of novel microRNAs in small RNA sequencing data sets from muscle and skin from cat, from 10 tissues from cow and also from human and mouse cells. Our novel predictions include a microRNA in an intron of tyrosine kinase 2 (TYK2) that is present in both cat and cow, as well as a family of mirtrons with two instances in the human genome. Our predictions support a more expanded miR-2284 family in the bovine genome, a larger mir-548 family in the human genome, and a larger let-7 family in the feline genome. 
While the computational prediction of microRNA loci from high-throughput sequence data is well-studied, challenges persist in defining the minimum number of reads required for a locus to be evaluated, as well as in defining the precursor span. We present a new method, “miRWoods”, which has greater recall of known microRNAs, while also achieving as good or better overall performance. Our approach uses improved duplex-based methods of precursor detection and a pair of random forest layers that sensitively detect mature products and precursors. We trained our model on data from human, and confirmed that it can successfully be applied cross-species by evaluating predictions for the mouse genome. We then applied our approach to new sequencing data mapped to the under-annotated genomes of cow and cat. We were able to use miRWoods to improve annotations for cat and cow microRNAs, and found novel microRNAs in human and mouse, and identified errors in current annotations.
|
46
|
Bernhofer M, Goldberg T, Wolf S, Ahmed M, Zaugg J, Boden M, Rost B. NLSdb-major update for database of nuclear localization signals and nuclear export signals. Nucleic Acids Res 2018; 46:D503-D508. [PMID: 29106588 PMCID: PMC5753228 DOI: 10.1093/nar/gkx1021] [Citation(s) in RCA: 64] [Impact Index Per Article: 10.7] [Received: 09/15/2017] [Accepted: 10/18/2017] [Indexed: 11/13/2022] Open
Abstract
NLSdb is a database collecting nuclear export signals (NES) and nuclear localization signals (NLS) along with experimentally annotated nuclear and non-nuclear proteins. NES and NLS are short sequence motifs related to protein transport out of and into the nucleus. The updated NLSdb now contains 2253 NLS and introduces 398 NES. The potential sets of novel NES and NLS have been generated by a simple 'in silico mutagenesis' protocol. We started with motifs annotated by experiments. In step 1, we increased specificity such that no known non-nuclear protein matched the refined motif. In step 2, we increased the sensitivity trying to match several different families with a motif. We then iterated over steps 1 and 2. The final set of 2253 NLS motifs matched 35% of 8421 experimentally verified nuclear proteins (up from 21% for the previous version) and none of 18 278 non-nuclear proteins. We updated the web interface providing multiple options to search protein sequences for NES and NLS motifs, and to evaluate your own signal sequences. NLSdb can be accessed via Rostlab services at: https://rostlab.org/services/nlsdb/.
Affiliation(s)
- Michael Bernhofer
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748 Garching/Munich, Germany
| | - Tatyana Goldberg
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748 Garching/Munich, Germany
| | - Silvana Wolf
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748 Garching/Munich, Germany
| | - Mohamed Ahmed
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748 Garching/Munich, Germany
| | - Julian Zaugg
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane 4072, Australia
| | - Mikael Boden
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane 4072, Australia
| | - Burkhard Rost
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748 Garching/Munich, Germany
- Institute of Advanced Study (TUM-IAS), Lichtenbergstrasse 2a, 85748 Garching/Munich, Germany
- Institute for Food and Plant Sciences WZW-Weihenstephan, Alte Akademie 8, 85354 Freising, Germany
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
| |
|
47
|
Frottin F, Schueder F, Tiwary S, Gupta R, Körner R, Schlichthaerle T, Cox J, Jungmann R, Hartl FU, Hipp MS. The nucleolus functions as a phase-separated protein quality control compartment. Science 2019; 365:342-347. [PMID: 31296649 DOI: 10.1126/science.aaw9157] [Citation(s) in RCA: 331] [Impact Index Per Article: 55.2] [Received: 02/05/2019] [Revised: 05/23/2019] [Accepted: 06/27/2019] [Indexed: 12/24/2022]
Abstract
The nuclear proteome is rich in stress-sensitive proteins, which suggests that effective protein quality control mechanisms are in place to ensure conformational maintenance. We investigated the role of the nucleolus in this process. In mammalian tissue culture cells under stress conditions, misfolded proteins entered the granular component (GC) phase of the nucleolus. Transient associations with nucleolar proteins such as NPM1 conferred low mobility to misfolded proteins within the liquid-like GC phase, avoiding irreversible aggregation. Refolding and extraction of proteins from the nucleolus during recovery from stress was Hsp70-dependent. The capacity of the nucleolus to store misfolded proteins was limited, and prolonged stress led to a transition of the nucleolar matrix from liquid-like to solid, with loss of reversibility and dysfunction in quality control. Thus, we suggest that the nucleolus has chaperone-like properties and can promote nuclear protein maintenance under stress.
Affiliation(s)
- F Frottin
- Department of Cellular Biochemistry, Max Planck Institute of Biochemistry, D-82152 Martinsried, Germany
| | - F Schueder
- Research Group "Molecular Imaging and Bionanotechnology," Max Planck Institute of Biochemistry, D-82152 Martinsried, Germany
- Faculty of Physics and Center for Nanoscience, Ludwig Maximilian University, D-80539 Munich, Germany
| | - S Tiwary
- Research Group "Computational Systems Biochemistry," Max Planck Institute of Biochemistry, D-82152 Martinsried, Germany
| | - R Gupta
- Department of Cellular Biochemistry, Max Planck Institute of Biochemistry, D-82152 Martinsried, Germany
| | - R Körner
- Department of Cellular Biochemistry, Max Planck Institute of Biochemistry, D-82152 Martinsried, Germany
| | - T Schlichthaerle
- Research Group "Molecular Imaging and Bionanotechnology," Max Planck Institute of Biochemistry, D-82152 Martinsried, Germany
- Faculty of Physics and Center for Nanoscience, Ludwig Maximilian University, D-80539 Munich, Germany
| | - J Cox
- Research Group "Computational Systems Biochemistry," Max Planck Institute of Biochemistry, D-82152 Martinsried, Germany
| | - R Jungmann
- Research Group "Molecular Imaging and Bionanotechnology," Max Planck Institute of Biochemistry, D-82152 Martinsried, Germany
- Faculty of Physics and Center for Nanoscience, Ludwig Maximilian University, D-80539 Munich, Germany
| | - F U Hartl
- Department of Cellular Biochemistry, Max Planck Institute of Biochemistry, D-82152 Martinsried, Germany
- Munich Cluster for Systems Neurology (SyNergy), D-80336 Munich, Germany
| | - M S Hipp
- Department of Cellular Biochemistry, Max Planck Institute of Biochemistry, D-82152 Martinsried, Germany
- Munich Cluster for Systems Neurology (SyNergy), D-80336 Munich, Germany
| |
|
48
|
Entropy and Information within Intrinsically Disordered Protein Regions. ENTROPY 2019; 21:e21070662. [PMID: 33267376 PMCID: PMC7515160 DOI: 10.3390/e21070662] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 05/31/2019] [Revised: 06/27/2019] [Accepted: 07/01/2019] [Indexed: 02/06/2023]
Abstract
Bioinformatics and biophysical studies of intrinsically disordered proteins and regions (IDRs) note the high entropy at individual sequence positions and in conformations sampled in solution. This prevents application of the canonical sequence-structure-function paradigm to IDRs and motivates the development of new methods to extract information from IDR sequences. We argue that the information in IDR sequences cannot be fully revealed through positional conservation, which largely measures stable structural contacts and interaction motifs. Instead, considerations of evolutionary conservation of molecular features can reveal the full extent of information in IDRs. Experimental quantification of the large conformational entropy of IDRs is challenging but can be approximated through the extent of conformational sampling measured by a combination of NMR spectroscopy and lower-resolution structural biology techniques, which can be further interpreted with simulations. Conformational entropy and other biophysical features can be modulated by post-translational modifications that provide functional advantages to IDRs by tuning their energy landscapes and enabling a variety of functional interactions and modes of regulation. The diverse mosaic of functional states of IDRs and their conformational features within complexes demands novel metrics of information, which will reflect the complicated sequence-conformational ensemble-function relationship of IDRs.
|
49
|
Post-transcriptional regulatory patterns revealed by protein-RNA interactions. Sci Rep 2019; 9:4302. [PMID: 30867517 PMCID: PMC6416249 DOI: 10.1038/s41598-019-40939-2] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 11/11/2018] [Accepted: 02/26/2019] [Indexed: 02/07/2023] Open
Abstract
The coordination of the synthesis of functionally-related proteins can be achieved at the post-transcriptional level by the action of common regulatory molecules, such as RNA–binding proteins (RBPs). Despite advances in the genome-wide identification of RBPs and their binding transcripts, the protein–RNA interaction space is still largely unexplored, thus hindering a broader understanding of the extent of the post-transcriptional regulation of related coding RNAs. Here, we propose a computational approach that combines protein–mRNA interaction networks and statistical analyses to provide an inferred regulatory landscape for more than 800 human RBPs and identify the cellular processes that can be regulated at the post-transcriptional level. We show that 10% of the tested sets of functionally-related mRNAs can be post-transcriptionally regulated. Moreover, we propose a classification of (i) the RBPs and (ii) the functionally-related mRNAs, based on their distinct behaviors in the functional landscape, hinting towards mechanistic regulatory hypotheses. In addition, we demonstrate the usefulness of the inferred functional landscape to investigate the cellular role of both well-characterized and novel RBPs in the context of human diseases.
|
50
|
Shimizu K, Kimura K, Isowa Y, Oshima K, Ishikawa M, Kagi H, Kito K, Hattori M, Chiba S, Endo K. Insights into the Evolution of Shells and Love Darts of Land Snails Revealed from Their Matrix Proteins. Genome Biol Evol 2019; 11:380-397. [PMID: 30388206 PMCID: PMC6368272 DOI: 10.1093/gbe/evy242] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Accepted: 10/31/2018] [Indexed: 12/14/2022] Open
Abstract
Over the past decade, many skeletal matrix proteins that are possibly related to calcification have been reported in various calcifying animals. Molluscs are among the most diverse calcifying animals and some gastropods have adapted to terrestrial ecological niches. Although many shell matrix proteins (SMPs) have already been reported in molluscs, most reports have focused on marine molluscs, and the SMPs of terrestrial snails remain unclear. In addition, some terrestrial stylommatophoran snails have evolved an additional unique calcified character, called a "love dart," used for mating behavior. We identified 54 SMPs in the terrestrial snail Euhadra quaesita, and found that they contain specific domains that are widely conserved in molluscan SMPs. However, our results also suggest that some of them possibly have evolved independently by domain shuffling, domain recruitment, or gene co-option. We then identified four dart matrix proteins, and found that two of them are the same proteins as those identified as SMPs. Our results suggest that some dart matrix proteins possibly have evolved by independent gene co-option from SMPs during dart evolution events. These results provide a new perspective on the evolution of SMPs and "love darts" in land snails.
Affiliation(s)
- Keisuke Shimizu
- Department of Earth and Planetary Science, The University of Tokyo, Hongo, Japan
- College of Life and Environmental Sciences, University of Exeter, United Kingdom
| | - Kazuki Kimura
- Department of Environmental Life Sciences, Graduate School of Life Sciences, Tohoku University, Sendai, Miyagi, Japan
- Research Institute for Ulleungdo and Dokdo Islands, Kyungpook National University, Bukgu, Daegu, Korea
| | - Yukinobu Isowa
- Organization for the Strategic Coordination of Research and Intellectual Properties, Meiji University, Kawasaki, Kanagawa, Japan
| | - Kenshiro Oshima
- Center for Omics and Bioinformatics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan
| | - Makiko Ishikawa
- Department of Earth and Planetary Science, The University of Tokyo, Hongo, Japan
- Faculty of Animal Health Technology, Yamazaki University of Animal Health Technology, Hachioji, Tokyo, Japan
| | - Hiroyuki Kagi
- Geochemical Research Center, Graduate School of Science, The University of Tokyo, Hongo, Japan
| | - Keiji Kito
- Department of Life Sciences, School of Agriculture, Meiji University, Kawasaki, Kanagawa, Japan
| | - Masahira Hattori
- Center for Omics and Bioinformatics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan
- Cooperative Major of Advanced Health Science, Graduate School of Advanced Science and Engineering, Waseda University, Japan
| | - Satoshi Chiba
- Department of Environmental Life Sciences, Graduate School of Life Sciences, Tohoku University, Sendai, Miyagi, Japan
| | - Kazuyoshi Endo
- Department of Earth and Planetary Science, The University of Tokyo, Hongo, Japan
| |
|