1
Haft DH, Tolstoy I. Novel selenoprotein neighborhoods suggest specialized biochemical processes. mSystems 2025; 10:e0141724. [PMID: 40162776 PMCID: PMC12013261 DOI: 10.1128/msystems.01417-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2024] [Accepted: 02/27/2025] [Indexed: 04/02/2025] Open
Abstract
Prokaryotic genomes encode selenoproteins sparsely, roughly one protein per 5,000. Finding novel selenoprotein families can expose unknown biological processes that are enabled, or at least enhanced, by having a selenium atom replace a sulfur atom in some cysteine residues. Here, we report the discovery of 18 novel selenoprotein families or second selenocysteine sites in previously unrecognized extensions of protein translations. Most of these families had confounding factors, such as too small a family, too few selenoproteins in the family, selenocysteine (U) too close to one end, or a skew toward understudied or uncultured lineages, and consequently were missed previously. Discoveries were triggered by observations during the ongoing construction of protein family models for the National Center for Biotechnology Information's RefSeq and Prokaryotic Gene Annotation Pipeline or made by targeted searches for novel selenoproteins in the vicinity of known ones, rather than by any broadly applied genome mining method. Unrelated families TsoA, TsoB, TsoC, and TsoX are adjacent in tso (three selenoprotein operon) loci in the bacterial phylum Thermodesulfobacteriota. TrsS (third radical SAM selenoprotein) occurs strictly in the context of a molybdopterin-dependent aldehyde oxidoreductase. A short carboxy-terminal motif, U-X-X-stop (UXX-star), occurs in selenoproteins with various architectures, usually providing the second U in the protein. The multiple new selenocysteine insertion sites, selenoprotein families, and selenium-dependent operons we curated manually suggest that many more proteins and pathways remain to be discovered; once improved computational methods are applied comprehensively to the latest collections of microbial genomes and metagenomes, they may reveal surprising new biochemical processes.
IMPORTANCE Next-generation DNA sequencing and assembly of metagenome-assembled genomes (MAGs) for uncultured species of various microbiomes add a vast "dark matter" of hard-to-decipher protein sequences. Selenoproteins, optimized by natural selection to encode selenocysteine where cysteine might have been encoded much more easily, carry a strong clue to their function: some specialized aspect of binding or catalysis. Operons with multiple adjacent, but otherwise unrelated, selenoproteins should provide even more vivid information. In this study, efforts in protein family construction and curation, aimed at improving the PGAP genome annotation pipeline, generated multiple novel selenoprotein-containing genomic contexts that may lead to the future characterization of several systems of proteins. Past observations suggest roles in the metabolic handling of trace elements (mercury, tungsten, arsenic, etc.) or of organic compounds refractory to simpler enzymatic pathways. In addition, the work significantly expands the truth set of validated selenoproteins, which should aid future, more automated genome mining efforts.
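The U-X-X-stop (UXX-star) motif described in the abstract can be illustrated with a minimal sketch. It assumes protein sequences are given as one-letter strings in which `U` denotes selenocysteine and an optional trailing `*` marks the stop; the function names `has_uxx_star` and `count_sec` are hypothetical and are not taken from the paper or from any annotation pipeline.

```python
import re

# Assumption: 'U' is the one-letter code for selenocysteine, and the
# motif is U followed by exactly two residues and then the end of the protein.
UXX_STAR = re.compile(r"U[A-Z]{2}$")

def has_uxx_star(protein: str) -> bool:
    """Return True if selenocysteine is the third-from-last residue."""
    # Drop any trailing '*' stop markers before testing the C-terminus.
    return bool(UXX_STAR.search(protein.upper().rstrip("*")))

def count_sec(protein: str) -> int:
    """Count selenocysteine (U) residues in a protein sequence."""
    return protein.upper().count("U")
```

A sequence such as `MCUAAUGK*` (invented for illustration) would match the motif and contain two U residues, consistent with the abstract's remark that the motif usually provides the second U. A motif match alone says nothing about whether a translation is a genuine selenoprotein.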
Affiliation(s)
- Daniel H. Haft
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
2
Froschauer K, Svensson SL, Gelhausen R, Fiore E, Kible P, Klaude A, Kucklick M, Fuchs S, Eggenhofer F, Yang C, Falush D, Engelmann S, Backofen R, Sharma CM. Complementary Ribo-seq approaches map the translatome and provide a small protein census in the foodborne pathogen Campylobacter jejuni. Nat Commun 2025; 16:3078. [PMID: 40159498 PMCID: PMC11955535 DOI: 10.1038/s41467-025-58329-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Accepted: 03/18/2025] [Indexed: 04/02/2025] Open
Abstract
In contrast to transcriptome maps, bacterial small protein (≤50-100 aa) coding landscapes, including overlapping genes, are poorly characterized. However, a growing number of small proteins have crucial roles in bacterial physiology and virulence. Here, we present a Ribo-seq-based high-resolution translatome map for the major foodborne pathogen Campylobacter jejuni. Besides conventional Ribo-seq, we employed translation initiation site (TIS) profiling to map start codons and also developed a translation termination site (TTS) profiling approach, which revealed stop codons not apparent from the reference genome in virulence loci. Our integrated approach, combined with independent validation, expanded the small proteome two-fold, including CioY, a new 34 aa component of the CioAB oxidase. Overall, our study generates a high-resolution annotation of the C. jejuni coding landscape, provided in an interactive browser, and showcases a strategy for applying integrated Ribo-seq to other species to enrich our understanding of small proteomes.
Affiliation(s)
- Kathrin Froschauer
  - University of Würzburg, Institute of Molecular Infection Biology, Department of Molecular Infection Biology II, Würzburg, Germany
- Sarah L Svensson
  - University of Würzburg, Institute of Molecular Infection Biology, Department of Molecular Infection Biology II, Würzburg, Germany
  - The Center for Microbes, Development and Health, CAS Key Laboratory of Molecular Virology and Immunology, Shanghai Institute of Immunity and Infection, Chinese Academy of Sciences, Shanghai, China
- Rick Gelhausen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Elisabetta Fiore
  - University of Würzburg, Institute of Molecular Infection Biology, Department of Molecular Infection Biology II, Würzburg, Germany
- Philipp Kible
  - University of Würzburg, Institute of Molecular Infection Biology, Department of Molecular Infection Biology II, Würzburg, Germany
- Alicia Klaude
  - Technische Universität Braunschweig, Institute for Microbiology, Braunschweig, Germany
  - Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Martin Kucklick
  - Technische Universität Braunschweig, Institute for Microbiology, Braunschweig, Germany
  - Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Stephan Fuchs
  - Robert Koch Institute, Methodenentwicklung und Forschungsinfrastruktur (MF), Berlin, Germany
- Florian Eggenhofer
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Chao Yang
  - The Center for Microbes, Development and Health, CAS Key Laboratory of Molecular Virology and Immunology, Shanghai Institute of Immunity and Infection, Chinese Academy of Sciences, Shanghai, China
- Daniel Falush
  - The Center for Microbes, Development and Health, CAS Key Laboratory of Molecular Virology and Immunology, Shanghai Institute of Immunity and Infection, Chinese Academy of Sciences, Shanghai, China
- Susanne Engelmann
  - Technische Universität Braunschweig, Institute for Microbiology, Braunschweig, Germany
  - Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
  - Signalling Research Centre CIBSS, University of Freiburg, Freiburg, Germany
- Cynthia M Sharma
  - University of Würzburg, Institute of Molecular Infection Biology, Department of Molecular Infection Biology II, Würzburg, Germany
3
Eight Unexpected Selenoprotein Families in Organometallic Biochemistry in Clostridium difficile, in ABC Transport, and in Methylmercury Biosynthesis. J Bacteriol 2023; 205:e0025922. [PMID: 36598231 PMCID: PMC9879109 DOI: 10.1128/jb.00259-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2023] Open
Abstract
The bioinformatics of a nine-gene locus, designated selenocysteine-assisted organometallic (SAO), was investigated after identifying six new selenoprotein families and constructing hidden Markov models (HMMs) that find and annotate members of those families. Four are selenoproteins in most SAO loci, including Clostridium difficile. They include two ABC transporter subunits, namely, permease SaoP, with selenocysteine (U) at the channel-gating position, and substrate-binding subunit SaoB. Cytosolic selenoproteins include SaoL, homologous to MerB organomercurial lyases from mercury resistance loci, and SaoT, related to thioredoxins. SaoL, SaoB, and surface protein SaoC (an occasional selenoprotein) share an unusual CU dipeptide motif, which is rare in selenoproteins but found in selenoprotein variants of mercury resistance transporter subunit MerT. A nonselenoprotein, SaoE, shares homology with Cu/Zn efflux and arsenical efflux pumps. The organization of the SAO system suggests substrate interaction with surface-exposed selenoproteins, followed by import, metabolism that may cleave a carbon-to-heavy metal bond, and finally metal efflux. A novel type of mercury resistance is possible, but SAO instead may support fermentative metabolism, with selenocysteine-mediated formation of organometallic intermediates, followed by import, degradation, and metal efflux. Phylogenetic profiling shows SAO loci consistently co-occur with Stickland fermentation markers but even more consistently with 8Fe-9S cofactor-type double-cubane proteins. Hypothesizing that the SAO system forms organometallic intermediates, we investigated the known methylmercury formation protein families HgcA and HgcB. Both families contained overlooked selenoproteins. Most HgcAs have a CU motif N terminal to their previously accepted start sites. Seeking additional rare and overlooked selenoproteins may help reveal more cryptic aspects of microbial biochemistry.
IMPORTANCE This work adds 8 novel prokaryotic selenoproteins to the 80 or so families previously known. It describes the SAO (selenocysteine-assisted organometallic) locus, with the most selenoproteins of any known system. The rare CU motif recurs throughout, suggesting the formation and degradation of organometallic compounds. That suggestion triggered a reexamination of HgcA and HgcB, which are methylmercury formation proteins that can adversely impact food safety. Both are selenoproteins, once corrected, with HgcA again showing a CU motif. The SAO system is plausibly a mercury resistance locus for selenium-dependent anaerobes. But instead, it may exploit heavy metals as cofactors in organometallic intermediate-forming pathways that circumvent high activation energies and facilitate the breakdown of otherwise poorly accessible nutrients. SAO could provide an edge that helps Clostridium difficile, an important pathogen, establish disease.
4
Santesmasses D, Mariotti M, Gladyshev VN. Bioinformatics of Selenoproteins. Antioxid Redox Signal 2020; 33:525-536. [PMID: 32031018 PMCID: PMC7409585 DOI: 10.1089/ars.2020.8044] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 01/28/2020] [Accepted: 02/05/2020] [Indexed: 12/13/2022]
Abstract
Significance: Bioinformatics has brought important insights into the field of selenium research. The progress made in the development of computational tools in the last two decades, coordinated with growing genome resources, provided new opportunities to study selenoproteins. The present review discusses existing tools for selenoprotein gene finding and other bioinformatic approaches to study the biology of selenium. Recent Advances: The availability of complete selenoproteomes allowed assessing a global distribution of the use of selenocysteine (Sec) across the tree of life, as well as studying the evolution of selenoproteins and their biosynthetic pathway. Beyond gene identification and characterization, human genetic variants in selenoprotein genes were used to examine adaptations to selenium levels in diverse human populations and to estimate selective constraints against gene loss. Critical Issues: The synthesis of selenoproteins is essential for development in mice. In humans, several mutations in selenoprotein genes have been linked to rare congenital disorders. And yet, the mechanism of Sec insertion and the regulation of selenoprotein synthesis in mammalian cells are not completely understood. Future Directions: Omics technologies offer new possibilities to study selenoproteins and mechanisms of Sec incorporation in cells, tissues, and organisms.
Affiliation(s)
- Didac Santesmasses
  - Division of Genetics, Department of Medicine, Harvard Medical School, Brigham and Women's Hospital, Boston, Massachusetts, USA
- Marco Mariotti
  - Division of Genetics, Department of Medicine, Harvard Medical School, Brigham and Women's Hospital, Boston, Massachusetts, USA
- Vadim N. Gladyshev
  - Division of Genetics, Department of Medicine, Harvard Medical School, Brigham and Women's Hospital, Boston, Massachusetts, USA
5
Zhang Y, Zheng J. Bioinformatics of Metalloproteins and Metalloproteomes. Molecules 2020; 25:molecules25153366. [PMID: 32722260 PMCID: PMC7435645 DOI: 10.3390/molecules25153366] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 06/17/2020] [Revised: 07/17/2020] [Accepted: 07/22/2020] [Indexed: 12/14/2022] Open
Abstract
Trace metals are inorganic elements that are required for all organisms in very low quantities. They serve as cofactors and activators of metalloproteins involved in a variety of key cellular processes. While substantial effort has been made in experimental characterization of metalloproteins and their functions, the application of bioinformatics in the research of metalloproteins and metalloproteomes is still limited. In the last few years, computational prediction and comparative genomics of metalloprotein genes have arisen, which provide significant insights into their distribution, function, and evolution in nature. This review aims to offer an overview of recent advances in bioinformatic analysis of metalloproteins, mainly focusing on metalloprotein prediction and the use of different metals across the tree of life. We describe current computational approaches for the identification of metalloprotein genes and metal-binding sites/patterns in proteins, and then introduce a set of related databases. Furthermore, we discuss the latest research progress in comparative genomics of several important metals in both prokaryotes and eukaryotes, which demonstrates divergent and dynamic evolutionary patterns of different metalloprotein families and metalloproteomes. Overall, bioinformatic studies of metalloproteins provide a foundation for systematic understanding of trace metal utilization in all three domains of life.
Affiliation(s)
- Yan Zhang
  - Shenzhen Key Laboratory of Marine Bioresources and Ecology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen 518055, China
  - Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, Shenzhen 518055, China
  - Shenzhen Bay Laboratory, Shenzhen 518055, China
  - Correspondence: Tel.: +86-755-2692-2024
- Junge Zheng
  - Shenzhen Key Laboratory of Marine Bioresources and Ecology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen 518055, China
  - Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, Shenzhen 518055, China
  - Shenzhen Bay Laboratory, Shenzhen 518055, China
6
Casas E, Cai G, Kuehn LA, Register KB, McDaneld TG, Neill JD. Association of Circulating Transfer RNA fragments with antibody response to Mycoplasma bovis in beef cattle. BMC Vet Res 2018. [PMID: 29534724 PMCID: PMC5851088 DOI: 10.1186/s12917-018-1418-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 01/03/2023] Open
Abstract
Background: High throughput sequencing allows identification of small non-coding RNAs. Transfer RNA fragments are a class of small non-coding RNAs, and have been identified as being involved in inhibition of gene expression. Given their role, it is possible they may be involved in mediating the infection-induced defense response in the host. Therefore, the objective of this study was to identify 5′ transfer RNA fragments (tRF5s) associated with a serum antibody response to M. bovis in beef cattle. Results: The tRF5s encoding alanine, glutamic acid, glycine, lysine, proline, selenocysteine, threonine, and valine were associated (P < 0.05) with antibody response against M. bovis. tRF5s encoding alanine, glutamine, glutamic acid, glycine, histidine, lysine, proline, selenocysteine, threonine, and valine were associated (P < 0.05) with season, which could be attributed to calf growth. There were interactions (P < 0.05) between antibody response to M. bovis and season for tRF5 encoding selenocysteine (anticodon UGA), proline (anticodon CGG), and glutamine (anticodon TTG). Selenocysteine is a rarely used amino acid that is incorporated into proteins by the opal stop codon (UGA), and its function is not well understood. Conclusions: Differential expression of tRF5s was identified between ELISA-positive and negative animals. Production of tRF5s may be associated with a host defense mechanism triggered by bacterial infection, or it may provide some advantage to a pathogen during infection of a host. Further studies are needed to establish if tRF5s could be used as a diagnostic marker of chronic exposure.
Affiliation(s)
- Eduardo Casas
  - USDA, ARS, National Animal Disease Center, Ames, IA, 50010, USA
- Guohong Cai
  - USDA, ARS, National Animal Disease Center, Ames, IA, 50010, USA
- Larry A Kuehn
  - USDA, ARS, U.S. Meat Animal Research Center, Clay Center, NE, 68933, USA
- Tara G McDaneld
  - USDA, ARS, U.S. Meat Animal Research Center, Clay Center, NE, 68933, USA
- John D Neill
  - USDA, ARS, National Animal Disease Center, Ames, IA, 50010, USA
7
Fu X, Söll D, Sevostyanova A. Challenges of site-specific selenocysteine incorporation into proteins by Escherichia coli. RNA Biol 2018; 15:461-470. [PMID: 29447106 DOI: 10.1080/15476286.2018.1440876] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Indexed: 02/08/2023] Open
Abstract
Selenocysteine (Sec), a rare genetically encoded amino acid with unusual chemical properties, is of great interest for protein engineering. Sec is synthesized on its cognate tRNA (tRNASec) by the concerted action of several enzymes. While all other aminoacyl-tRNAs are delivered to the ribosome by the elongation factor Tu (EF-Tu), Sec-tRNASec requires a dedicated factor, SelB. Incorporation of Sec into protein requires recoding of the stop codon UGA aided by a specific mRNA structure, the SECIS element. This unusual biogenesis restricts the use of Sec in recombinant proteins, limiting our ability to study the properties of selenoproteins. Several methods are currently available for the synthesis of selenoproteins. Here we focus on strategies for in vivo Sec insertion at any position(s) within a recombinant protein in a SECIS-independent manner: (i) engineering of tRNASec for use by EF-Tu without the SECIS requirement, and (ii) design of a SECIS-independent SelB route.
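Because Sec insertion recodes the stop codon UGA, a first screening step for candidate Sec sites is to list in-frame UGA codons that occur before the terminal codon of a coding sequence. The sketch below illustrates only that step; the function name `in_frame_uga_codons` is hypothetical, the input is assumed to start at the first base of the start codon, and an in-frame UGA is only a candidate, since natural recoding also depends on the SECIS element, which this code does not check.

```python
from typing import List

def in_frame_uga_codons(cds: str) -> List[int]:
    """Return 0-based codon indices of in-frame UGA/TGA codons,
    excluding the final codon (the presumed true terminator)."""
    # Accept RNA or DNA input by normalizing U to T.
    seq = cds.upper().replace("U", "T")
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # Skip the last codon so an ordinary TGA terminator is not reported.
    return [i for i, codon in enumerate(codons[:-1]) if codon == "TGA"]
```

For the invented sequence `ATGTGAGCCTAA`, the second codon (index 1) would be reported, whereas `ATGGCCTGA` yields an empty list because its only TGA is the terminal codon.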
Affiliation(s)
- Xian Fu
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Dieter Söll
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
  - Department of Chemistry, Yale University, New Haven, CT, USA
- Anastasia Sevostyanova
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
8
Miller WG, Yee E, Lopes BS, Chapman MH, Huynh S, Bono JL, Parker CT, Strachan NJC, Forbes KJ. Comparative Genomic Analysis Identifies a Campylobacter Clade Deficient in Selenium Metabolism. Genome Biol Evol 2017; 9:1843-1858. [PMID: 28854596 PMCID: PMC5570042 DOI: 10.1093/gbe/evx093] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Accepted: 05/09/2017] [Indexed: 12/19/2022] Open
Abstract
The nonthermotolerant Campylobacter species C. fetus, C. hyointestinalis, C. iguaniorum, and C. lanienae form a distinct phylogenetic cluster within the genus. These species are primarily isolated from foraging (swine) or grazing (e.g., cattle, sheep) animals and cause sporadic and infrequent human illness. Previous typing studies identified three putative novel C. lanienae-related taxa, based on either MLST or atpA sequence data. To further characterize these putative novel taxa and the C. fetus group as a whole, 76 genomes were sequenced, either to completion or to draft level. These genomes represent 26 C. lanienae strains and 50 strains of the three novel taxa. C. fetus, C. hyointestinalis and C. iguaniorum genomes were previously sequenced to completion; therefore, a comparative genomic analysis across the entire C. fetus group was conducted (including average nucleotide identity analysis) that supports the initial identification of these three novel Campylobacter species. Furthermore, C. lanienae and the three putative novel species form a discrete clade within the C. fetus group, which we have termed the C. lanienae clade. This clade is distinguished from other members of the C. fetus group by a reduced genome size and distinct CRISPR/Cas systems. Moreover, there are two signature characteristics of the C. lanienae clade. C. lanienae clade genomes carry four to ten unlinked and similar, but nonidentical, flagellin genes. Additionally, all 76 C. lanienae clade genomes sequenced demonstrate a complete absence of genes related to selenium metabolism, including genes encoding the selenocysteine insertion machinery, selenoproteins, and the selenocysteinyl tRNA.
Affiliation(s)
- William G Miller
  - Produce Safety and Microbiology Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Albany, CA
- Emma Yee
  - Produce Safety and Microbiology Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Albany, CA
- Bruno S Lopes
  - School of Medicine, Medical Sciences and Nutrition, University of Aberdeen, United Kingdom
- Mary H Chapman
  - Produce Safety and Microbiology Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Albany, CA
- Steven Huynh
  - Produce Safety and Microbiology Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Albany, CA
- James L Bono
  - Meat Safety and Quality Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Clay Center, NE
- Craig T Parker
  - Produce Safety and Microbiology Research Unit, Agricultural Research Service, U.S. Department of Agriculture, Albany, CA
- Norval J C Strachan
  - School of Medicine, Medical Sciences and Nutrition, University of Aberdeen, United Kingdom
- Ken J Forbes
  - School of Medicine, Medical Sciences and Nutrition, University of Aberdeen, United Kingdom
9
Junqueira ACM, Ratan A, Acerbi E, Drautz-Moses DI, Premkrishnan BNV, Costea PI, Linz B, Purbojati RW, Paulo DF, Gaultier NE, Subramanian P, Hasan NA, Colwell RR, Bork P, Azeredo-Espin AML, Bryant DA, Schuster SC. The microbiomes of blowflies and houseflies as bacterial transmission reservoirs. Sci Rep 2017; 7:16324. [PMID: 29176730 PMCID: PMC5701178 DOI: 10.1038/s41598-017-16353-x] [Citation(s) in RCA: 85] [Impact Index Per Article: 10.6] [Received: 06/29/2017] [Accepted: 11/10/2017] [Indexed: 12/25/2022] Open
Abstract
Blowflies and houseflies are mechanical vectors inhabiting synanthropic environments around the world. They feed and breed in fecal and decaying organic matter, but the microbiome they harbour and transport is largely uncharacterized. We sampled 116 individual houseflies and blowflies from varying habitats on three continents and subjected them to high-coverage, whole-genome shotgun sequencing. This allowed for genomic and metagenomic analyses of the host-associated microbiome at the species level. Both fly host species segregate based on principal coordinate analysis of their microbial communities, but they also show an overlapping core microbiome. Legs and wings displayed the largest microbial diversity and were shown to be an important route for microbial dispersion. The environmental sequencing approach presented here detected a stochastic distribution of human pathogens, such as Helicobacter pylori, thereby demonstrating the potential of flies as proxies for environmental and public health surveillance.
Affiliation(s)
- Ana Carolina M Junqueira
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, 637551, Singapore
  - Departamento de Genética, Instituto de Biologia, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, 21941-902, Brazil
- Aakrosh Ratan
  - Department of Public Health Sciences and Center for Public Health Genomics, University of Virginia, Charlottesville, VA, 22908, USA
- Enzo Acerbi
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, 637551, Singapore
- Daniela I Drautz-Moses
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, 637551, Singapore
- Balakrishnan N V Premkrishnan
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, 637551, Singapore
- Paul I Costea
  - European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, 69117, Germany
- Bodo Linz
  - Center for Vaccines and Immunology, College of Veterinary Medicine, University of Georgia, Athens, 30602, GA, USA
- Rikky W Purbojati
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, 637551, Singapore
- Daniel F Paulo
  - Centro de Biologia Molecular e Engenharia Genética, Departamento de Genética, Evolução e Bioagentes, Instituto de Biologia, Universidade Estadual de Campinas, Campinas, SP, 13083-875, Brazil
- Nicolas E Gaultier
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, 637551, Singapore
- Nur A Hasan
  - CosmosID Inc, Rockville, MD, 20850, USA
  - Center for Bioinformatics and Computational Biology, University of Maryland. Institute for Computational Biology, University of Maryland College Park, College Park, MD, 20742, USA
- Rita R Colwell
  - CosmosID Inc, Rockville, MD, 20850, USA
  - Center for Bioinformatics and Computational Biology, University of Maryland. Institute for Computational Biology, University of Maryland College Park, College Park, MD, 20742, USA
  - Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
- Peer Bork
  - European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, 69117, Germany
- Ana Maria L Azeredo-Espin
  - Centro de Biologia Molecular e Engenharia Genética, Departamento de Genética, Evolução e Bioagentes, Instituto de Biologia, Universidade Estadual de Campinas, Campinas, SP, 13083-875, Brazil
- Donald A Bryant
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, 637551, Singapore
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
- Stephan C Schuster
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, 637551, Singapore
10
Mukai T, Vargas-Rodriguez O, Englert M, Tripp HJ, Ivanova NN, Rubin EM, Kyrpides NC, Söll D. Transfer RNAs with novel cloverleaf structures. Nucleic Acids Res 2017; 45:2776-2785. [PMID: 28076288 PMCID: PMC5389517 DOI: 10.1093/nar/gkw898] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 08/02/2016] [Accepted: 09/30/2016] [Indexed: 01/16/2023] Open
Abstract
We report the identification of novel tRNA species with 12-base pair amino-acid acceptor branches composed of a longer acceptor stem and a shorter T-stem. While canonical tRNAs have a 7/5 configuration of the branch, the novel tRNAs have either an 8/4 or a 9/3 structure. They were found during the search for selenocysteine tRNAs in terabytes of genome, metagenome and metatranscriptome sequences. Certain bacteria and their phages employ the 8/4 structure for serine and histidine tRNAs, while minor cysteine and selenocysteine tRNA species may have a modified 8/4 structure with one bulge nucleotide. In Acidobacteria, tRNAs with 8/4 and 9/3 structures may function as missense and nonsense suppressor tRNAs and/or regulatory noncoding RNAs. In δ-proteobacteria, an additional cysteine tRNA with an 8/4 structure mimics selenocysteine tRNA and may function as an opal suppressor. We examined the potential translation function of suppressor tRNA species in Escherichia coli; tRNAs with 8/4 or 9/3 structures efficiently inserted serine, alanine and cysteine in response to stop and sense codons, depending on the identity element and anticodon sequence of the tRNA. These findings expand our view of how tRNA, and possibly the genetic code, is diversified in nature.
Affiliation(s)
- Takahito Mukai
  - Department of Molecular Biophysics and Biochemistry, New Haven, CT 06520, USA
- Markus Englert
  - Department of Molecular Biophysics and Biochemistry, New Haven, CT 06520, USA
- H James Tripp
  - Department of Energy Joint Genome Institute (DOE JGI), Walnut Creek, CA 94598, USA
- Natalia N Ivanova
  - Department of Energy Joint Genome Institute (DOE JGI), Walnut Creek, CA 94598, USA
- Edward M Rubin
  - Department of Energy Joint Genome Institute (DOE JGI), Walnut Creek, CA 94598, USA
- Nikos C Kyrpides
  - Department of Energy Joint Genome Institute (DOE JGI), Walnut Creek, CA 94598, USA
- Dieter Söll
  - Department of Molecular Biophysics and Biochemistry, New Haven, CT 06520, USA
  - Department of Chemistry, Yale University, New Haven, CT 06520, USA
11
Santesmasses D, Mariotti M, Guigó R. Computational identification of the selenocysteine tRNA (tRNASec) in genomes. PLoS Comput Biol 2017; 13:e1005383. [PMID: 28192430 PMCID: PMC5330540 DOI: 10.1371/journal.pcbi.1005383] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Received: 06/01/2016] [Revised: 02/28/2017] [Accepted: 01/26/2017] [Indexed: 12/11/2022] Open
Abstract
Selenocysteine (Sec) is known as the 21st amino acid, a cysteine analogue with selenium replacing sulphur. Sec is inserted co-translationally in a small fraction of proteins called selenoproteins. In selenoprotein genes, the Sec-specific tRNA (tRNASec) drives the recoding of highly specific UGA codons from stop signals to Sec. Although found in organisms from the three domains of life, Sec is not universal. Many species are completely devoid of selenoprotein genes and lack the ability to synthesize Sec. Since tRNASec is a key component in selenoprotein biosynthesis, its efficient identification in genomes is instrumental in characterizing the utilization of Sec across lineages. Available tRNA prediction methods fail to accurately predict tRNASec, due to its unusual structural fold. Here, we present Secmarker, a method based on manually curated covariance models capturing the specific tRNASec structure in archaea, bacteria and eukaryotes. We exploited the non-universality of Sec to build a proper benchmark set for tRNASec predictions, which is not possible for the predictions of other tRNAs. We show that Secmarker greatly improves the accuracy of previously existing methods, constituting a valuable tool to identify tRNASec genes, and to efficiently determine whether a genome contains selenoproteins. We used Secmarker to analyze a large set of fully sequenced genomes, and the results revealed new insights into the biology of tRNASec, led to the discovery of a novel bacterial selenoprotein family, and shed additional light on the phylogenetic distribution of selenoprotein-containing genomes. Secmarker is freely accessible for download, or online analysis through a web server at http://secmarker.crg.cat.

Most proteins are made of twenty amino acids. However, there is a small group of proteins that incorporate a 21st amino acid, Selenocysteine (Sec). These proteins are called selenoproteins and are present in some, but not all, species from the three domains of life. Sec is inserted in selenoproteins in response to the UGA codon, normally a stop codon. A Sec-specific tRNA (tRNASec), which only exists in the organisms that synthesize selenoproteins, recognizes the UGA codon. tRNASec is not only indispensable for Sec incorporation into selenoproteins, but also for Sec synthesis, since Sec is synthesized on its own tRNA. The structure of tRNASec differs from that of canonical tRNAs, and general tRNA detection methods fail to accurately predict it. We developed Secmarker, a tRNASec-specific identification tool based on the characteristic structural features of the tRNASec. Our benchmark shows that Secmarker produces nearly flawless tRNASec predictions. We used Secmarker to scan all currently available genome sequences. The analysis of the highly accurate predictions obtained revealed new insights into the biology of tRNASec.
Affiliation(s)
- Didac Santesmasses
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Institut Hospital del Mar d’Investigacions Mèdiques (IMIM), Barcelona, Spain
- * E-mail: (DS); (MM)
- Marco Mariotti
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Institut Hospital del Mar d’Investigacions Mèdiques (IMIM), Barcelona, Spain
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- * E-mail: (DS); (MM)
- Roderic Guigó
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Institut Hospital del Mar d’Investigacions Mèdiques (IMIM), Barcelona, Spain
|
12
|
Comparative genomics reveals new evolutionary and ecological patterns of selenium utilization in bacteria. ISME JOURNAL 2016; 10:2048-59. [PMID: 26800233 PMCID: PMC5029168 DOI: 10.1038/ismej.2015.246] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 06/02/2015] [Revised: 10/28/2015] [Accepted: 11/27/2015] [Indexed: 12/15/2022]
Abstract
Selenium (Se) is an important micronutrient for many organisms, which is required for the biosynthesis of selenocysteine, selenouridine, and a Se-containing cofactor. Several key genes involved in different Se utilization traits have been characterized; however, systematic studies on the evolution and ecological niches of Se utilization are very limited. Here, we analyzed more than 5200 sequenced organisms to examine the occurrence patterns of all Se traits in bacteria. A global species map of all Se utilization pathways has been generated, which provides the most detailed view of Se utilization in bacteria so far. In addition, the selenophosphate synthetase gene, which is used to define the overall Se utilization, was also detected in some organisms that do not have any of the known Se traits, implying the presence of a novel Se form in this domain. Phylogenetic analyses of components of different Se utilization traits revealed new horizontal gene transfer events for each of them. Moreover, by characterizing the selenoproteomes of all organisms, we found a new selenoprotein-rich phylum and additional selenoprotein-rich species. Finally, the relationship between ecological environments and Se utilization was investigated and further verified by metagenomic analysis of environmental samples, which indicates new macroevolutionary trends of each Se utilization trait in bacteria. Our data provide insights into the general features of Se utilization in bacteria and should be useful for a further understanding of the evolutionary dynamics of Se utilization in nature.
|