1
Froschauer K, Svensson SL, Gelhausen R, Fiore E, Kible P, Klaude A, Kucklick M, Fuchs S, Eggenhofer F, Yang C, Falush D, Engelmann S, Backofen R, Sharma CM. Complementary Ribo-seq approaches map the translatome and provide a small protein census in the foodborne pathogen Campylobacter jejuni. Nat Commun 2025; 16:3078. [PMID: 40159498; PMCID: PMC11955535; DOI: 10.1038/s41467-025-58329-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Accepted: 03/18/2025] [Indexed: 04/02/2025] Open
Abstract
In contrast to transcriptome maps, bacterial small protein (≤50-100 aa) coding landscapes, including overlapping genes, are poorly characterized. However, a growing number of small proteins have crucial roles in bacterial physiology and virulence. Here, we present a Ribo-seq-based high-resolution translatome map for the major foodborne pathogen Campylobacter jejuni. Besides conventional Ribo-seq, we employed translation initiation site (TIS) profiling to map start codons and also developed a translation termination site (TTS) profiling approach, which revealed stop codons not apparent from the reference genome in virulence loci. Our integrated approach combined with independent validation expanded the small proteome by two-fold, including CioY, a new 34 aa component of the CioAB oxidase. Overall, our study generates a high-resolution annotation of the C. jejuni coding landscape, provided in an interactive browser, and showcases a strategy for applying integrated Ribo-seq to other species to enrich our understanding of small proteomes.
Affiliation(s)
- Kathrin Froschauer
  - University of Würzburg, Institute of Molecular Infection Biology, Department of Molecular Infection Biology II, Würzburg, Germany
- Sarah L Svensson
  - University of Würzburg, Institute of Molecular Infection Biology, Department of Molecular Infection Biology II, Würzburg, Germany
  - The Center for Microbes, Development and Health, CAS Key Laboratory of Molecular Virology and Immunology, Shanghai Institute of Immunity and Infection, Chinese Academy of Sciences, Shanghai, China
- Rick Gelhausen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Elisabetta Fiore
  - University of Würzburg, Institute of Molecular Infection Biology, Department of Molecular Infection Biology II, Würzburg, Germany
- Philipp Kible
  - University of Würzburg, Institute of Molecular Infection Biology, Department of Molecular Infection Biology II, Würzburg, Germany
- Alicia Klaude
  - Technische Universität Braunschweig, Institute for Microbiology, Braunschweig, Germany
  - Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Martin Kucklick
  - Technische Universität Braunschweig, Institute for Microbiology, Braunschweig, Germany
  - Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Stephan Fuchs
  - Robert Koch Institute, Methodenentwicklung und Forschungsinfrastruktur (MF), Berlin, Germany
- Florian Eggenhofer
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Chao Yang
  - The Center for Microbes, Development and Health, CAS Key Laboratory of Molecular Virology and Immunology, Shanghai Institute of Immunity and Infection, Chinese Academy of Sciences, Shanghai, China
- Daniel Falush
  - The Center for Microbes, Development and Health, CAS Key Laboratory of Molecular Virology and Immunology, Shanghai Institute of Immunity and Infection, Chinese Academy of Sciences, Shanghai, China
- Susanne Engelmann
  - Technische Universität Braunschweig, Institute for Microbiology, Braunschweig, Germany
  - Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
  - Signalling Research Centre CIBSS, University of Freiburg, Freiburg, Germany
- Cynthia M Sharma
  - University of Würzburg, Institute of Molecular Infection Biology, Department of Molecular Infection Biology II, Würzburg, Germany

2
Lim CS, Gibbon AK, Tran Nguyen AT, Chieng GSW, Brown CM. RIBOSS detects novel translational events by combining long- and short-read transcriptome and translatome profiling. Brief Bioinform 2025; 26:bbaf164. [PMID: 40221960; PMCID: PMC11994033; DOI: 10.1093/bib/bbaf164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2024] [Revised: 03/18/2025] [Accepted: 03/23/2025] [Indexed: 04/15/2025] Open
Abstract
Ribosome profiling is a high-throughput sequencing technique that captures the positions of translating ribosomes on RNAs. Recent advancements in ribosome profiling include achieving highly phased ribosome footprints for plant translatomes and more recently for bacterial translatomes. This substantially increases the specificity of detecting open reading frames (ORFs) that can be translated, such as small ORFs located upstream and downstream of the annotated ORFs. However, most genomes (e.g. bacterial genomes) lack the annotations for the transcription start and termination sites. This hinders the systematic discovery of novel ORFs in the 'untranslated' regions in ribosome profiling data. Here, we develop a new computational pipeline called RIBOSS to discover noncanonical ORFs and assess their translational potential against annotated ORFs. The RIBOSS Python modules are versatile, and we use them to analyse both prokaryotic and eukaryotic data. We present a resulting list of noncanonical ORFs with high translational potential in Homo sapiens, Arabidopsis thaliana, and Salmonella enterica. We further illustrate RIBOSS utility when studying organisms with incomplete transcriptome annotations. We leverage long-read and short-read data for reference-guided transcriptome assembly and highly phased ribosome profiling data for detecting novel translational events in the assembled transcriptome for S. enterica. In sum, RIBOSS is the first integrated computational pipeline for noncanonical ORF detection and translational potential assessment that incorporates long- and short-read sequencing technologies to investigate translation. RIBOSS is freely available at https://github.com/lcscs12345/riboss.
Affiliation(s)
- Chun Shen Lim
  - Department of Biochemistry, School of Biomedical Sciences, University of Otago, 710 Cumberland Street, Dunedin North, Dunedin 9016, New Zealand
  - Genetics Otago, University of Otago, 710 Cumberland Street, Dunedin North, Dunedin 9016, New Zealand
- Alexandra K Gibbon
  - Department of Biochemistry, School of Biomedical Sciences, University of Otago, 710 Cumberland Street, Dunedin North, Dunedin 9016, New Zealand
  - Genetics Otago, University of Otago, 710 Cumberland Street, Dunedin North, Dunedin 9016, New Zealand
- Anh Thu Tran Nguyen
  - Department of Biochemistry, School of Biomedical Sciences, University of Otago, 710 Cumberland Street, Dunedin North, Dunedin 9016, New Zealand
  - Genetics Otago, University of Otago, 710 Cumberland Street, Dunedin North, Dunedin 9016, New Zealand
- Gabrielle S W Chieng
  - Department of Biochemistry, School of Biomedical Sciences, University of Otago, 710 Cumberland Street, Dunedin North, Dunedin 9016, New Zealand
  - Genetics Otago, University of Otago, 710 Cumberland Street, Dunedin North, Dunedin 9016, New Zealand
- Chris M Brown
  - Department of Biochemistry, School of Biomedical Sciences, University of Otago, 710 Cumberland Street, Dunedin North, Dunedin 9016, New Zealand
  - Genetics Otago, University of Otago, 710 Cumberland Street, Dunedin North, Dunedin 9016, New Zealand

3
Haseltine WA, Patarca R. The RNA Revolution in the Central Molecular Biology Dogma Evolution. Int J Mol Sci 2024; 25:12695. [PMID: 39684407; DOI: 10.3390/ijms252312695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2024] [Revised: 11/24/2024] [Accepted: 11/25/2024] [Indexed: 12/18/2024] Open
Abstract
Human genome projects in the 1990s identified about 20,000 protein-coding sequences. We are now in the RNA revolution, propelled by the realization that genes determine phenotype beyond the foundational central molecular biology dogma, stating that inherited linear pieces of DNA are transcribed to RNAs and translated into proteins. Crucially, over 95% of the genome, initially considered junk DNA between protein-coding genes, encodes essential, functionally diverse non-protein-coding RNAs, raising the gene count by at least one order of magnitude. Most inherited phenotype-determining changes in DNA are in regulatory areas that control RNA and regulatory sequences. RNAs can directly or indirectly determine phenotypes by regulating protein and RNA function, transferring information within and between organisms, and generating DNA. RNAs also exhibit high structural, functional, and biomolecular interaction plasticity and are modified via editing, methylation, glycosylation, and other mechanisms, which bestow them with diverse intra- and extracellular functions without altering the underlying DNA. RNA is, therefore, currently considered the primary determinant of cellular to populational functional diversity, disease-linked and biomolecular structural variations, and cell function regulation. As demonstrated by RNA-based coronavirus vaccines' success, RNA technology is transforming medicine, agriculture, and industry, as did the advent of recombinant DNA technology in the 1980s.
Affiliation(s)
- William A Haseltine
  - Access Health International, 384 West Lane, Ridgefield, CT 06877, USA
  - Feinstein Institutes for Medical Research, 350 Community Dr, Manhasset, NY 11030, USA
- Roberto Patarca
  - Access Health International, 384 West Lane, Ridgefield, CT 06877, USA
  - Feinstein Institutes for Medical Research, 350 Community Dr, Manhasset, NY 11030, USA

4
Chanin RB, West PT, Wirbel J, Gill MO, Green GZM, Park RM, Enright N, Miklos AM, Hickey AS, Brooks EF, Lum KK, Cristea IM, Bhatt AS. Intragenic DNA inversions expand bacterial coding capacity. Nature 2024; 634:234-242. [PMID: 39322669; DOI: 10.1038/s41586-024-07970-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 08/20/2024] [Indexed: 09/27/2024]
Abstract
Bacterial populations that originate from a single bacterium are not strictly clonal and often contain subgroups with distinct phenotypes [1]. Bacteria can generate heterogeneity through phase variation, a preprogrammed, reversible mechanism that alters gene expression levels across a population [1]. One well-studied type of phase variation involves enzyme-mediated inversion of specific regions of genomic DNA [2]. Frequently, these DNA inversions flip the orientation of promoters, turning transcription of adjacent coding regions on or off [2]. Through this mechanism, inversion can affect fitness, survival or group dynamics [3,4]. Here, we describe the development of PhaVa, a computational tool that identifies DNA inversions using long-read datasets. We also identify 372 'intragenic invertons', a novel class of DNA inversions found entirely within genes, in genomes of bacterial and archaeal isolates. Intragenic invertons allow a gene to encode two or more versions of a protein by flipping a DNA sequence within the coding region, thereby increasing coding capacity without increasing genome size. We validate ten intragenic invertons in the gut commensal Bacteroides thetaiotaomicron, and experimentally characterize an intragenic inverton in the thiamine biosynthesis gene thiC.
Affiliation(s)
- Rachael B Chanin
  - Department of Medicine, Division of Hematology, Stanford University, Stanford, CA, USA
- Patrick T West
  - Department of Medicine, Division of Hematology, Stanford University, Stanford, CA, USA
- Jakob Wirbel
  - Department of Medicine, Division of Hematology, Stanford University, Stanford, CA, USA
- Matthew O Gill
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Gabriella Z M Green
  - Department of Medicine, Division of Hematology, Stanford University, Stanford, CA, USA
- Ryan M Park
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Nora Enright
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
- Arjun M Miklos
  - Department of Medicine, Division of Hematology, Stanford University, Stanford, CA, USA
- Angela S Hickey
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Erin F Brooks
  - Department of Medicine, Division of Hematology, Stanford University, Stanford, CA, USA
- Krystal K Lum
  - Department of Molecular Biology, Princeton University, Princeton, NJ, USA
- Ileana M Cristea
  - Department of Molecular Biology, Princeton University, Princeton, NJ, USA
- Ami S Bhatt
  - Department of Medicine, Division of Hematology, Stanford University, Stanford, CA, USA
  - Department of Genetics, Stanford University, Stanford, CA, USA

5
Mohsen JJ, Mohsen MG, Jiang K, Landajuela A, Quinto L, Isaacs FJ, Karatekin E, Slavoff SA. Cellular function of the GndA small open reading frame-encoded polypeptide during heat shock. bioRxiv 2024:2024.06.29.601336. [PMID: 38979229; PMCID: PMC11230408; DOI: 10.1101/2024.06.29.601336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Over the past 15 years, hundreds of previously undiscovered bacterial small open reading frame (sORF)-encoded polypeptides (SEPs) of fewer than fifty amino acids have been identified, and biological functions have been ascribed to an increasing number of SEPs from intergenic regions and small RNAs. However, despite numbering in the dozens in Escherichia coli, and hundreds to thousands in humans, same-strand nested sORFs that overlap protein coding genes in alternative reading frames remain understudied. In order to provide insight into this enigmatic class of unannotated genes, we characterized GndA, a 36-amino acid, heat shock-regulated SEP encoded within the +2 reading frame of the gnd gene in E. coli K-12 MG1655. We show that GndA pulls down components of respiratory complex I (RCI) and is required for proper localization of an RCI subunit during heat shock. At high temperature, GndA deletion (ΔGndA) cells exhibit perturbations in cell growth, NADH/NAD+ ratio, and expression of a number of genes including several associated with oxidative stress. These findings suggest that GndA may function in maintenance of homeostasis during heat shock. Characterization of GndA therefore supports the nascent but growing consensus that functional, overlapping genes occur in genomes from viruses to humans.
Affiliation(s)
- Jessica J. Mohsen
  - Department of Chemistry, Yale University, New Haven, CT 06511
  - Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516
- Michael G. Mohsen
  - Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06511
  - Howard Hughes Medical Institute, Yale University, New Haven, CT 06511
- Kevin Jiang
  - Department of Chemistry, Yale University, New Haven, CT 06511
  - Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516
- Ane Landajuela
  - Department of Cellular and Molecular Physiology, Yale School of Medicine, New Haven, CT 06510
  - Nanobiology Institute, Yale University, West Haven, CT 06516
- Laura Quinto
  - Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06511
  - Systems Biology Institute, Yale University, West Haven, CT 06516
- Farren J. Isaacs
  - Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06511
  - Systems Biology Institute, Yale University, West Haven, CT 06516
- Erdem Karatekin
  - Department of Cellular and Molecular Physiology, Yale School of Medicine, New Haven, CT 06510
  - Nanobiology Institute, Yale University, West Haven, CT 06516
  - Wu Tsai Institute, Yale University, New Haven, CT 06511
  - Université de Paris, Saints-Pères Paris Institute for the Neurosciences (SPPIN), Centre National de la Recherche Scientifique (CNRS), 75006 Paris, France
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06511
- Sarah A. Slavoff
  - Department of Chemistry, Yale University, New Haven, CT 06511
  - Institute for Biomolecular Design and Discovery, Yale University, West Haven, CT 06516
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06511

6
Deng Z, Liu C, Wang F, Song N, Liu J, Li H, Liu S, Li T, Liu Z, Xiao F, Li W. A Versatile Thioesterase Involved in Dimerization during Cinnamoyl Lipid Biosynthesis. Angew Chem Int Ed Engl 2024; 63:e202402010. [PMID: 38462490; DOI: 10.1002/anie.202402010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2024] [Revised: 03/07/2024] [Accepted: 03/07/2024] [Indexed: 03/12/2024]
Abstract
The cinnamoyl lipid compound youssoufene A1 (1), featuring a unique dearomatic carbon-bridged dimeric skeleton, exhibits increased inhibition against multidrug-resistant Enterococcus faecalis compared to monomeric youssoufenes. However, the formation process of this intriguing dearomatization/dimerization remains unknown. In this study, an unusual "gene-within-gene" thioesterase (TE) gene ysfF was functionally characterized. The gene was found to naturally encode two proteins: an entire YsfF with α/β-hydrolase and 4-hydroxybenzoyl-CoA thioesterase (4-HBT)-like enzyme domains, and a nested YsfFHBT (4-HBT-like enzyme). Using an intracellular tagged carrier-protein tracking (ITCT) strategy, in vitro reconstitution and in vivo experiments, we found that: i) both domains of YsfF displayed thioesterase activities; ii) YsfF/YsfFHBT could accomplish the 6π-electrocyclic ring closure for benzene ring formation; and iii) YsfF and cyclase YsfX together were responsible for the ACP-tethered dearomatization/dimerization process, possibly through an unprecedented Michael-type addition reaction. Moreover, site-directed mutagenesis experiments demonstrated that N301, E483 and H566 of YsfF are critical residues for both the 6π-electrocyclization and dimerization processes. This study enhances our understanding of the multifunctionality of the TE protein family.
Affiliation(s)
- Zirong Deng
  - State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, Shaanxi Key Laboratory of Natural Products & Chemical Biology, College of Chemistry & Pharmacy, Northwest A&F University, Yangling, Shaanxi, 712100, China
  - Key Laboratory of Marine Drugs, Ministry of Education, School of Medicine and Pharmacy, Ocean University of China, Qingdao, Shandong, 266003, China
- Chunni Liu
  - State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, Shaanxi Key Laboratory of Natural Products & Chemical Biology, College of Chemistry & Pharmacy, Northwest A&F University, Yangling, Shaanxi, 712100, China
- Fang Wang
  - Key Laboratory of Marine Drugs, Ministry of Education, School of Medicine and Pharmacy, Ocean University of China, Qingdao, Shandong, 266003, China
- Ni Song
  - Key Laboratory of Marine Drugs, Ministry of Education, School of Medicine and Pharmacy, Ocean University of China, Qingdao, Shandong, 266003, China
- Jing Liu
  - Key Laboratory of Marine Drugs, Ministry of Education, School of Medicine and Pharmacy, Ocean University of China, Qingdao, Shandong, 266003, China
- Huayue Li
  - Key Laboratory of Marine Drugs, Ministry of Education, School of Medicine and Pharmacy, Ocean University of China, Qingdao, Shandong, 266003, China
  - Laboratory for Marine Drugs and Bioproducts of Qingdao National Laboratory for Marine Science and Technology, Qingdao, Shandong, 266237, China
- Siyu Liu
  - State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, Shaanxi Key Laboratory of Natural Products & Chemical Biology, College of Chemistry & Pharmacy, Northwest A&F University, Yangling, Shaanxi, 712100, China
- Tong Li
  - Key Laboratory of Marine Drugs, Ministry of Education, School of Medicine and Pharmacy, Ocean University of China, Qingdao, Shandong, 266003, China
- Zengzhi Liu
  - Key Laboratory of Marine Drugs, Ministry of Education, School of Medicine and Pharmacy, Ocean University of China, Qingdao, Shandong, 266003, China
- Fei Xiao
  - Key Laboratory of Marine Drugs, Ministry of Education, School of Medicine and Pharmacy, Ocean University of China, Qingdao, Shandong, 266003, China
- Wenli Li
  - State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, Shaanxi Key Laboratory of Natural Products & Chemical Biology, College of Chemistry & Pharmacy, Northwest A&F University, Yangling, Shaanxi, 712100, China
  - Key Laboratory of Marine Drugs, Ministry of Education, School of Medicine and Pharmacy, Ocean University of China, Qingdao, Shandong, 266003, China
  - Laboratory for Marine Drugs and Bioproducts of Qingdao National Laboratory for Marine Science and Technology, Qingdao, Shandong, 266237, China

7
Fijalkowski I, Snauwaert V, Van Damme P. Proteins à la carte: riboproteogenomic exploration of bacterial N-terminal proteoform expression. mBio 2024; 15:e0033324. [PMID: 38511928; PMCID: PMC11005335; DOI: 10.1128/mbio.00333-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Accepted: 02/28/2024] [Indexed: 03/22/2024] Open
Abstract
In recent years, it has become evident that the true complexity of bacterial proteomes remains underestimated. Gene annotation tools are known to propagate biases and overlook certain classes of truly expressed proteins, particularly proteoforms (protein isoforms arising from a single gene). Recent (re-)annotation efforts rely heavily on ribosome profiling, which provides a direct readout of translation, to fully describe bacterial proteomes. In this study, we employ a robust riboproteogenomic pipeline to conduct a systematic census of expressed N-terminal proteoform pairs, representing two isoforms encoded by a single gene and arising from annotated and alternative translation initiation, in Salmonella. Intriguingly, condition-dependent changes in relative utilization of annotated and alternative translation initiation sites (TIS) were observed in several cases. This suggests that TIS selection is subject to regulatory control, adding yet another layer of complexity to our understanding of bacterial proteomes.
IMPORTANCE With the emerging theme of genes within genes comprising the existence of alternative open reading frames (ORFs) generated by translation initiation at in-frame start codons, mechanisms that control the relative utilization of annotated and alternative TIS need to be unraveled and our molecular understanding of resulting proteoforms broadened. Utilizing complementary ribosome profiling strategies to map ORF boundaries, we uncovered dual-encoding ORFs generated by in-frame TIS usage in Salmonella. Besides demonstrating that alternative TIS usage may generate proteoforms with different characteristics, such as differential localization and specialized function, quantitative aspects of conditional retapamulin-assisted ribosome profiling (Ribo-RET) translation initiation maps offer unprecedented insights into the relative utilization of annotated and alternative TIS, enabling the exploration of gene regulatory mechanisms that control TIS usage and, consequently, the translation of N-terminal proteoform pairs.
Affiliation(s)
- Igor Fijalkowski
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, Ghent, Belgium
- Valdes Snauwaert
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, Ghent, Belgium
- Petra Van Damme
  - iRIP Unit, Laboratory of Microbiology, Department of Biochemistry and Microbiology, Ghent University, Ghent, Belgium

8
Fuchs S, Engelmann S. Small proteins in bacteria - Big challenges in prediction and identification. Proteomics 2023; 23:e2200421. [PMID: 37609810; DOI: 10.1002/pmic.202200421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 08/03/2023] [Accepted: 08/10/2023] [Indexed: 08/24/2023]
Abstract
Proteins with up to 100 amino acids have been largely overlooked due to the challenges associated with predicting and identifying them using traditional methods. Recent advances in bioinformatics and machine learning, DNA sequencing, RNA and Ribo-seq technologies, and mass spectrometry (MS) have greatly facilitated the detection and characterisation of these elusive proteins in recent years. This has revealed their crucial role in various cellular processes including regulation, signalling and transport, as toxins and as folding helpers for protein complexes. Consequently, the systematic identification and characterisation of these proteins in bacteria have emerged as a prominent field of interest within the microbial research community. This review provides an overview of different strategies for predicting and identifying these proteins on a large scale, leveraging the power of these advanced technologies. Furthermore, the review offers insights into the future developments that may be expected in this field.
Affiliation(s)
- Stephan Fuchs
  - Genome Competence Center (MF1), Department MFI, Robert-Koch-Institut, Berlin, Germany
- Susanne Engelmann
  - Institute for Microbiology, Technische Universität Braunschweig, Braunschweig, Germany
  - Microbial Proteomics, Helmholtzzentrum für Infektionsforschung GmbH, Braunschweig, Germany

9
Zhong A, Jiang X, Hickman AB, Klier K, Teodoro GIC, Dyda F, Laub MT, Storz G. Toxic antiphage defense proteins inhibited by intragenic antitoxin proteins. Proc Natl Acad Sci U S A 2023; 120:e2307382120. [PMID: 37487082; PMCID: PMC10400941; DOI: 10.1073/pnas.2307382120] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/02/2023] [Accepted: 06/21/2023] [Indexed: 07/26/2023] Open
Abstract
Recombination-promoting nuclease (Rpn) proteins are broadly distributed across bacterial phyla, yet their functions remain unclear. Here, we report that these proteins are toxin-antitoxin systems, composed of genes-within-genes, that combat phage infection. We show that the small, highly variable Rpn C-terminal domains (RpnS), which are translated separately from the full-length proteins (RpnL), directly block the activities of the toxic RpnL. The crystal structure of RpnAS revealed a dimerization interface encompassing an α helix that can have four amino acid repeats whose number varies widely among strains of the same species. Consistent with strong selection for the variation, we document that plasmid-encoded RpnP2L protects Escherichia coli against certain phages. We propose that many more intragenic-encoded proteins that serve regulatory roles remain to be discovered in all organisms.
Affiliation(s)
- Aoshu Zhong
  - Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892
- Xiaofang Jiang
  - Intramural Research Program, National Library of Medicine, NIH, Bethesda, MD 20894
- Alison B. Hickman
  - Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD 20892
- Katherine Klier
  - Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892
- Fred Dyda
  - Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD 20892
- Michael T. Laub
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139
  - HHMI, Massachusetts Institute of Technology, Cambridge, MA 02139
- Gisela Storz
  - Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892

10
Chlebek JL, Leonard SP, Kang-Yun C, Yung MC, Ricci DP, Jiao Y, Park DM. Prolonging genetic circuit stability through adaptive evolution of overlapping genes. Nucleic Acids Res 2023; 51:7094-7108. [PMID: 37260076; PMCID: PMC10359631; DOI: 10.1093/nar/gkad484] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/01/2023] [Revised: 05/12/2023] [Accepted: 05/23/2023] [Indexed: 06/02/2023] Open
Abstract
The development of synthetic biological circuits that maintain functionality over application-relevant time scales remains a significant challenge. Here, we employed synthetic overlapping sequences in which one gene is encoded or 'entangled' entirely within an alternative reading frame of another gene. In this design, the toxin-encoding relE was entangled within ilvA, which encodes threonine deaminase, an enzyme essential for isoleucine biosynthesis. A functional entanglement construct was obtained upon modification of the ribosome-binding site of the internal relE gene. Using this optimized design, we found that the selection pressure to maintain functional IlvA stabilized the production of burdensome RelE for >130 generations, which compares favorably with the most stable kill-switch circuits developed to date. This stabilizing effect was achieved through a complete alteration of the allowable landscape of mutations such that mutations inactivating the entangled genes were disfavored. Instead, the majority of lineages accumulated mutations within the regulatory region of ilvA. By reducing baseline relE expression, these more 'benign' mutations lowered circuit burden, which suppressed the accumulation of relE-inactivating mutations, thereby prolonging kill-switch function. Overall, this work demonstrates the utility of sequence entanglement paired with an adaptive laboratory evolution campaign to increase the evolutionary stability of burdensome synthetic circuits.
Affiliation(s)
- Jennifer L Chlebek
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA 94550, USA
- Sean P Leonard
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA 94550, USA
- Christina Kang-Yun
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA 94550, USA
- Mimi C Yung
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA 94550, USA
- Dante P Ricci
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA 94550, USA
- Yongqin Jiao
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA 94550, USA
- Dan M Park
  - Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA 94550, USA

11
|
Kienzle L, Bettinazzi S, Choquette T, Brunet M, Khorami HH, Jacques JF, Moreau M, Roucou X, Landry CR, Angers A, Breton S. A small protein coded within the mitochondrial canonical gene nd4 regulates mitochondrial bioenergetics. BMC Biol 2023; 21:111. [PMID: 37198654 DOI: 10.1186/s12915-023-01609-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/17/2022] [Accepted: 05/03/2023] [Indexed: 05/19/2023] Open
Abstract
BACKGROUND Mitochondria have a central role in cellular functions, aging, and in certain diseases. They possess their own genome, a vestige of their bacterial ancestor. Over the course of evolution, most of the genes of the ancestor have been lost or transferred to the nucleus. In humans, the mtDNA is a very small circular molecule with a functional repertoire limited to only 37 genes. Its extremely compact nature with genes arranged one after the other and separated by short non-coding regions suggests that there is little room for evolutionary novelties. This is radically different from bacterial genomes, which are also circular but much larger, and in which we can find genes inside other genes. These sequences, different from the reference coding sequences, are called alternative open reading frames or altORFs, and they are involved in key biological functions. However, whether altORFs exist in mitochondrial protein-coding genes or elsewhere in the human mitogenome has not been fully addressed. RESULTS We found a downstream alternative ATG initiation codon in the +3 reading frame of the human mitochondrial nd4 gene. This newly characterized altORF encodes a 99-amino-acid-long polypeptide, MTALTND4, which is conserved in primates. Our custom antibody, but not the pre-immune serum, was able to immunoprecipitate MTALTND4 from HeLa cell lysates, confirming the existence of an endogenous MTALTND4 peptide. The protein is localized in mitochondria and cytoplasm and is also found in the plasma, and it impacts cell and mitochondrial physiology. CONCLUSIONS Many human mitochondrial translated ORFs might have so far gone unnoticed. By ignoring mtaltORFs, we have underestimated the coding potential of the mitogenome. Alternative mitochondrial peptides such as MTALTND4 may offer a new framework for the investigation of mitochondrial functions and diseases.
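As a rough illustration of what scanning an alternative reading frame involves, the sketch below looks for ATG-initiated ORFs at a chosen frame offset of a toy sequence. The function name `find_orfs`, the toy sequence, and the mapping of offset 2 to a "+3" frame are assumptions made for this example and are not taken from the study. The stop codon set defaults to the standard genetic code and is exposed as a parameter, because the vertebrate mitochondrial code uses a different set of stop codons.

```python
# Hypothetical helper: scan one reading frame of a nucleotide string for ORFs.
# Stop codons of the standard genetic code (an assumption; pass another set
# for organellar genetic codes).
STANDARD_STOPS = frozenset({"TAA", "TAG", "TGA"})


def find_orfs(seq, offset, stops=STANDARD_STOPS, start="ATG", min_codons=2):
    """Return (start, end_exclusive) index pairs of ORFs in the frame at `offset`.

    An ORF begins at the first start codon seen after the previous stop and
    ends with the next in-frame stop codon (the stop is included in the span).
    """
    seq = seq.upper()
    orfs = []
    orf_start = None
    for i in range(offset, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if orf_start is None and codon == start:
            orf_start = i
        elif orf_start is not None and codon in stops:
            # Count codons before the stop codon.
            if (i - orf_start) // 3 >= min_codons:
                orfs.append((orf_start, i + 3))
            orf_start = None
    return orfs


# Toy sequence (not real nd4 sequence): a reference ORF at offset 0 and a
# shorter ORF nested inside it at offset 2.
TOY = "ATGCAATGGCACTGTAACTAA"

print(find_orfs(TOY, 0))  # [(0, 21)]  reference-frame ORF
print(find_orfs(TOY, 1))  # []         no ATG in this frame
print(find_orfs(TOY, 2))  # [(5, 17)]  nested alternative-frame ORF
```

A sequence-only scan like this yields candidates; the abstract's evidence for MTALTND4 rests on conservation and immunoprecipitation, which no such scan can replace.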
Affiliation(s)
- Laura Kienzle
  - Département de sciences biologiques, Université de Montréal, Montréal, Canada
- Stefano Bettinazzi
  - Département de sciences biologiques, Université de Montréal, Montréal, Canada
- Thierry Choquette
  - Département de sciences biologiques, Université de Montréal, Montréal, Canada
- Marie Brunet
  - Service de génétique médicale, Département de pédiatrie, Université de Sherbrooke, Sherbrooke, Canada
  - Centre de recherche du Centre hospitalier universitaire de Sherbrooke (CRCHUS), Sherbrooke, Canada
- Jean-François Jacques
  - Département de biochimie et génomique fonctionnelle, Université de Sherbrooke, Sherbrooke, Canada
- Mathilde Moreau
  - Département de biochimie et génomique fonctionnelle, Université de Sherbrooke, Sherbrooke, Canada
- Xavier Roucou
  - Centre de recherche du Centre hospitalier universitaire de Sherbrooke (CRCHUS), Sherbrooke, Canada
  - Département de biochimie et génomique fonctionnelle, Université de Sherbrooke, Sherbrooke, Canada
- Christian R Landry
  - Département de biochimie, de microbiologie et de bio-informatique, Faculté des sciences et de génie, Université Laval, Québec, Canada
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, Canada
  - Centre de recherche sur les données massives, Université Laval, Québec, Canada
  - Département de biologie, Faculté des sciences et de génie, Université Laval, Québec, Canada
- Annie Angers
  - Département de sciences biologiques, Université de Montréal, Montréal, Canada
- Sophie Breton
  - Département de sciences biologiques, Université de Montréal, Montréal, Canada
|
12
|
Zhong A, Jiang X, Hickman AB, Klier K, Teodoro GIC, Dyda F, Laub MT, Storz G. Toxic anti-phage defense proteins inhibited by intragenic antitoxin proteins. bioRxiv 2023:2023.05.02.539157. [PMID: 37425788 PMCID: PMC10327210 DOI: 10.1101/2023.05.02.539157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
Recombination-promoting nuclease (Rpn) proteins are broadly distributed across bacterial phyla, yet their functions remain unclear. Here we report these proteins are new toxin-antitoxin systems, comprised of genes-within-genes, that combat phage infection. We show the small, highly variable Rpn C-terminal domains (RpnS), which are translated separately from the full-length proteins (RpnL), directly block the activities of the toxic full-length proteins. The crystal structure of RpnAS revealed a dimerization interface encompassing a helix that can have four amino acid repeats whose number varies widely among strains of the same species. Consistent with strong selection for the variation, we document plasmid-encoded RpnP2L protects Escherichia coli against certain phages. We propose many more intragenic-encoded proteins that serve regulatory roles remain to be discovered in all organisms.
Significance: Here we document the function of small genes-within-genes, showing they encode antitoxin proteins that block the functions of the toxic DNA endonuclease proteins encoded by the longer rpn genes. Intriguingly, a sequence present in both long and short protein shows extensive variation in the number of four amino acid repeats. Consistent with a strong selection for the variation, we provide evidence that the Rpn proteins represent a phage defense system.
|
13
|
Smith C, Canestrari JG, Wang AJ, Champion MM, Derbyshire KM, Gray TA, Wade JT. Pervasive translation in Mycobacterium tuberculosis. eLife 2022; 11:e73980. [PMID: 35343439 PMCID: PMC9094748 DOI: 10.7554/elife.73980] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 09/17/2021] [Accepted: 03/25/2022] [Indexed: 11/13/2022] Open
Abstract
Most bacterial ORFs are identified by automated prediction algorithms. However, these algorithms often fail to identify ORFs lacking canonical features such as a length of >50 codons or the presence of an upstream Shine-Dalgarno sequence. Here, we use ribosome profiling approaches to identify actively translated ORFs in Mycobacterium tuberculosis. Most of the ORFs we identify have not been previously described, indicating that the M. tuberculosis transcriptome is pervasively translated. The newly described ORFs are predominantly short, with many encoding proteins of ≤50 amino acids. Codon usage of the newly discovered ORFs suggests that most have not been subject to purifying selection, and hence are unlikely to contribute to cell fitness. Nevertheless, we identify 90 new ORFs (median length of 52 codons) that bear the hallmarks of purifying selection. Thus, our data suggest that pervasive translation of short ORFs in Mycobacterium tuberculosis serves as a rich source for the evolution of new functional proteins.
Affiliation(s)
- Carol Smith
  - Wadsworth Center, Division of Genetics, New York State Department of Health, Albany, United States
- Jill G Canestrari
  - Wadsworth Center, Division of Genetics, New York State Department of Health, Albany, United States
- Archer J Wang
  - Wadsworth Center, Division of Genetics, New York State Department of Health, Albany, United States
- Matthew M Champion
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, United States
- Keith M Derbyshire
  - Wadsworth Center, Division of Genetics, New York State Department of Health, Albany, United States
  - Department of Biomedical Sciences, School of Public Health, University at Albany, New York, United States
- Todd A Gray
  - Wadsworth Center, Division of Genetics, New York State Department of Health, Albany, United States
  - Department of Biomedical Sciences, School of Public Health, University at Albany, New York, United States
- Joseph T Wade
  - Wadsworth Center, Division of Genetics, New York State Department of Health, Albany, United States
  - Department of Biomedical Sciences, School of Public Health, University at Albany, New York, United States
|
14
|
Gelhausen R, Müller T, Svensson SL, Alkhnbashi OS, Sharma CM, Eggenhofer F, Backofen R. RiboReport - benchmarking tools for ribosome profiling-based identification of open reading frames in bacteria. Brief Bioinform 2022; 23:bbab549. [PMID: 35037022 PMCID: PMC8921622 DOI: 10.1093/bib/bbab549] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/25/2021] [Revised: 11/22/2021] [Accepted: 11/29/2021] [Indexed: 11/19/2022] Open
Abstract
Small proteins encoded by short open reading frames (ORFs) with 50 codons or fewer are emerging as an important class of cellular macromolecules in diverse organisms. However, they often evade detection by proteomics or in silico methods. Ribosome profiling (Ribo-seq) has revealed widespread translation in genomic regions previously thought to be non-coding, driving the development of ORF detection tools using Ribo-seq data. However, only a handful of tools have been designed for bacteria, and these have not yet been systematically compared. Here, we aimed to identify tools that use Ribo-seq data to correctly determine the translational status of annotated bacterial ORFs and also discover novel translated regions with high sensitivity. To this end, we generated a large set of annotated ORFs from four diverse bacterial organisms, manually labeled for their translation status based on Ribo-seq data, which are available for future benchmarking studies. This set was used to investigate the predictive performance of seven Ribo-seq-based ORF detection tools (REPARATION_blast, DeepRibo, Ribo-TISH, PRICE, smORFer, ribotricer and SPECtre), as well as IRSOM, which uses coding potential and RNA-seq coverage only. DeepRibo and REPARATION_blast robustly predicted translated ORFs, including sORFs, with no significant difference for ORFs in close proximity to other genes versus stand-alone genes. However, no tool predicted a set of novel, experimentally verified sORFs with high sensitivity. Start codon predictions with smORFer show the value of initiation site profiling data to further improve the sensitivity of ORF prediction tools in bacteria. Overall, we find that bacterial tools perform well for sORF detection, although there is potential for improving their performance, applicability, usability and reproducibility.
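To make the benchmarking idea concrete, the following sketch scores one tool's predictions against a manually labeled ORF set and reports sensitivity, specificity, precision, and F1. The function name `benchmark`, the ORF identifiers, and the toy labels are hypothetical, and matching predictions to labels by exact identifier is a simplifying assumption; the abstract does not state how the study matched predicted and annotated ORFs.

```python
# Hypothetical scoring of ORF predictions against manually curated labels.
import math


def benchmark(labels, predicted):
    """Compare a set of predicted ORF ids against {orf_id: is_translated} labels."""
    tp = fp = tn = fn = 0
    for orf_id, is_translated in labels.items():
        called = orf_id in predicted
        if is_translated and called:
            tp += 1
        elif is_translated:
            fn += 1
        elif called:
            fp += 1
        else:
            tn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),  # translated ORFs that were found
        "specificity": ratio(tn, tn + fp),  # untranslated ORFs left uncalled
        "precision": ratio(tp, tp + fp),    # calls that were truly translated
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Toy labels: three translated ORFs, two untranslated ones.
LABELS = {"orfA": True, "orfB": True, "orfC": False, "orfD": True, "orfE": False}
PREDICTED = {"orfA", "orfC", "orfD"}

for name, value in benchmark(LABELS, PREDICTED).items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity separately for annotated ORFs and for novel, experimentally verified sORFs would mirror the distinction the abstract draws between the two evaluations.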
Affiliation(s)
- Rick Gelhausen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110, Freiburg, Germany
- Teresa Müller
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110, Freiburg, Germany
- Sarah L Svensson
  - Department of Molecular Infection Biology II, Institute of Molecular Infection Biology (IMIB), University of Würzburg, Josef-Schneider-Str. 2 / D15, 97080, Würzburg, Germany
- Omer S Alkhnbashi
  - Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Saudi Arabia
  - SDAIA-KFUPM Joint Research Center for Artificial Intelligence (JRC-AI), King Fahd University of Petroleum and Minerals, Saudi Arabia
- Cynthia M Sharma
  - Department of Molecular Infection Biology II, Institute of Molecular Infection Biology (IMIB), University of Würzburg, Josef-Schneider-Str. 2 / D15, 97080, Würzburg, Germany
- Florian Eggenhofer
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110, Freiburg, Germany
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110, Freiburg, Germany
  - Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Schänzlestr. 18, 79104, Freiburg, Germany
|
15
|
Abstract
Modern genome-scale methods that identify new genes, such as proteogenomics and ribosome profiling, have revealed, to the surprise of many, that overlap in genes, open reading frames and even coding sequences is widespread and functionally integrated into prokaryotic, eukaryotic and viral genomes. In parallel, the constraints that overlapping regions place on genome sequences and their evolution can be harnessed in bioengineering to build more robust synthetic strains and constructs. With a focus on overlapping protein-coding and RNA-coding genes, this Review examines their discovery, topology and biogenesis in the context of their genome biology. We highlight exciting new uses for sequence overlap to control translation, compress synthetic genetic constructs, and protect against mutation.
|
16
|
Wichmann S, Scherer S, Ardern Z. Biological factors in the synthetic construction of overlapping genes. BMC Genomics 2021; 22:888. [PMID: 34895142 PMCID: PMC8665328 DOI: 10.1186/s12864-021-08181-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2020] [Accepted: 11/17/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Overlapping genes (OLGs) with long protein-coding overlapping sequences are disallowed by standard genome annotation programs, outside of viruses. Recently however they have been discovered in Archaea, diverse Bacteria, and Mammals. The biological factors underlying life's ability to create overlapping genes require more study, and may have important applications in understanding evolution and in biotechnology. A previous study claimed that protein domains from viruses were much better suited to forming overlaps than those from other cellular organisms - in this study we assessed this claim, in order to discover what might underlie taxonomic differences in the creation of gene overlaps. RESULTS After overlapping arbitrary Pfam domain pairs and evaluating them with Hidden Markov Models we find OLG construction to be much less constrained than expected. For instance, close to 10% of the constructed sequences cannot be distinguished from typical sequences in their protein family. Most are also indistinguishable from natural protein sequences regarding identity and secondary structure. Surprisingly, contrary to a previous study, virus domains were much less suitable for designing OLGs than bacterial or eukaryotic domains were. In general, the amount of amino acid change required to force a domain to overlap is approximately equal to the variation observed within a typical domain family. The resulting high similarity between natural sequences and those altered so as to overlap is mostly due to the combination of high redundancy in the genetic code and the evolutionary exchangeability of many amino acids. CONCLUSIONS Synthetic overlapping genes which closely resemble natural gene sequences, as measured by HMM profiles, are remarkably easy to construct, and most arbitrary domain pairs can be altered so as to overlap while retaining high similarity to the original sequences. 
Future work however will need to assess important factors not considered such as intragenic interactions which affect protein folding. While the analysis here is not sufficient to guarantee functional folding proteins, further analysis of constructed OLGs will improve our understanding of the origin of these remarkable genetic elements across life and opens up exciting possibilities for synthetic biology.
Affiliation(s)
- Stefan Wichmann
  - Chair of Microbial Ecology, Department of Molecular Life Sciences, Technical University of Munich, Freising, Germany
- Siegfried Scherer
  - Chair of Microbial Ecology, Department of Molecular Life Sciences, Technical University of Munich, Freising, Germany
- Zachary Ardern
  - Chair of Microbial Ecology, Department of Molecular Life Sciences, Technical University of Munich, Freising, Germany
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
|
17
|
Nelson CW, Ardern Z, Wei X. OLGenie: Estimating Natural Selection to Predict Functional Overlapping Genes. Mol Biol Evol 2020; 37:2440-2449. [PMID: 32243542 PMCID: PMC7531306 DOI: 10.1093/molbev/msaa087] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/20/2022] Open
Abstract
Purifying (negative) natural selection is a hallmark of functional biological sequences, and can be detected in protein-coding genes using the ratio of nonsynonymous to synonymous substitutions per site (dN/dS). However, when two genes overlap the same nucleotide sites in different frames, synonymous changes in one gene may be nonsynonymous in the other, perturbing dN/dS. Thus, scalable methods are needed to estimate functional constraint specifically for overlapping genes (OLGs). We propose OLGenie, which implements a modification of the Wei–Zhang method. Assessment with simulations and controls from viral genomes (58 OLGs and 176 non-OLGs) demonstrates low false-positive rates and good discriminatory ability in differentiating true OLGs from non-OLGs. We also apply OLGenie to the unresolved case of HIV-1’s putative antisense protein gene, showing significant purifying selection. OLGenie can be used to study known OLGs and to predict new OLGs in genome annotation. Software and example data are freely available at https://github.com/chasewnelson/OLGenie (last accessed April 10, 2020).
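The core difficulty described above, that one nucleotide change can be synonymous in one frame and nonsynonymous in an overlapping frame, can be shown with a small sketch. This is not the OLGenie or Wei–Zhang method; it only classifies a single substitution in each same-strand frame under the standard genetic code. The function name `effect_in_frame` and the toy sequence are assumptions for illustration, and opposite-strand overlaps are not handled.

```python
# Illustrative only: classify one substitution in different reading frames.
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def effect_in_frame(seq, pos, new_base, offset):
    """Classify substituting seq[pos] with new_base in the frame starting at offset."""
    codon_start = pos - ((pos - offset) % 3)
    if codon_start < 0 or codon_start + 3 > len(seq):
        raise ValueError("position is not covered by a complete codon in this frame")
    old_codon = seq[codon_start:codon_start + 3]
    idx = pos - codon_start
    new_codon = old_codon[:idx] + new_base + old_codon[idx + 1:]
    same = CODON_TABLE[old_codon] == CODON_TABLE[new_codon]
    return "synonymous" if same else "nonsynonymous"


# Toy sequence: frame 0 reads ATG CTG GAA; frame 1 reads TGC TGG ...
TOY = "ATGCTGGAA"

# G->C at index 5: CTG->CTC (Leu->Leu) in frame 0, TGG->TCG (Trp->Ser) in frame 1.
print(effect_in_frame(TOY, 5, "C", 0))  # synonymous
print(effect_in_frame(TOY, 5, "C", 1))  # nonsynonymous
```

Because the "silent" change in frame 0 alters the protein in frame 1, naive synonymous-site counts in either gene are biased, which is the motivation the abstract gives for a dedicated overlapping-gene estimator.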
Affiliation(s)
- Chase W Nelson
  - Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY
  - Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Zachary Ardern
  - Microbial Ecology, ZIEL-Institute for Food & Health, Technische Universität München, Freising, Germany
- Xinzhu Wei
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI
  - Department of Integrative Biology and Statistics, University of California, Berkeley, CA
|
18
|
McBride TM, Schwartz EA, Kumar A, Taylor DW, Fineran PC, Fagerlund RD. Diverse CRISPR-Cas Complexes Require Independent Translation of Small and Large Subunits from a Single Gene. Mol Cell 2020; 80:971-979.e7. [PMID: 33248026 DOI: 10.1016/j.molcel.2020.11.003] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 08/17/2020] [Revised: 10/22/2020] [Accepted: 10/29/2020] [Indexed: 12/26/2022]
Abstract
CRISPR-Cas adaptive immune systems provide prokaryotes with defense against viruses by degradation of specific invading nucleic acids. Despite advances in the biotechnological exploitation of select systems, multiple CRISPR-Cas types remain uncharacterized. Here, we investigated the previously uncharacterized type I-D interference complex and revealed that it is a genetic and structural hybrid with similarity to both type I and type III systems. Surprisingly, formation of the functional complex required internal in-frame translation of small subunits from within the large subunit gene. We further show that internal translation to generate small subunits is widespread across diverse type I-D, I-B, and I-C systems, which account for roughly one quarter of CRISPR-Cas systems. Our work reveals the unexpected expansion of protein coding potential from within single cas genes, which has important implications for understanding CRISPR-Cas function and evolution.
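Internal in-frame translation means that a shorter protein starts at a start codon inside the larger gene, in the same frame, and shares its C-terminus. A minimal sketch of listing such candidate internal starts is given below; the function name `internal_inframe_starts`, the toy coding sequence, and the choice of ATG, GTG and TTG as candidate start codons are assumptions for illustration, not the procedure used in the study.

```python
# Hypothetical helper: list candidate internal in-frame start codons of a CDS.
def internal_inframe_starts(cds, starts=("ATG", "GTG", "TTG")):
    """Return (nucleotide_index, product_length_aa) for each internal in-frame start.

    The CDS is assumed to begin with its annotated start codon and to end with
    a stop codon; both are excluded from the candidates. Product length counts
    the start codon's residue and excludes the stop codon.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    n_residues = len(codons) - 1  # full-length product, stop codon excluded
    return [
        (3 * k, n_residues - k)
        for k, codon in enumerate(codons[1:-1], start=1)
        if codon in starts
    ]


# Toy CDS: ATG AAA GTG CCC ATG GGG TTT TAA (7 residues plus stop).
TOY_CDS = "ATGAAAGTGCCCATGGGGTTTTAA"

for index, length in internal_inframe_starts(TOY_CDS):
    print(f"internal start at nt {index}: {length}-residue C-terminal product")
```

Most in-frame start codons are never used for initiation, so a list like this only narrows the search; deciding which sites are real requires experimental evidence of the kind the abstract reports.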
Affiliation(s)
- Tess M McBride
  - Department of Microbiology and Immunology, University of Otago, PO Box 56, Dunedin 9054, New Zealand
- Evan A Schwartz
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712-1597, USA
  - Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712-1597, USA
- Abhishek Kumar
  - Centre for Protein Research, University of Otago, PO Box 56, Dunedin 9054, New Zealand
- David W Taylor
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712-1597, USA
  - Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712-1597, USA
  - Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712-1597, USA
  - LIVESTRONG Cancer Institutes, Dell Medical School, Austin, TX 78712-1597, USA
- Peter C Fineran
  - Department of Microbiology and Immunology, University of Otago, PO Box 56, Dunedin 9054, New Zealand
  - Bio-Protection Research Centre, University of Otago, PO Box 56, Dunedin 9054, New Zealand
  - Genetics Otago, University of Otago, Dunedin, New Zealand
- Robert D Fagerlund
  - Department of Microbiology and Immunology, University of Otago, PO Box 56, Dunedin 9054, New Zealand
  - Genetics Otago, University of Otago, Dunedin, New Zealand
|
19
|
Orr MW, Mao Y, Storz G, Qian SB. Alternative ORFs and small ORFs: shedding light on the dark proteome. Nucleic Acids Res 2020; 48:1029-1042. [PMID: 31504789 DOI: 10.1093/nar/gkz734] [Citation(s) in RCA: 183] [Impact Index Per Article: 36.6] [Received: 07/04/2019] [Revised: 08/03/2019] [Accepted: 08/15/2019] [Indexed: 02/06/2023] Open
Abstract
Traditional annotation of protein-encoding genes relied on assumptions, such as one open reading frame (ORF) encodes one protein and minimal lengths for translated proteins. With the serendipitous discoveries of translated ORFs encoded upstream and downstream of annotated ORFs, from alternative start sites nested within annotated ORFs and from RNAs previously considered noncoding, it is becoming clear that these initial assumptions are incorrect. The findings have led to the realization that genetic information is more densely coded and that the proteome is more complex than previously anticipated. As such, interest in the identification and characterization of the previously ignored 'dark proteome' is increasing, though we note that research in eukaryotes and bacteria has largely progressed in isolation. To bridge this gap and illustrate exciting findings emerging from studies of the dark proteome, we highlight recent advances in both eukaryotic and bacterial cells. We discuss progress in the detection of alternative ORFs as well as in the understanding of functions and the regulation of their expression and posit questions for future work.
Affiliation(s)
- Mona Wu Orr
  - Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892, USA
- Yuanhui Mao
  - Division of Nutritional Sciences, Cornell University, Ithaca, NY 14853, USA
- Gisela Storz
  - Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892, USA
- Shu-Bing Qian
  - Division of Nutritional Sciences, Cornell University, Ithaca, NY 14853, USA
|
20
|
D'Agostino PM, Al-Sinawi B, Mazmouz R, Muenchhoff J, Neilan BA, Moffitt MC. Identification of promoter elements in the Dolichospermum circinale AWQC131C saxitoxin gene cluster and the experimental analysis of their use for heterologous expression. BMC Microbiol 2020; 20:35. [PMID: 32070286 PMCID: PMC7027233 DOI: 10.1186/s12866-020-1720-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/23/2019] [Accepted: 02/03/2020] [Indexed: 01/06/2023] Open
Abstract
Background Dolichospermum circinale is a filamentous bloom-forming cyanobacterium responsible for biosynthesis of the paralytic shellfish toxins (PST), including saxitoxin. PSTs are neurotoxins and in their purified form are important analytical standards for monitoring the quality of water and seafood and biomedical research tools for studying neuronal sodium channels. More recently, PSTs have been recognised for their utility as local anaesthetics. Characterisation of the transcriptional elements within the saxitoxin (sxt) biosynthetic gene cluster (BGC) is a first step towards accessing these molecules for biotechnology. Results In D. circinale AWQC131C the sxt BGC is transcribed from two bidirectional promoter regions encoding five individual promoters. These promoters were identified experimentally using 5′ RACE and their activity assessed via coupling to a lux reporter system in E. coli and Synechocystis sp. PCC 6803. Transcription of the predicted drug/metabolite transporter (DMT) encoded by sxtPER was found to initiate from two promoters, PsxtPER1 and PsxtPER2. In E. coli, strong expression of lux from PsxtP, PsxtD and PsxtPER1 was observed while expression from Porf24 and PsxtPER2 was remarkably weaker. In contrast, heterologous expression in Synechocystis sp. PCC 6803 showed that expression of lux from PsxtP, PsxtPER1, and Porf24 promoters was statistically higher compared to the non-promoter control, while PsxtD showed poor activity under the described conditions. Conclusions Both of the heterologous hosts investigated in this study exhibited high expression levels from three of the five sxt promoters. These results indicate that the majority of the native sxt promoters appear active in different heterologous hosts, simplifying initial cloning efforts. Therefore, heterologous expression of the sxt BGC in either E. coli or Synechocystis could be a viable first option for producing PSTs for industrial or biomedical purposes.
Affiliation(s)
- Paul M D'Agostino
  - School of Science, Western Sydney University, Sydney, NSW, Australia
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
  - Biosystems Chemistry, Department of Chemistry, Technische Universität München, Garching, Germany
  - Technical Biochemistry, Faculty of Chemistry and Food Chemistry, Technische Universität Dresden, Dresden, Germany
- Bakir Al-Sinawi
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
- Rabia Mazmouz
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
- Julia Muenchhoff
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
  - Centre for Healthy Brain Ageing, School of Psychiatry, University of New South Wales, Sydney, Australia
- Brett A Neilan
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
  - School of Environmental and Life Sciences, University of Newcastle, Callaghan, Australia
|
21
|
Clauwaert J, Menschaert G, Waegeman W. DeepRibo: a neural network for precise gene annotation of prokaryotes by combining ribosome profiling signal and binding site patterns. Nucleic Acids Res 2019; 47:e36. [PMID: 30753697 PMCID: PMC6451124 DOI: 10.1093/nar/gkz061] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Received: 09/26/2018] [Revised: 01/02/2019] [Accepted: 01/30/2019] [Indexed: 12/13/2022] Open
Abstract
Annotation of gene expression in prokaryotes often finds itself corrected due to small variations of the annotated gene regions observed between different (sub)-species. It has become apparent that traditional sequence alignment algorithms, used for the curation of genomes, are not able to map the full complexity of the genomic landscape. We present DeepRibo, a novel neural network utilizing features extracted from ribosome profiling information and binding site sequence patterns that shows to be a precise tool for the delineation and annotation of expressed genes in prokaryotes. The neural network combines recurrent memory cells and convolutional layers, adapting the information gained from both the high-throughput ribosome profiling data and ribosome binding translation initiation sequence region into one model. DeepRibo is designed as a single model trained on a variety of ribosome profiling experiments, used for the identification of open reading frames in prokaryotes without a priori knowledge of the translational landscape. Through extensive validation of the model trained on various sets of data, multiple species sequence similarity, mass spectrometry and Edman degradation verified proteins, the effectiveness of DeepRibo is highlighted.
Affiliation(s)
- Jim Clauwaert
  - KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000 Gent, Belgium
- Gerben Menschaert
  - Biobix, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000 Gent, Belgium
- Willem Waegeman
  - KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000 Gent, Belgium
|
22
|
Georg J, Hess WR. Widespread Antisense Transcription in Prokaryotes. Microbiol Spectr 2018; 6:10.1128/microbiolspec.rwr-0029-2018. [PMID: 30003872 PMCID: PMC11633618 DOI: 10.1128/microbiolspec.rwr-0029-2018] [Citation(s) in RCA: 60] [Impact Index Per Article: 8.6] [Received: 03/05/2018] [Indexed: 12/15/2022] Open
Abstract
Although bacterial genomes are usually densely protein-coding, genome-wide mapping approaches of transcriptional start sites revealed that a significant fraction of the identified promoters drive the transcription of noncoding RNAs. These can be trans-acting RNAs, mainly originating from intergenic regions and, in many studied examples, possessing regulatory functions. However, a significant fraction of these noncoding RNAs consist of natural antisense transcripts (asRNAs), which overlap other transcriptional units. Naturally occurring asRNAs were first observed to play a role in bacterial plasmid replication and in bacteriophage λ more than 30 years ago. Today's view is that asRNAs abound in all three domains of life. There are several examples of asRNAs in bacteria with clearly defined functions. Nevertheless, many asRNAs appear to result from pervasive initiation of transcription, and some data point toward global functions of such widespread transcriptional activity, explaining why the search for a specific regulatory role is sometimes futile. In this review, we give an overview about the occurrence of antisense transcription in bacteria, highlight particular examples of functionally characterized asRNAs, and discuss recent evidence pointing at global relevance in RNA processing and transcription-coupled DNA repair.
MESH Headings
- Bacteria/genetics
- Bacterial Proteins/genetics
- Bacterial Proteins/metabolism
- DNA Repair/physiology
- Evolution, Molecular
- Gene Expression Regulation, Bacterial
- Genome, Bacterial
- Plasmids
- RNA, Antisense/genetics
- RNA, Antisense/physiology
- RNA, Bacterial/genetics
- RNA, Bacterial/physiology
- RNA, Untranslated
- Transcription, Genetic/genetics
- Transcription, Genetic/physiology
Affiliation(s)
- Jens Georg
  - University of Freiburg, Faculty of Biology, Institute of Biology III, Genetics and Experimental Bioinformatics, D-79104 Freiburg, Germany
- Wolfgang R Hess
  - University of Freiburg, Faculty of Biology, Institute of Biology III, Genetics and Experimental Bioinformatics, D-79104 Freiburg, Germany
|