1
Karampatakis T, Tsergouli K, Behzadi P. Carbapenem-Resistant Pseudomonas aeruginosa's Resistome: Pan-Genomic Plasticity, the Impact of Transposable Elements and Jumping Genes. Antibiotics (Basel) 2025; 14:353. [PMID: 40298491 PMCID: PMC12024412 DOI: 10.3390/antibiotics14040353] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2025] [Revised: 03/23/2025] [Accepted: 03/26/2025] [Indexed: 04/30/2025]
Abstract
Pseudomonas aeruginosa, a Gram-negative, motile bacterium, may cause significant infections in both community and hospital settings, leading to substantial morbidity and mortality. This opportunistic pathogen can thrive in various environments, making it a public health concern worldwide. P. aeruginosa's genomic pool is highly dynamic and diverse, with a pan-genome size ranging from 5.5 to 7.76 Mbp. This versatility arises from its ability to acquire genes through horizontal gene transfer (HGT) via different genetic elements (GEs), such as mobile genetic elements (MGEs). These MGEs, collectively known as the mobilome, facilitate the spread of genes encoding resistance to antimicrobials (ARGs), resistance to heavy metals (HMRGs), virulence (VGs), and metabolic functions (MGs). Of particular concern are the acquired carbapenemase genes (ACGs) and other β-lactamase genes, such as classes A, B [metallo-β-lactamases (MBLs)], and D carbapenemases, which can lead to increased antimicrobial resistance. This review emphasizes the importance of the mobilome in understanding antimicrobial resistance in P. aeruginosa.
Affiliation(s)
- Theodoros Karampatakis
- Department of Clinical Microbiology, University Hospital Kerry, V92 NX94 Tralee, Ireland
- Katerina Tsergouli
- Department of Clinical Microbiology, University Hospital Kerry, V92 NX94 Tralee, Ireland
- Payam Behzadi
- Department of Microbiology, Shahr-e-Qods Branch, Islamic Azad University, Tehran 37541-374, Iran
2
Heieck K, Brück T. Localization of Insertion Sequences in Plasmids for L-Cysteine Production in E. coli. Genes (Basel) 2023; 14:1317. [PMID: 37510222 PMCID: PMC10379815 DOI: 10.3390/genes14071317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 06/19/2023] [Accepted: 06/21/2023] [Indexed: 07/30/2023]
Abstract
Insertion sequence elements (ISE) are often responsible for the collapse of production in synthetically engineered Escherichia coli. By transposition of ISE into the open reading frame of the synthetic pathway, E. coli cells gain a selective advantage over cells expressing the metabolically burdensome production genes. Here, we present the exact entry sites of insertion sequence (IS) families 3 and 5 within plasmids for L-cysteine production in evolved E. coli populations. Furthermore, we identified an 8-bp direct repeat of IS5, which is atypical for this family and potentially indicates a new IS5 target site.
Affiliation(s)
- Kevin Heieck
- School of Natural Sciences, Technical University of Munich, Lichtenbergstraße 4, 85748 Garching, Germany
- Thomas Brück
- School of Natural Sciences, Technical University of Munich, Lichtenbergstraße 4, 85748 Garching, Germany
3
Fallon AM, Carroll EM. Virus-like Particles from Wolbachia-Infected Cells May Include a Gene Transfer Agent. Insects 2023; 14:516. [PMID: 37367332 DOI: 10.3390/insects14060516] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/27/2023] [Revised: 05/24/2023] [Accepted: 05/30/2023] [Indexed: 06/28/2023]
Abstract
Wolbachia are obligate intracellular bacteria that occur in insects and filarial worms. Strains that infect insects have genomes that encode mobile genetic elements, including diverse lambda-like prophages called Phage WO. Phage WO packages an approximately 65 kb viral genome that includes a unique eukaryotic association module, or EAM, that encodes unusually large proteins thought to mediate interactions between the bacterium, its virus, and the eukaryotic host cell. The Wolbachia supergroup B strain, wStri from the planthopper Laodelphax striatellus, produces phage-like particles that can be recovered from persistently infected mosquito cells by ultracentrifugation. Illumina sequencing, assembly, and manual curation of DNA from two independent preparations converged on an identical 15,638 bp sequence that encoded packaging, assembly, and structural proteins. The absence of an EAM and regulatory genes defined for Phage WO from the wasp, Nasonia vitripennis, was consistent with the possibility that the 15,638 bp sequence represents an element related to a gene transfer agent (GTA), characterized by a signature head-tail region encoding structural proteins that package host chromosomal DNA. Future investigation of GTA function will be supported by the improved recovery of physical particles, electron microscopic examination of potential diversity among particles, and rigorous examination of DNA content by methods independent of sequence assembly.
Affiliation(s)
- Ann M Fallon
- Department of Entomology, University of Minnesota, 1980 Folwell Ave., St. Paul, MN 55108, USA
- Elissa M Carroll
- Department of Entomology, University of Minnesota, 1980 Folwell Ave., St. Paul, MN 55108, USA
4
Gaydukova SA, Moldovan MA, Vallesi A, Heaphy SM, Atkins JF, Gelfand MS, Baranov PV. Nontriplet feature of genetic code in Euplotes ciliates is a result of neutral evolution. Proc Natl Acad Sci U S A 2023; 120:e2221683120. [PMID: 37216548 PMCID: PMC10235951 DOI: 10.1073/pnas.2221683120] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/21/2022] [Accepted: 04/12/2023] [Indexed: 05/24/2023]
Abstract
The triplet nature of the genetic code is considered a universal feature of known organisms. However, frequent stop codons at internal mRNA positions in Euplotes ciliates ultimately specify ribosomal frameshifting by one or two nucleotides depending on the context, thus posing a nontriplet feature of the genetic code of these organisms. Here, we sequenced transcriptomes of eight Euplotes species and assessed evolutionary patterns arising at frameshift sites. We show that frameshift sites are currently accumulating more rapidly by genetic drift than they are removed by weak selection. The time needed to reach the mutational equilibrium is several times longer than the age of Euplotes and is expected to occur after a several-fold increase in the frequency of frameshift sites. This suggests that Euplotes are at an early stage of the spread of frameshifting in expression of their genome. In addition, we find the net fitness burden of frameshift sites to be noncritical for the survival of Euplotes. Our results suggest that fundamental genome-wide changes such as a violation of the triplet character of genetic code can be introduced and maintained solely by neutral evolution.
Affiliation(s)
- Sofya A. Gaydukova
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow 199911, Russia
- Mikhail A. Moldovan
- A. A. Kharkevich Institute for Information Transmission Problems RAS, Moscow 127051, Russia
- Adriana Vallesi
- Laboratory of Eukaryotic Microbiology and Animal Biology, School of Biosciences and Veterinary Medicine, University of Camerino, Camerino 62032, Italy
- Stephen M. Heaphy
- School of Biochemistry and Cell Biology, University College Cork, Cork T12 XF62, Ireland
- John F. Atkins
- School of Biochemistry and Cell Biology, University College Cork, Cork T12 XF62, Ireland
- Department of Human Genetics, University of Utah, Salt Lake City, UT 84112
- Mikhail S. Gelfand
- A. A. Kharkevich Institute for Information Transmission Problems RAS, Moscow 127051, Russia
- Pavel V. Baranov
- School of Biochemistry and Cell Biology, University College Cork, Cork T12 XF62, Ireland
5
Antonov IV, O’Loughlin S, Gorohovski AN, O’Connor PB, Baranov PV, Atkins JF. Streptomyces rare codon UUA: from features associated with 2 adpA related locations to candidate phage regulatory translational bypassing. RNA Biol 2023; 20:926-942. [PMID: 37968863 PMCID: PMC10732093 DOI: 10.1080/15476286.2023.2270812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Accepted: 10/02/2023] [Indexed: 11/17/2023]
Abstract
In Streptomyces species, the cell cycle involves a switch from an early and vegetative state to a later phase where secondary products including antibiotics are synthesized, aerial hyphae form and sporulation occurs. AdpA, which has two domains, activates the expression of numerous genes involved in the switch from the vegetative growth phase. The adpA mRNA of many Streptomyces species has a UUA codon in a linker region between 5' sequence encoding one domain and 3' sequence encoding its other and C-terminal domain. UUA codons are exceptionally rare in Streptomyces, and its functional cognate tRNA is not present in a fully modified and acylated form, in the early and vegetative phase of the cell cycle though it is aminoacylated later. Here, we report candidate recoding signals that may influence decoding of the linker region UUA. Additionally, a short ORF 5' of the main ORF has been identified with a GUG at, or near, its 5' end and an in-frame UUA near its 3' end. The latter is commonly 5 nucleotides 5' of the main ORF start. Ribosome profiling data show translation of that 5' region. Ten years ago, UUA-mediated translational bypassing was proposed as a sensor by a Streptomyces phage of its host's cell cycle stage and an effector of its lytic/lysogeny switch. We provide the first experimental evidence supportive of this proposal.
Affiliation(s)
- Ivan V. Antonov
- Russian Academy of Science, Institute of Bioengineering, Research Center of Biotechnology, Moscow, Russia
- Laboratory of Bioinformatics, Faculty of Computer Science, National Research University Higher School of Economics, Moscow, Russia
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Sinéad O’Loughlin
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Alessandro N. Gorohovski
- Russian Academy of Science, Institute of Bioengineering, Research Center of Biotechnology, Moscow, Russia
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Pavel V. Baranov
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- John F. Atkins
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
6
Feng Y, Wang Z, Chien KY, Chen HL, Liang YH, Hua X, Chiu CH. "Pseudo-pseudogenes" in bacterial genomes: Proteogenomics reveals a wide but low protein expression of pseudogenes in Salmonella enterica. Nucleic Acids Res 2022; 50:5158-5170. [PMID: 35489061 PMCID: PMC9122581 DOI: 10.1093/nar/gkac302] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/07/2021] [Revised: 04/11/2022] [Accepted: 04/14/2022] [Indexed: 12/03/2022]
Abstract
Pseudogenes (genes disrupted by frameshifts or in-frame stop codons) are ubiquitous in bacterial genomes and are considered nonfunctional fossils. Here, we used RNA-seq and mass spectrometry to measure the transcriptomes and proteomes of Salmonella enterica serovars Paratyphi A and Typhi. All pseudogenes' mRNA sequences remained disrupted and were present at levels comparable to those of their intact homologs. At the protein level, however, 101 of 161 pseudogenes showed evidence of successful translation, with low expression regardless of growth conditions, genetic background, and cause of pseudogenization. The majority of frameshifting detected was compensatory for -1 frameshift mutations. Readthrough of in-frame stop codons primarily involved UAG, and cytosine was the most frequent base adjacent to the codon. Using a fluorescence reporter system, fifteen pseudogenes were confirmed to express successfully in vivo in Escherichia coli. Expression of the intact copy of the fifteen pseudogenes in S. Typhi affected bacterial pathogenesis, as revealed in human macrophage and epithelial cell infection models. These findings suggest the need to revisit nonstandard translation mechanisms as well as the biological role of pseudogenes in the bacterial genome.
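As an illustration of one pseudogenization cause named in this abstract (in-frame stop codons), the following is a minimal sketch that scans a coding sequence for stop codons located before its final codon and reports the base immediately 3' of each. It is not the authors' pipeline: the function name `internal_stops` is hypothetical, and reporting the 3' neighbor is an assumption, since the abstract does not say which side "adjacent" refers to.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # DNA spelling of UAA, UAG, UGA


def internal_stops(cds: str):
    """Return (offset, codon, next_base) for each in-frame stop codon
    that occurs before the final complete codon of the sequence.

    Hypothetical helper for illustration; reading frame 0 is assumed.
    """
    seq = cds.upper().replace("U", "T")  # accept RNA or DNA input
    # Start index of the last complete codon, which is excluded from the scan
    # because a stop codon there is the normal end of the gene.
    last_codon_start = len(seq) - len(seq) % 3 - 3
    hits = []
    for i in range(0, last_codon_start, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            # i + 3 is always a valid index here because the final codon follows.
            hits.append((i, codon, seq[i + 3]))
    return hits


if __name__ == "__main__":
    # One internal TAG followed by C, then the normal terminal TAA.
    print(internal_stops("ATGAAATAGCCCTAA"))
```

A sequence with no internal stop returns an empty list, so the helper can be used as a simple filter before any closer inspection of candidate readthrough sites.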
Affiliation(s)
- Ye Feng
- Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, People's Republic of China
- Institute of Translational Medicine, Zhejiang University School of Medicine, Hangzhou, People's Republic of China
- Zeyu Wang
- Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, People's Republic of China
- Institute of Translational Medicine, Zhejiang University School of Medicine, Hangzhou, People's Republic of China
- Kun-Yi Chien
- Graduate Institute of Biomedical Sciences, Chang Gung University College of Medicine, Taoyuan, Republic of China
- Hsiu-Ling Chen
- Molecular Infectious Disease Research Center, Chang Gung Memorial Hospital, Taoyuan, Republic of China
- Yi-Hua Liang
- Molecular Infectious Disease Research Center, Chang Gung Memorial Hospital, Taoyuan, Republic of China
- Xiaoting Hua
- Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, People's Republic of China
- Cheng-Hsun Chiu
- Graduate Institute of Biomedical Sciences, Chang Gung University College of Medicine, Taoyuan, Republic of China
- Molecular Infectious Disease Research Center, Chang Gung Memorial Hospital, Taoyuan, Republic of China
- Division of Pediatric Infectious Diseases, Department of Pediatrics, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taoyuan, Republic of China
7
De Lise F, Strazzulli A, Iacono R, Curci N, Di Fenza M, Maurelli L, Moracci M, Cobucci-Ponzano B. Programmed Deviations of Ribosomes From Standard Decoding in Archaea. Front Microbiol 2021; 12:688061. [PMID: 34149676 PMCID: PMC8211752 DOI: 10.3389/fmicb.2021.688061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2021] [Accepted: 05/04/2021] [Indexed: 11/13/2022]
Abstract
Genetic code decoding, initially considered to be universal and immutable, is now known to be flexible. In fact, in specific genes, ribosomes deviate from the standard translational rules in a programmed way, a phenomenon globally termed recoding. Translational recoding, which has been found in all domains of life, includes a group of events occurring during gene translation, namely stop codon readthrough, programmed ± 1 frameshifting, and ribosome bypassing. These events regulate protein expression at translational level and their mechanisms are well known and characterized in viruses, bacteria and eukaryotes. In this review we summarize the current state-of-the-art of recoding in the third domain of life. In Archaea, it was demonstrated and extensively studied that translational recoding regulates the decoding of the 21st and the 22nd amino acids selenocysteine and pyrrolysine, respectively, and only one case of programmed -1 frameshifting has been reported so far in Saccharolobus solfataricus P2. However, further putative events of translational recoding have been hypothesized in other archaeal species, but not extensively studied and confirmed yet. Although this phenomenon could have some implication for the physiology and adaptation of life in extreme environments, this field is still underexplored and genes whose expression could be regulated by recoding are still poorly characterized. The study of these recoding episodes in Archaea is urgently needed.
Affiliation(s)
- Federica De Lise
- Institute of Biosciences and BioResources - National Research Council of Italy, Naples, Italy
- Andrea Strazzulli
- Department of Biology, University of Naples Federico II, Complesso Universitario di Monte S. Angelo, Naples, Italy
- Task Force on Microbiome Studies, University of Naples Federico II, Naples, Italy
- Roberta Iacono
- Institute of Biosciences and BioResources - National Research Council of Italy, Naples, Italy
- Department of Biology, University of Naples Federico II, Complesso Universitario di Monte S. Angelo, Naples, Italy
- Nicola Curci
- Institute of Biosciences and BioResources - National Research Council of Italy, Naples, Italy
- Department of Biology, University of Naples Federico II, Complesso Universitario di Monte S. Angelo, Naples, Italy
- Mauro Di Fenza
- Institute of Biosciences and BioResources - National Research Council of Italy, Naples, Italy
- Luisa Maurelli
- Institute of Biosciences and BioResources - National Research Council of Italy, Naples, Italy
- Marco Moracci
- Institute of Biosciences and BioResources - National Research Council of Italy, Naples, Italy
- Department of Biology, University of Naples Federico II, Complesso Universitario di Monte S. Angelo, Naples, Italy
- Task Force on Microbiome Studies, University of Naples Federico II, Naples, Italy
8
Antonov IV. Two Cobalt Chelatase Subunits Can Be Generated from a Single chlD Gene via Programed Frameshifting. Mol Biol Evol 2020; 37:2268-2278. [PMID: 32211852 DOI: 10.1093/molbev/msaa081] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 02/06/2023]
Abstract
Magnesium chelatase chlIDH and cobalt chelatase cobNST enzymes are required for biosynthesis of (bacterio)chlorophyll and cobalamin (vitamin B12), respectively. Each enzyme consists of large, medium, and small subunits. Structural and primary sequence similarities indicate a common evolutionary origin of the corresponding subunits. It has been reported earlier that some vitamin B12-synthesizing organisms utilize an unusual cobalt chelatase enzyme consisting of a large cobalt chelatase subunit (cobN) along with the medium (chlD) and small (chlI) subunits of magnesium chelatase. In an attempt to understand the nature of this phenomenon, we analyzed >1,200 diverse genomes of cobalamin- and/or chlorophyll-producing prokaryotes. We found that, surprisingly, genomes of many cobalamin producers contained cobN and chlD genes only; a small subunit gene was absent. Furthermore, we discovered a diverse group of chlD genes with functional programed ribosomal frameshifting signals. Given the high similarity between the small subunit and the N-terminal part of the medium subunit, we proposed that programed translational frameshifting may allow chlD mRNA to produce both subunits. Indeed, in genomes where genes for small subunits were absent, we observed statistically significant enrichment of programed frameshifting signals in chlD genes. Interestingly, the details of the frameshifting mechanisms producing small and medium subunits from a single chlD gene could be specific to prokaryotic taxa. Overall, this programed frameshifting phenomenon was observed to be highly conserved and present in both bacteria and archaea.
Affiliation(s)
- Ivan V Antonov
- Institute of Bioengineering, Federal Research Centre Fundamentals of Biotechnology, Moscow, Russia
- Department of Biological and Medical Physics, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia
9
Pfeifenschneider J, Markert B, Stolzenberger J, Brautaset T, Wendisch VF. Transaldolase in Bacillus methanolicus: biochemical characterization and biological role in ribulose monophosphate cycle. BMC Microbiol 2020; 20:63. [PMID: 32204692 PMCID: PMC7092467 DOI: 10.1186/s12866-020-01750-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/26/2019] [Accepted: 03/11/2020] [Indexed: 12/23/2022]
Abstract
BACKGROUND: The Gram-positive facultative methylotrophic bacterium Bacillus methanolicus uses the sedoheptulose-1,7-bisphosphatase (SBPase) variant of the ribulose monophosphate (RuMP) cycle for growth on the C1 carbon source methanol. Previous genome sequencing of the physiologically different B. methanolicus wild-type strains MGA3 and PB1 has unraveled all putative RuMP cycle genes, and several of the RuMP cycle enzymes of MGA3 have since been biochemically characterized. In this study, the focus was on the characterization of the transaldolase (Ta) and its possible role in the RuMP cycle in B. methanolicus. RESULTS: The Ta genes of B. methanolicus MGA3 and PB1 were recombinantly expressed in Escherichia coli, and the gene products were purified and characterized. The PB1 Ta protein was found to be active as a homodimer with a molecular weight of 54 kDa and displayed a KM of 0.74 mM and a Vmax of 16.3 U/mg with fructose 6-phosphate as the substrate. In contrast, the product of the MGA3 Ta gene, which encodes a truncated Ta protein lacking 80 amino acids at the N-terminus, showed no Ta activity. Seven different mutant genes expressing various full-length MGA3 Ta proteins were constructed, and all gene products displayed Ta activity. Moreover, crude extracts of MGA3 cells displayed Ta activities similar to those of PB1 cells. CONCLUSIONS: While it is well established that B. methanolicus can use the SBPase variant of the RuMP cycle, this study indicates that B. methanolicus possesses Ta activity and may also operate the Ta variant of the RuMP cycle.
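To make the reported kinetic constants concrete, here is a small sketch that evaluates the standard Michaelis-Menten rate law, v = Vmax·[S] / (KM + [S]), with the PB1 transaldolase values quoted above (KM = 0.74 mM, Vmax = 16.3 U/mg). It assumes the enzyme follows simple Michaelis-Menten behavior over the range of interest, which the abstract does not state explicitly; the function name `michaelis_menten` is hypothetical.

```python
def michaelis_menten(s_mM: float, vmax: float = 16.3, km_mM: float = 0.74) -> float:
    """Specific activity (U/mg) at substrate concentration s_mM (mM).

    Defaults are the PB1 Ta values from the abstract; simple
    Michaelis-Menten kinetics is an assumption made for illustration.
    """
    if s_mM < 0:
        raise ValueError("substrate concentration must be non-negative")
    return vmax * s_mM / (km_mM + s_mM)


if __name__ == "__main__":
    # Print the predicted rate at a few fructose 6-phosphate concentrations.
    for s in (0.1, 0.74, 5.0):
        print(f"[S] = {s} mM -> v = {michaelis_menten(s):.2f} U/mg")
```

By definition of KM, the rate at [S] = KM is half of Vmax, which gives a quick sanity check on any set of fitted constants.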
Affiliation(s)
- Johannes Pfeifenschneider
- Genetics of Prokaryotes, Faculty of Biology & Center for Biotechnology, Bielefeld University, Universitätsstraße 25, 33615, Bielefeld, Germany
- Benno Markert
- Genetics of Prokaryotes, Faculty of Biology & Center for Biotechnology, Bielefeld University, Universitätsstraße 25, 33615, Bielefeld, Germany
- Jessica Stolzenberger
- Genetics of Prokaryotes, Faculty of Biology & Center for Biotechnology, Bielefeld University, Universitätsstraße 25, 33615, Bielefeld, Germany
- Trygve Brautaset
- Department of Biotechnology, NTNU, Norwegian University of Science and Technology, Trondheim, Norway
- Volker F Wendisch
- Genetics of Prokaryotes, Faculty of Biology & Center for Biotechnology, Bielefeld University, Universitätsstraße 25, 33615, Bielefeld, Germany
10
Di Franco A, Poujol R, Baurain D, Philippe H. Evaluating the usefulness of alignment filtering methods to reduce the impact of errors on evolutionary inferences. BMC Evol Biol 2019; 19:21. [PMID: 30634908 PMCID: PMC6330419 DOI: 10.1186/s12862-019-1350-2] [Citation(s) in RCA: 74] [Impact Index Per Article: 12.3] [Received: 08/10/2018] [Accepted: 01/02/2019] [Indexed: 11/10/2022]
Abstract
Background: Multiple Sequence Alignments (MSAs) are the starting point of molecular evolutionary analyses. Errors in MSAs generate a non-historical signal that can lead to incorrect inferences. Therefore, numerous efforts have been made to reduce the impact of alignment errors, by improving alignment algorithms and by developing methods to filter out poorly aligned regions. However, MSAs contain not only alignment errors but also primary sequence errors. Such errors may originate from sequencing errors, from assembly errors, or from erroneous structural annotations (such as incorrect intron/exon boundaries). Even though their existence is acknowledged, the impact of primary sequence errors on evolutionary inference is poorly characterized. Results: As a first step to fill this gap, we have developed a program called HmmCleaner, which detects and eliminates these errors from MSAs. It uses profile hidden Markov models (pHMM) to identify sequence segments that poorly fit their MSA and selectively removes them. We assessed its performance using > 700 amino-acid MSAs from prokaryotes and eukaryotes, in which we introduced several types of simulated primary sequence errors. The sensitivity of HmmCleaner towards simulated primary sequence errors was > 95%. As a second step, we compared the impact of segment-filtering software (HmmCleaner and PREQUAL) relative to commonly used block-filtering software (BMGE and TrimAI) on evolutionary analyses. Using real data from vertebrates, we observed that segment-filtering methods improve the quality of evolutionary inference more than the currently used block-filtering methods. The former were especially effective at improving branch length inferences and at reducing the false positive rate during detection of positive selection. Conclusions: Segment-filtering methods such as HmmCleaner accurately detect simulated primary sequence errors. Our results suggest that these errors are more detrimental than alignment errors. However, they also show that stochastic (sampling) error is predominant in single-gene evolutionary inferences. Therefore, we argue that MSA filtering should focus on segment instead of block removal and that more studies are required to find the optimal balance between the accuracy improvement and the stochastic error increase brought by data removal.
Affiliation(s)
- Arnaud Di Franco
- Station d'Ecologie Théorique et Expérimentale de Moulis, CNRS, Moulis, France
- Raphaël Poujol
- Département de Biochimie, Centre Robert-Cedergren, Université de Montréal, Montréal, Québec, Canada
- Denis Baurain
- InBioS-PhytoSYSTEMS, Unité de Phylogénomique des Eucaryotes, Université de Liège, Liège, Belgium
- Hervé Philippe
- Station d'Ecologie Théorique et Expérimentale de Moulis, CNRS, Moulis, France
- Département de Biochimie, Centre Robert-Cedergren, Université de Montréal, Montréal, Québec, Canada
11
Koscielniak D, Wons E, Wilkowska K, Sektas M. Non-programmed transcriptional frameshifting is common and highly RNA polymerase type-dependent. Microb Cell Fact 2018; 17:184. [PMID: 30474557 PMCID: PMC6260861 DOI: 10.1186/s12934-018-1034-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 09/26/2018] [Accepted: 11/19/2018] [Indexed: 12/15/2022]
Abstract
Background: Viral and host gene expression systems assume repeatability of the process and high quality of the protein product. Since the level and fidelity of transcription primarily determine the overall efficiency, all factors contributing to their decrease should be identified and optimized. Among many observed processes, non-programmed insertion/deletion (indel) of nucleotides during transcription (slippage) occurring at homopolymeric A/T sequences within a gene can considerably impact its expression. To date, no comparative study of the propensity of the most utilized RNA polymerases (RNAP), those of Escherichia coli and bacteriophage T7, for this type of erroneous mRNA synthesis has been reported. To address this issue, we evaluated the influence of shift-prone A/T sequences by assessing indel-dependent phenotypic changes. The RNAP-specific expression profile was examined using two of the most potent promoters, ParaBAD of E. coli and φ10 of phage T7. Results: Here we report the first systematic study of the requirements for efficient transcriptional slippage by T7 phage and cellular RNAPs, considering three parameters: homopolymer length, template type, and frameshift directionality preferences. Using a series of out-of-frame gfp reporter genes fused to a variety of A/T homopolymeric sequences, we show that T7 RNAP has an exceptional potential for generating frameshifts and is capable of slipping on as few as three adenine or four thymidine residues in a row, in a flanking sequence-dependent manner. In contrast, bacterial RNAP exhibits a relatively low ability to bypass indel mutations and requires a run of at least 7 thymidine and even more adenine residues. This difference arises from the involvement of various intrinsic proofreading properties. Our studies demonstrate a distinct preference towards a specific homopolymer in slippage induction. Whereas insertion slippage (but not deletion) by T7 RNAP occurs preferentially on poly(A) rather than on poly(T) runs, a strong bias towards poly(T) is observed for the host RNAP. Conclusions: Intrinsic RNAP slippage properties involve trade-offs between accuracy, speed and processivity of transcription. Viral T7 RNAP shows a far greater propensity for transcriptional slippage than E. coli RNAP. This possibly plays an important role in driving bacteriophage adaptation and therefore could be considered beneficial. However, from a biotechnological and experimental viewpoint, this might create some problems, and it strongly argues for employing bacterial expression systems, which are equipped with proofreading mechanisms.
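The run-length thresholds reported in this abstract lend themselves to a simple screen. The sketch below lists A and T homopolymer runs in a sequence that meet caller-supplied minimum lengths; the example call uses the T7 RNAP minima quoted above (three A or four T). It is only an illustration: the function name `slippage_prone_runs` is hypothetical, flanking-sequence effects and strand (template versus coding) are ignored, and no adenine minimum is assumed for the bacterial enzyme because the abstract gives none.

```python
import re


def slippage_prone_runs(seq: str, min_a: int, min_t: int):
    """Return (start, base, length) for each A or T homopolymer run
    whose length meets the given minimum for that base.

    Hypothetical screening helper; thresholds are supplied by the caller.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    runs = []
    # A+|T+ matches each maximal run of A or of T, including single bases.
    for match in re.finditer(r"A+|T+", seq):
        base = match.group()[0]
        length = len(match.group())
        threshold = min_a if base == "A" else min_t
        if length >= threshold:
            runs.append((match.start(), base, length))
    return runs


if __name__ == "__main__":
    example = "GCAAAGCTTTTGCTTTTTTTG"
    # T7 RNAP minima taken from the abstract: three A or four T in a row.
    print(slippage_prone_runs(example, min_a=3, min_t=4))
```

Passing stricter minima narrows the list, which makes it easy to compare how many candidate sites a construct would present under different polymerase assumptions.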
Affiliation(s)
- Dawid Koscielniak
- Department of Microbiology, Faculty of Biology, University of Gdansk, Wita Stwosza 59, 80-308, Gdansk, Poland
- Ewa Wons
- Department of Microbiology, Faculty of Biology, University of Gdansk, Wita Stwosza 59, 80-308, Gdansk, Poland
- Karolina Wilkowska
- Department of Microbiology, Faculty of Biology, University of Gdansk, Wita Stwosza 59, 80-308, Gdansk, Poland
- Marian Sektas
- Department of Microbiology, Faculty of Biology, University of Gdansk, Wita Stwosza 59, 80-308, Gdansk, Poland
12
Liu YJ, Qi K, Zhang J, Chen C, Cui Q, Feng Y. Firmicutes-enriched IS1447 represents a group of IS3-family insertion sequences exhibiting unique +1 transcriptional slippage. Biotechnol Biofuels 2018; 11:300. [PMID: 30410575 PMCID: PMC6211511 DOI: 10.1186/s13068-018-1304-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/25/2018] [Accepted: 10/27/2018] [Indexed: 05/12/2023]
Abstract
BACKGROUND: Bacterial insertion sequences (ISs) are ubiquitous mobile genetic elements that play important roles in genome plasticity, cell adaptability, and functional evolution. ISs of various families and subgroups have diverse molecular features and functional mechanisms that are not fully understood. RESULTS: IS1447 is a member of the widespread IS3 family and was previously found to have transposition activity in the thermophilic, cellulolytic microorganism Clostridium thermocellum. Phylogenetic analysis showed that IS1447-like elements are widely distributed in Firmicutes and possess features that are unique within the IS3 family. Therefore, IS1447 may represent a novel subgroup of the IS3 family. Unlike other well-known IS3 subgroups, which use programmed -1 translational frameshifting for expression of the transposase, IS1447 exhibits transcriptional slippage in both the +1 and -1 directions, each with a frequency of ~16%, and only +1 slippage results in a full-length, functional transposase. The slippage-prone region of IS1447 contains a run of nine A nucleotides following a stem-loop structure in the mRNA, but mutagenesis analysis indicated that seven of them are sufficient for the observed slippage. Western blot analysis indicated that IS1447 produces three types of transposases with alternative initiation sites. Furthermore, the IS1447-subgroup elements are abundant in the genomes of several cellulolytic bacteria. CONCLUSION: Our results indicate that IS1447 represents a new Firmicutes-enriched subgroup of the IS3 family. The characterization of this novel IS3-family member will enrich our understanding of the transposition behavior of IS elements and may provide insight into developing IS-based mutagenesis tools for thermophiles.
Collapse
Affiliation(s)
- Ya-Jun Liu
  - CAS Key Laboratory of Biofuels, Shandong Provincial Key Laboratory of Energy Genetics, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, China
  - Dalian National Laboratory for Clean Energy, Dalian, China
- Kuan Qi
  - CAS Key Laboratory of Biofuels, Shandong Provincial Key Laboratory of Energy Genetics, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, China
  - Dalian National Laboratory for Clean Energy, Dalian, China
  - University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing, China
- Jie Zhang
  - CAS Key Laboratory of Biofuels, Shandong Provincial Key Laboratory of Energy Genetics, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, China
  - Dalian National Laboratory for Clean Energy, Dalian, China
  - University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing, China
  - Present Address: Department of Biosystems Engineering, Auburn University, Auburn, AL 36849 USA
- Chao Chen
  - CAS Key Laboratory of Biofuels, Shandong Provincial Key Laboratory of Energy Genetics, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, China
  - Dalian National Laboratory for Clean Energy, Dalian, China
- Qiu Cui
  - CAS Key Laboratory of Biofuels, Shandong Provincial Key Laboratory of Energy Genetics, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, China
  - Dalian National Laboratory for Clean Energy, Dalian, China
- Yingang Feng
  - CAS Key Laboratory of Biofuels, Shandong Provincial Key Laboratory of Energy Genetics, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, China
  - Dalian National Laboratory for Clean Energy, Dalian, China

13
Lomsadze A, Gemayel K, Tang S, Borodovsky M. Modeling leaderless transcription and atypical genes results in more accurate gene prediction in prokaryotes. Genome Res 2018; 28:1079-1089. [PMID: 29773659 PMCID: PMC6028130 DOI: 10.1101/gr.230615.117] [Citation(s) in RCA: 120] [Impact Index Per Article: 17.1] [Received: 09/29/2017] [Accepted: 05/16/2018] [Indexed: 11/24/2022]
Abstract
In a conventional view of the prokaryotic genome organization, promoters precede operons and ribosome binding sites (RBSs) with Shine-Dalgarno consensus precede genes. However, recent experimental research suggesting a more diverse view motivated us to develop an algorithm with improved gene-finding accuracy. We describe GeneMarkS-2, an ab initio algorithm that uses a model derived by self-training for finding species-specific (native) genes, along with an array of precomputed "heuristic" models designed to identify harder-to-detect genes (likely horizontally transferred). Importantly, we designed GeneMarkS-2 to identify several types of distinct sequence patterns (signals) involved in gene expression control, among them the patterns characteristic for leaderless transcription as well as noncanonical RBS patterns. To assess the accuracy of GeneMarkS-2, we used genes validated by COG (Clusters of Orthologous Groups) annotation, proteomics experiments, and N-terminal protein sequencing. We observed that GeneMarkS-2 performed better on average in all accuracy measures when compared with the current state-of-the-art gene prediction tools. Furthermore, the screening of ∼5000 representative prokaryotic genomes made by GeneMarkS-2 predicted frequent leaderless transcription in both archaea and bacteria. We also observed that the RBS sites in some species with leadered transcription did not necessarily exhibit the Shine-Dalgarno consensus. The modeling of different types of sequence motifs regulating gene expression prompted a division of prokaryotic genomes into five categories with distinct sequence patterns around the gene starts.
Affiliation(s)
- Alexandre Lomsadze
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Tech, Atlanta, Georgia 30332, USA
  - Gene Probe, Incorporated, Atlanta, Georgia 30324, USA
- Karl Gemayel
  - School of Computational Science and Engineering, Georgia Tech, Atlanta, Georgia 30332, USA
- Shiyuyun Tang
  - School of Biological Sciences, Georgia Tech, Atlanta, Georgia 30332, USA
- Mark Borodovsky
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Tech, Atlanta, Georgia 30332, USA
  - Gene Probe, Incorporated, Atlanta, Georgia 30324, USA
  - School of Computational Science and Engineering, Georgia Tech, Atlanta, Georgia 30332, USA
  - School of Biological Sciences, Georgia Tech, Atlanta, Georgia 30332, USA
  - Department of Biological and Medical Physics, Moscow Institute of Physics and Technology, Moscow, 141700, Russia

14
Hücker SM, Vanderhaeghen S, Abellan-Schneyder I, Wecko R, Simon S, Scherer S, Neuhaus K. A novel short L-arginine responsive protein-coding gene (laoB) antiparallel overlapping to a CadC-like transcriptional regulator in Escherichia coli O157:H7 Sakai originated by overprinting. BMC Evol Biol 2018; 18:21. [PMID: 29433444 PMCID: PMC5810103 DOI: 10.1186/s12862-018-1134-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 06/22/2017] [Accepted: 01/31/2018] [Indexed: 11/10/2022] Open
Abstract
Background Due to the DNA triplet code, it is possible that the sequences of two or more protein-coding genes overlap to a large degree. However, such non-trivial overlaps are usually excluded by genome annotation pipelines and, thus, only a few overlapping gene pairs have been described in bacteria. In contrast, transcriptome and translatome sequencing reveals many signals originated from the antisense strand of annotated genes, of which we analyzed an example gene pair in more detail. Results A small open reading frame of Escherichia coli O157:H7 strain Sakai (EHEC), designated laoB (L-arginine responsive overlapping gene), is embedded in reading frame −2 in the antisense strand of ECs5115, encoding a CadC-like transcriptional regulator. This overlapping gene shows evidence of transcription and translation in Luria-Bertani (LB) and brain-heart infusion (BHI) medium based on RNA sequencing (RNAseq) and ribosomal-footprint sequencing (RIBOseq). The transcriptional start site is 289 base pairs (bp) upstream of the start codon and transcription termination is 155 bp downstream of the stop codon. Overexpression of LaoB fused to an enhanced green fluorescent protein (EGFP) reporter was possible. The sequence upstream of the transcriptional start site displayed strong promoter activity under different conditions, whereas promoter activity was significantly decreased in the presence of L-arginine. A strand-specific translationally arrested mutant of laoB provided a significant growth advantage in competitive growth experiments in the presence of L-arginine compared to the wild type, which returned to wild type level after complementation of laoB in trans. A phylostratigraphic analysis indicated that the novel gene is restricted to the Escherichia/Shigella clade and might have originated recently by overprinting leading to the expression of part of the antisense strand of ECs5115. 
Conclusions Here, we present evidence of a novel small protein-coding gene laoB encoded in the antisense frame −2 of the annotated gene ECs5115. Clearly, laoB is evolutionarily young and it originated in the Escherichia/Shigella clade by overprinting, a process which may cause the de novo evolution of bacterial genes like laoB.
Affiliation(s)
- Sarah M Hücker
  - Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
  - Fraunhofer ITEM-R, Am Biopark 9, 93053, Regensburg, Germany
- Sonja Vanderhaeghen
  - Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
- Isabel Abellan-Schneyder
  - Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
  - Core Facility Microbiome/NGS, ZIEL - Institute for Food & Health, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
- Romy Wecko
  - Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
- Svenja Simon
  - Department of Computer and Information Science, University of Konstanz, Box 78, 78457, Konstanz, Germany
- Siegfried Scherer
  - Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
  - ZIEL - Institute for Food & Health, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
- Klaus Neuhaus
  - Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
  - Core Facility Microbiome/NGS, ZIEL - Institute for Food & Health, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany

15
Penno C, Kumari R, Baranov PV, van Sinderen D, Atkins JF. Stimulation of reverse transcriptase generated cDNAs with specific indels by template RNA structure: retrotransposon, dNTP balance, RT-reagent usage. Nucleic Acids Res 2017; 45:10143-10155. [PMID: 28973469 PMCID: PMC5737552 DOI: 10.1093/nar/gkx689] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 03/14/2017] [Accepted: 07/24/2017] [Indexed: 01/03/2023] Open
Abstract
RNA dependent DNA-polymerases, reverse transcriptases, are key enzymes for retroviruses and retroelements. Their fidelity, including indel generation, is significant for their use as reagents including for deep sequencing. Here, we report that certain RNA template structures and G-rich sequences, ahead of diverse reverse transcriptases can be strong stimulators for slippage at slippage-prone template motif sequence 3′ of such ‘slippage-stimulatory’ structures. Where slippage is stimulated, the resulting products have one or more additional base(s) compared to the corresponding template motif. Such structures also inhibit slippage-mediated base omission which can be more frequent in the absence of a relevant stem–loop. Slippage directionality, base insertion and omission, is sensitive to the relative concentration ratio of dNTPs specified by the RNA template slippage-prone sequence and its 5′ adjacent base. The retrotransposon-derived enzyme TGIRT exhibits more slippage in vitro than the retroviral enzymes tested including that from HIV. Structure-mediated slippage may be exhibited by other polymerases and enrich gene expression. A cassette from Drosophila retrotransposon Dme1_chrX_2630566, a candidate for utilizing slippage for its GagPol synthesis, exhibits strong slippage in vitro. Given the widespread occurrence and importance of retrotransposons, systematic studies to reveal the extent of their functional utilization of RT slippage are merited.
Affiliation(s)
- Christophe Penno
  - School of Biochemistry, University College Cork, Cork, Ireland
  - School of Microbiology, University College Cork, Cork, Ireland
  - Alimentary Pharmabiotic Centre, University College Cork, Cork, Ireland
- Romika Kumari
  - School of Biochemistry, University College Cork, Cork, Ireland
- Pavel V Baranov
  - School of Biochemistry, University College Cork, Cork, Ireland
- Douwe van Sinderen
  - School of Microbiology, University College Cork, Cork, Ireland
  - Alimentary Pharmabiotic Centre, University College Cork, Cork, Ireland
- John F Atkins
  - School of Biochemistry, University College Cork, Cork, Ireland
  - School of Microbiology, University College Cork, Cork, Ireland
  - Department of Human Genetics, University of Utah, Salt Lake City, UT 84112-5330, USA

16
Lobanov AV, Heaphy SM, Turanov AA, Gerashchenko MV, Pucciarelli S, Devaraj RR, Xie F, Petyuk VA, Smith RD, Klobutcher LA, Atkins JF, Miceli C, Hatfield DL, Baranov PV, Gladyshev VN. Position-dependent termination and widespread obligatory frameshifting in Euplotes translation. Nat Struct Mol Biol 2017; 24:61-68. [PMID: 27870834 PMCID: PMC5295771 DOI: 10.1038/nsmb.3330] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 03/11/2016] [Accepted: 10/31/2016] [Indexed: 11/09/2022]
Abstract
The ribosome can change its reading frame during translation in a process known as programmed ribosomal frameshifting. These rare events are supported by complex mRNA signals. However, we found that the ciliates Euplotes crassus and Euplotes focardii exhibit widespread frameshifting at stop codons. 47 different codons preceding stop signals resulted in either +1 or +2 frameshifts, and +1 frameshifting at AAA was the most frequent. The frameshifts showed unusual plasticity and rapid evolution, and had little influence on translation rates. The proximity of a stop codon to the 3' mRNA end, rather than its occurrence or sequence context, appeared to designate termination. Thus, a 'stop codon' is not a sufficient signal for translation termination, and the default function of stop codons in Euplotes is frameshifting, whereas termination is specific to certain mRNA positions and probably requires additional factors.
Affiliation(s)
- Alexei V. Lobanov
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Stephen M. Heaphy
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Anton A. Turanov
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Maxim V. Gerashchenko
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Sandra Pucciarelli
  - School of Biosciences and Biotechnology, University of Camerino, Camerino, Italy
- Raghul R. Devaraj
  - School of Biosciences and Biotechnology, University of Camerino, Camerino, Italy
- Fang Xie
  - Pacific Northwest National Laboratory, Richland, Washington, USA
- Richard D. Smith
  - Pacific Northwest National Laboratory, Richland, Washington, USA
- Lawrence A. Klobutcher
  - Department of Molecular Biology and Biophysics, University of Connecticut Health Center, Farmington, Connecticut, USA
- John F. Atkins
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Cristina Miceli
  - School of Biosciences and Biotechnology, University of Camerino, Camerino, Italy
- Dolph L. Hatfield
  - Molecular Biology of Selenium Section, Mouse Cancer Genetics Program, Center for Cancer Research, National Institutes of Health, Bethesda, Maryland, USA
- Pavel V. Baranov
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Vadim N. Gladyshev
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA

17
Atkins JF, Loughran G, Bhatt PR, Firth AE, Baranov PV. Ribosomal frameshifting and transcriptional slippage: From genetic steganography and cryptography to adventitious use. Nucleic Acids Res 2016; 44:7007-78. [PMID: 27436286 PMCID: PMC5009743 DOI: 10.1093/nar/gkw530] [Citation(s) in RCA: 176] [Impact Index Per Article: 19.6] [Received: 03/21/2016] [Accepted: 05/26/2016] [Indexed: 12/15/2022] Open
Abstract
Genetic decoding is not ‘frozen’ as was earlier thought, but dynamic. One facet of this is frameshifting that often results in synthesis of a C-terminal region encoded by a new frame. Ribosomal frameshifting is utilized for the synthesis of additional products, for regulatory purposes and for translational ‘correction’ of problem or ‘savior’ indels. Utilization for synthesis of additional products occurs prominently in the decoding of mobile chromosomal element and viral genomes. One class of regulatory frameshifting of stable chromosomal genes governs cellular polyamine levels from yeasts to humans. In many cases of productively utilized frameshifting, the proportion of ribosomes that frameshift at a shift-prone site is enhanced by specific nascent peptide or mRNA context features. Such mRNA signals, which can be 5′ or 3′ of the shift site or both, can act by pairing with ribosomal RNA or as stem loops or pseudoknots even with one component being 4 kb 3′ from the shift site. Transcriptional realignment at slippage-prone sequences also generates productively utilized products encoded trans-frame with respect to the genomic sequence. This too can be enhanced by nucleic acid structure. Together with dynamic codon redefinition, frameshifting is one of the forms of recoding that enriches gene expression.
Affiliation(s)
- John F Atkins
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
  - School of Microbiology, University College Cork, Cork, Ireland
  - Department of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA
- Gary Loughran
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Pramod R Bhatt
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- Andrew E Firth
  - Division of Virology, Department of Pathology, University of Cambridge, Hills Road, Cambridge CB2 0QQ, UK
- Pavel V Baranov
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland

18
Gascuel Q, Bordat A, Sallet E, Pouilly N, Carrere S, Roux F, Vincourt P, Godiard L. Effector Polymorphisms of the Sunflower Downy Mildew Pathogen Plasmopara halstedii and Their Use to Identify Pathotypes from Field Isolates. PLoS One 2016; 11:e0148513. [PMID: 26845339 PMCID: PMC4742249 DOI: 10.1371/journal.pone.0148513] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 09/15/2015] [Accepted: 01/19/2016] [Indexed: 01/23/2023] Open
Abstract
The obligate biotroph oomycete Plasmopara halstedii causes downy mildew on sunflower crop, Helianthus annuus. The breakdown of several Pl resistance genes used in sunflower hybrids over the last 25 years came along with the appearance of new Pl. halstedii isolates showing modified virulence profiles. In oomycetes, two classes of effector proteins, key players of pathogen virulence, are translocated into the host: RXLR and CRN effectors. We identified 54 putative CRN or RXLR effector genes from transcriptomic data and analyzed their genetic diversity in seven Pl. halstedii pathotypes representative of the species variability. Pl. halstedii effector genes were on average more polymorphic at both the nucleic and protein levels than random non-effector genes, suggesting a potential adaptive dynamics of pathogen virulence over the last 25 years. Twenty-two KASP (Competitive Allele Specific PCR) markers designed on polymorphic effector genes were genotyped on 35 isolates belonging to 14 Pl. halstedii pathotypes. Polymorphism analysis based on eight KASP markers aims at proposing a determination key suitable to classify the eight multi-isolate pathotypes into six groups. This is the first report of a molecular marker set able to discriminate Pl. halstedii pathotypes based on the polymorphism of pathogenicity effectors. Compared to phenotypic tests handling living spores used until now to discriminate Pl. halstedii pathotypes, this set of molecular markers constitutes a first step in faster pathotype diagnosis of Pl. halstedii isolates. Hence, emerging sunflower downy mildew isolates could be more rapidly characterized and thus, assessment of plant resistance breakdown under field conditions should be improved.
Affiliation(s)
- Quentin Gascuel
  - Institut National de la Recherche Agronomique, INRA, Laboratoire des Interactions Plantes-Microorganismes (LIPM), Unité Mixte de Recherches UMR441, F-31326 Castanet-Tolosan, France
  - Centre National de la Recherche Scientifique, CNRS, Laboratoire des Interactions Plantes-Microorganismes (LIPM), Unité Mixte de Recherches UMR2594, F-31326 Castanet-Tolosan, France
- Amandine Bordat
  - Institut National de la Recherche Agronomique, INRA, Laboratoire des Interactions Plantes-Microorganismes (LIPM), Unité Mixte de Recherches UMR441, F-31326 Castanet-Tolosan, France
  - Centre National de la Recherche Scientifique, CNRS, Laboratoire des Interactions Plantes-Microorganismes (LIPM), Unité Mixte de Recherches UMR2594, F-31326 Castanet-Tolosan, France
- Erika Sallet
  - Institut National de la Recherche Agronomique, INRA, Laboratoire des Interactions Plantes-Microorganismes (LIPM), Unité Mixte de Recherches UMR441, F-31326 Castanet-Tolosan, France
  - Centre National de la Recherche Scientifique, CNRS, Laboratoire des Interactions Plantes-Microorganismes (LIPM), Unité Mixte de Recherches UMR2594, F-31326 Castanet-Tolosan, France
- Nicolas Pouilly
  - Institut National de la Recherche Agronomique, INRA, Laboratoire des Interactions Plantes-Microorganismes (LIPM), Unité Mixte de Recherches UMR441, F-31326 Castanet-Tolosan, France
  - Centre National de la Recherche Scientifique, CNRS, Laboratoire des Interactions Plantes-Microorganismes (LIPM), Unité Mixte de Recherches UMR2594, F-31326 Castanet-Tolosan, France
- Sébastien Carrere
  - Institut National de la Recherche Agronomique, INRA, Laboratoire des Interactions Plantes-Microorganismes (LIPM), Unité Mixte de Recherches UMR441, F-31326 Castanet-Tolosan, France
  - Centre National de la Recherche Scientifique, CNRS, Laboratoire des Interactions Plantes-Microorganismes (LIPM), Unité Mixte de Recherches UMR2594, F-31326 Castanet-Tolosan, France
- Fabrice Roux
  - Institut National de la Recherche Agronomique, INRA, Laboratoire des Interactions Plantes-Microorganismes (LIPM), Unité Mixte de Recherches UMR441, F-31326 Castanet-Tolosan, France
  - Centre National de la Recherche Scientifique, CNRS, Laboratoire des Interactions Plantes-Microorganismes (LIPM), Unité Mixte de Recherches UMR2594, F-31326 Castanet-Tolosan, France
- Patrick Vincourt
  - Institut National de la Recherche Agronomique, INRA, Laboratoire des Interactions Plantes-Microorganismes (LIPM), Unité Mixte de Recherches UMR441, F-31326 Castanet-Tolosan, France
  - Centre National de la Recherche Scientifique, CNRS, Laboratoire des Interactions Plantes-Microorganismes (LIPM), Unité Mixte de Recherches UMR2594, F-31326 Castanet-Tolosan, France
- Laurence Godiard
  - Institut National de la Recherche Agronomique, INRA, Laboratoire des Interactions Plantes-Microorganismes (LIPM), Unité Mixte de Recherches UMR441, F-31326 Castanet-Tolosan, France
  - Centre National de la Recherche Scientifique, CNRS, Laboratoire des Interactions Plantes-Microorganismes (LIPM), Unité Mixte de Recherches UMR2594, F-31326 Castanet-Tolosan, France

19
Gordon JL, Lefeuvre P, Escalon A, Barbe V, Cruveiller S, Gagnevin L, Pruvost O. Comparative genomics of 43 strains of Xanthomonas citri pv. citri reveals the evolutionary events giving rise to pathotypes with different host ranges. BMC Genomics 2015; 16:1098. [PMID: 26699528 PMCID: PMC4690215 DOI: 10.1186/s12864-015-2310-x] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Received: 08/21/2015] [Accepted: 12/15/2015] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND The identification of factors involved in the host range definition and evolution is a pivotal challenge in the goal to predict and prevent the emergence of plant bacterial disease. To trace the evolution and find molecular differences between three pathotypes of Xanthomonas citri pv. citri that may explain their distinctive host ranges, 42 strains of X. citri pv. citri and one outgroup strain, Xanthomonas citri pv. bilvae, were sequenced and compared.
RESULTS The strains from each pathotype form monophyletic clades, with a short branch shared by the A(w) and A pathotypes. Pathotype-specific recombination was detected in seven regions of the alignment. Using Ancestral Character Estimation, 426 SNPs were mapped to the four branches at the base of the A, A*, A(w) and A/A(w) clades. Several genes containing pathotype-specific nonsynonymous mutations have functions related to pathogenicity. The A pathotype is enriched for SNP-containing genes involved in defense mechanisms, while A* is significantly depleted for genes that are involved in transcription. The pathotypes differ by four gene islands that largely coincide with regions of recombination and include genes with a role in virulence. Both A* and A(w) are missing genes involved in defense mechanisms. In contrast to a recent study, we find that there are an extremely small number of pathotype-specific gene presences and absences.
CONCLUSIONS The three pathotypes of X. citri pv. citri that differ in their host ranges largely show genomic differences related to recombination, horizontal gene transfer and single nucleotide polymorphism. We detail the phylogenetic relationship of the pathotypes and provide a set of candidate genes involved in pathotype-specific evolutionary events that could explain the differences in host range and pathogenicity between them.
Affiliation(s)
- Jonathan L Gordon
  - Université de la Réunion, UMR PVBMT, 97410, Saint-Pierre, La Réunion, France
  - Current Address: CIRAD, UMR CMAEE, F-97170, Petit-Bourg, Guadeloupe, France
- Aline Escalon
  - CIRAD, UMR PVBMT, 97410, Saint-Pierre, La Réunion, France
- Valérie Barbe
  - CEA/DSV/IG/Genoscope, 2 rue Gaston Crémieux, BP5706, 91057, Evry, France
- Lionel Gagnevin
  - CIRAD, UMR PVBMT, 97410, Saint-Pierre, La Réunion, France
  - Current Address: UMR IPME, IRD-CIRAD-Université Montpellier, 34394, Montpellier, France

20
Baranov PV, Atkins JF, Yordanova MM. Augmented genetic decoding: global, local and temporal alterations of decoding processes and codon meaning. Nat Rev Genet 2015; 16:517-29. [PMID: 26260261 DOI: 10.1038/nrg3963] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.7] [Indexed: 12/18/2022]
Abstract
The non-universality of the genetic code is now widely appreciated. Codes differ between organisms, and certain genes are known to alter the decoding rules in a site-specific manner. Recently discovered examples of decoding plasticity are particularly spectacular. These examples include organisms and organelles with disruptions of triplet continuity during the translation of many genes, viruses that alter the entire genetic code of their hosts and organisms that adjust their genetic code in response to changing environments. In this Review, we outline various modes of alternative genetic decoding and expand existing terminology to accommodate recently discovered manifestations of this seemingly sophisticated phenomenon.
Affiliation(s)
- Pavel V Baranov
  - School of Biochemistry and Cell Biology, University College Cork, Ireland
- John F Atkins
  - School of Biochemistry and Cell Biology, University College Cork, Ireland
  - Department of Human Genetics, University of Utah, 15 N 2030 E Rm. 7410, Salt Lake City, Utah 84112-5330, USA

21
Productive mRNA stem loop-mediated transcriptional slippage: Crucial features in common with intrinsic terminators. Proc Natl Acad Sci U S A 2015; 112:E1984-93. [PMID: 25848054 DOI: 10.1073/pnas.1418384112] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 11/18/2022] Open
Abstract
Escherichia coli and yeast DNA-dependent RNA polymerases are shown to mediate efficient nascent transcript stem loop formation-dependent RNA-DNA hybrid realignment. The realignment was discovered on the heteropolymeric sequence T5C5 and yields transcripts lacking a C residue within a corresponding U5C4. The sequence studied is derived from a Roseiflexus insertion sequence (IS) element where the resulting transcriptional slippage is required for transposase synthesis. The stability of the RNA structure, the proximity of the stem loop to the slippage site, the length and composition of the slippage site motif, and the identity of its 3' adjacent nucleotides (nt) are crucial for transcripts lacking a single C. In many respects, the RNA structure requirements for this slippage resemble those for hairpin-dependent transcription termination. In a purified in vitro system, the slippage efficiency ranges from 5% to 75% depending on the concentration ratios of the nucleotides specified by the slippage sequence and the 3' nt context. The only previous proposal of stem loop mediated slippage, which was in Ebola virus expression, was based on incorrect data interpretation. We propose a mechanical slippage model involving the RNAP translocation state as the main motor in slippage directionality and efficiency. It is distinct from previously described models, including the one proposed for paramyxovirus, where following random movement efficiency is mainly dependent on the stability of the new realigned hybrid. In broadening the scope for utilization of transcription slippage for gene expression, the stimulatory structure provides parallels with programmed ribosomal frameshifting at the translation level.

22
Abstract
The number and diversity of known prokaryotic insertion sequences (IS) have increased enormously since their discovery in the late 1960s. At present the sequences of more than 4000 different IS have been deposited in the specialized ISfinder database. Over time it has become increasingly apparent that they are important actors in the evolution of their host genomes and are involved in sequestering, transmitting, mutating and activating genes, and in the rearrangement of both plasmids and chromosomes. This review presents an overview of our current understanding of these transposable elements (TE), their organization and their transposition mechanism as well as their distribution and genomic impact. In spite of their diversity, they share only a very limited number of transposition mechanisms which we outline here. Prokaryotic IS are but one example of a variety of diverse TE which are being revealed due to the advent of extensive genome sequencing projects. A major conclusion from sequence comparisons of various TE is that frontiers between the different types are becoming less clear. We detail these receding frontiers between different IS-related TE. Several, more specialized chapters in this volume include additional detailed information concerning a number of these.
In a second section of the review, we provide a detailed description of the expanding variety of IS, which we have divided into families for convenience. Our perception of these families continues to evolve and families emerge regularly as more IS are identified. This section is designed as an aid and a source of information for consultation by interested specialist readers.
23
Wons E, Furmanek-Blaszk B, Sektas M. RNA editing by T7 RNA polymerase bypasses InDel mutations causing unexpected phenotypic changes. Nucleic Acids Res 2015; 43:3950-63. [PMID: 25824942 PMCID: PMC4417176 DOI: 10.1093/nar/gkv269] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 03/16/2015] [Accepted: 03/17/2015] [Indexed: 12/26/2022] Open
Abstract
DNA-dependent T7 RNA polymerase (T7 RNAP) is the most powerful tool for both gene expression and in vitro transcription. By using a Next Generation Sequencing (NGS) approach we have analyzed the polymorphism of a T7 RNAP-generated mRNA pool of the mboIIM2 gene. We find that the enzyme displays a relatively high level of template-dependent transcriptional infidelity. The nucleotide misincorporations and multiple insertions in A/T-rich tracts of homopolymers in mRNA (0.20 and 0.089%, respectively) cause epigenetic effects with significant impact on gene expression that is disproportionally high to their frequency of appearance. The sequence-dependent rescue of single and even double InDel frameshifting mutants and wild-type phenotype recovery is observed as a result. As a consequence, a heterogeneous pool of functional and non-functional proteins of almost the same molecular mass is produced where the proteins are indistinguishable from each other upon ordinary analysis. We suggest that transcriptional infidelity as a general feature of the most effective RNAPs may serve to repair and/or modify a protein function, thus increasing the repertoire of phenotypic variants, which in turn has a high evolutionary potential.
Affiliation(s)
- Ewa Wons
  - Department of Microbiology, University of Gdansk, Gdansk 80-308, Poland
- Marian Sektas
  - Department of Microbiology, University of Gdansk, Gdansk 80-308, Poland
24
Ribosomal frameshifting and dual-target antiactivation restrict quorum-sensing-activated transfer of a mobile genetic element. Proc Natl Acad Sci U S A 2015; 112:4104-9. [PMID: 25787256 DOI: 10.1073/pnas.1501574112] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Indexed: 12/22/2022] Open
Abstract
Symbiosis islands are integrative and conjugative mobile genetic elements that convert nonsymbiotic rhizobia into nitrogen-fixing symbionts of leguminous plants. Excision of the Mesorhizobium loti symbiosis island ICEMlSym(R7A) is indirectly activated by quorum sensing through TraR-dependent activation of the excisionase gene rdfS. Here we show that a +1 programmed ribosomal frameshift (PRF) fuses the coding sequences of two TraR-activated genes, msi172 and msi171, producing an activator of rdfS expression named Frameshifted excision activator (FseA). Mass-spectrometry and mutational analyses indicated that the PRF occurred through +1 slippage of the tRNA(phe) from UUU to UUC within a conserved msi172-encoded motif. FseA activated rdfS expression in the absence of ICEMlSym(R7A), suggesting that it directly activated rdfS transcription, despite being unrelated to any characterized DNA-binding proteins. Bacterial two-hybrid and gene-reporter assays demonstrated that FseA was also bound and inhibited by the ICEMlSym(R7A)-encoded quorum-sensing antiactivator QseM. Thus, activation of ICEMlSym(R7A) excision is counteracted by TraR antiactivation, ribosomal frameshifting, and FseA antiactivation. This robust suppression likely dampens the inherent biological noise present in the quorum-sensing autoinduction circuit and ensures that ICEMlSym(R7A) transfer only occurs in a subpopulation of cells in which both qseM expression is repressed and FseA is translated. The architecture of the ICEMlSym(R7A) transfer regulatory system provides an example of how a set of modular components have assembled through evolution to form a robust genetic toggle that regulates gene transcription and translation at both single-cell and cell-population levels.
25
Gueguen E, Wills NM, Atkins JF, Cascales E. Transcriptional frameshifting rescues Citrobacter rodentium type VI secretion by the production of two length variants from the prematurely interrupted tssM gene. PLoS Genet 2014; 10:e1004869. [PMID: 25474156 PMCID: PMC4256274 DOI: 10.1371/journal.pgen.1004869] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 11/26/2013] [Accepted: 11/03/2014] [Indexed: 11/30/2022] Open
Abstract
The Type VI secretion system (T6SS) mediates toxin delivery into both eukaryotic and prokaryotic cells. It is composed of a cytoplasmic structure resembling the tail of contractile bacteriophages anchored to the cell envelope through a membrane complex composed of the TssL and TssM inner membrane proteins and of the TssJ outer membrane lipoprotein. The C-terminal domain of TssM is required for its interaction with TssJ, and for the function of the T6SS. In Citrobacter rodentium, the tssM1 gene does not encode the C-terminal domain. However, the stop codon is preceded by a run of 11 consecutive adenosines. In this study, we demonstrate that this poly-A tract is a transcriptional slippery site that induces the incorporation of additional adenosines, leading to frameshifting, and hence the production of two TssM1 variants, including a full-length canonical protein. We show that both forms of TssM1, and the ratio between these two forms, are required for the function of the T6SS in C. rodentium. Finally, we demonstrate that the tssM gene associated with the Yersinia pseudotuberculosis T6SS-3 gene cluster is also subjected to transcriptional frameshifting.

Nonstandard decoding mechanisms lead to the synthesis of different protein variants from a single DNA sequence. These mechanisms are particularly important when the genome length has to be limited such as viral genomes, limited by the available space in the capsid, or to synthesize two different polypeptides that have distinct functional properties. Here, we report that tssM, a gene encoded within the Citrobacter rodentium Type VI secretion (T6S) gene cluster, is interrupted by a premature stop codon; however, the stop codon is preceded by a slippery site constituted by 11 consecutive adenosines. Reiterative transcription leads to the incorporation of additional nucleotides in the mRNA and therefore restores the original framing. As a consequence, two different TssM variants are created by transcriptional frameshifting, including a full-length 130-kDa protein and an 88-kDa truncated variant. We further show that both forms, and the ratio between these two forms, are required for the function of the transport apparatus. Interestingly, a similar mechanism regulates the synthesis of two TssM variants in Yersinia pseudotuberculosis.
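The effect of reiterative transcription at a poly-A tract on the reading frame can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the sequence `TOY_GENE` is invented (it is not the tssM sequence), and the helper names `slipped_transcript` and `codons_before_stop` are hypothetical. It only shows that adding adenosines to an 11-A run moves the downstream stop codon out of frame unless the number added is a multiple of three.

```python
STOPS = {"TAA", "TAG", "TGA"}

# Invented toy coding sequence (written as DNA, coding strand):
# start codon + one codon + 11 adenosines + a downstream region in which
# the zero frame meets a stop codon immediately after the tract.
TRACT_START = 6
TRACT_LEN = 11
TOY_GENE = "ATGGCT" + "A" * TRACT_LEN + "GTGACTGGTTCCTAAGG"


def slipped_transcript(template: str, tract_start: int, tract_len: int, extra: int) -> str:
    """Return the transcript with `extra` additional A residues appended to the tract."""
    end = tract_start + tract_len
    return template[:end] + "A" * extra + template[end:]


def codons_before_stop(seq: str):
    """Count sense codons read from position 0 until the first in-frame stop.

    Returns None if the sequence ends before any stop codon is reached.
    """
    for n, i in enumerate(range(0, len(seq) - 2, 3)):
        if seq[i:i + 3] in STOPS:
            return n
    return None


if __name__ == "__main__":
    for extra in range(4):
        transcript = slipped_transcript(TOY_GENE, TRACT_START, TRACT_LEN, extra)
        print(extra, codons_before_stop(transcript))
```

In this toy case the unslipped transcript terminates right after the tract, one extra A gives a longer product that ends at a different stop, and three extra A residues keep the original frame while adding a single lysine codon.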
Affiliation(s)
- Erwan Gueguen
  - Laboratoire d'Ingénierie des Systèmes Macromoléculaires (LISM), Institut de Microbiologie de la Méditerranée, CNRS – Aix-Marseille Université, UMR 7255, Marseille, France
- Norma M. Wills
  - Department of Human Genetics, University of Utah, Salt Lake City, Utah, United States of America
- John F. Atkins
  - Department of Human Genetics, University of Utah, Salt Lake City, Utah, United States of America
  - Departments of Biochemistry and Microbiology, University College Cork, Cork, Ireland
- Eric Cascales
  - Laboratoire d'Ingénierie des Systèmes Macromoléculaires (LISM), Institut de Microbiologie de la Méditerranée, CNRS – Aix-Marseille Université, UMR 7255, Marseille, France
26
Firth AE. Mapping overlapping functional elements embedded within the protein-coding regions of RNA viruses. Nucleic Acids Res 2014; 42:12425-39. [PMID: 25326325 PMCID: PMC4227794 DOI: 10.1093/nar/gku981] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Received: 08/17/2014] [Revised: 09/20/2014] [Accepted: 10/04/2014] [Indexed: 12/29/2022] Open
Abstract
Identification of the full complement of genes and other functional elements in any virus is crucial to fully understand its molecular biology and guide the development of effective control strategies. RNA viruses have compact multifunctional genomes that frequently contain overlapping genes and non-coding functional elements embedded within protein-coding sequences. Overlapping features often escape detection because it can be difficult to disentangle the multiple roles of the constituent nucleotides via mutational analyses, while high-throughput experimental techniques are often unable to distinguish functional elements from incidental features. However, RNA viruses evolve very rapidly so that, even within a single species, substitutions rapidly accumulate at neutral or near-neutral sites providing great potential for comparative genomics to distinguish the signature of purifying selection. Computationally identified features can then be efficiently targeted for experimental analysis. Here we analyze alignments of protein-coding virus sequences to identify regions where there is a statistically significant reduction in the degree of variability at synonymous sites, a characteristic signature of overlapping functional elements. Having previously tested this technique by experimental verification of discoveries in selected viruses, we now analyze sequence alignments for ∼700 RNA virus species to identify hundreds of such regions, many of which have not been previously described.
Affiliation(s)
- Andrew E Firth
  - Division of Virology, Department of Pathology, University of Cambridge, Cambridge CB2 1QP, UK
27
Sharma V, Prère MF, Canal I, Firth AE, Atkins JF, Baranov PV, Fayet O. Analysis of tetra- and hepta-nucleotides motifs promoting -1 ribosomal frameshifting in Escherichia coli. Nucleic Acids Res 2014; 42:7210-25. [PMID: 24875478 PMCID: PMC4066793 DOI: 10.1093/nar/gku386] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 12/18/2022] Open
Abstract
Programmed ribosomal -1 frameshifting is a non-standard decoding process occurring when ribosomes encounter a signal embedded in the mRNA of certain eukaryotic and prokaryotic genes. This signal has a mandatory component, the frameshift motif: it is either a Z_ZZN tetramer or a X_XXZ_ZZN heptamer (where ZZZ and XXX are three identical nucleotides) allowing cognate or near-cognate re-pairing to the -1 frame of the A site or A and P sites tRNAs. Depending on the signal, the frameshifting frequency can vary over a wide range, from less than 1% to more than 50%. The present study combines experimental and bioinformatics approaches to carry out (i) a systematic analysis of the frameshift propensity of all possible motifs (16 Z_ZZN tetramers and 64 X_XXZ_ZZN heptamers) in Escherichia coli and (ii) the identification of genes potentially using this mode of expression amongst 36 Enterobacteriaceae genomes. While motif efficiency varies widely, a major distinctive rule of bacterial -1 frameshifting is that the most efficient motifs are those allowing cognate re-pairing of the A site tRNA from ZZN to ZZZ. The outcome of the genomic search is a set of 69 gene clusters, 59 of which constitute new candidates for functional utilization of -1 frameshifting.
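The motif counts quoted in this abstract (16 Z_ZZN tetramers and 64 X_XXZ_ZZN heptamers) follow directly from the four-letter RNA alphabet, as the minimal sketch below shows. The function names and the example mRNA are hypothetical and are not taken from the study; the scan treats the underscores in X_XXZ_ZZN as zero-frame codon boundaries and assumes the open reading frame starts at `orf_start`.

```python
from itertools import product

BASES = "ACGU"


def tetramers():
    """All Z_ZZN motifs: three identical nucleotides followed by any nucleotide."""
    return [z * 3 + n for z, n in product(BASES, repeat=2)]


def heptamers():
    """All X_XXZ_ZZN motifs: XXX, then ZZZ, then any nucleotide N."""
    return [x * 3 + z * 3 + n for x, z, n in product(BASES, repeat=3)]


def in_frame_heptamers(mrna: str, orf_start: int = 0):
    """Find X_XXZ_ZZN heptamers whose XXZ and ZZN triplets are zero-frame codons.

    Returns a list of (start_index, heptamer) pairs.
    """
    hits = []
    for i in range(len(mrna) - 6):
        window = mrna[i:i + 7]
        same_x = window[0] == window[1] == window[2]
        same_z = window[3] == window[4] == window[5]
        # The second nucleotide of the heptamer must begin a zero-frame codon.
        if same_x and same_z and (i + 1 - orf_start) % 3 == 0:
            hits.append((i, window))
    return hits


if __name__ == "__main__":
    print(len(tetramers()), len(heptamers()))
    # Invented example: AUG CCA AAA AAG UAA contains A_AAA_AAG in frame.
    print(in_frame_heptamers("AUGCCAAAAAAGUAA"))
```

The sketch only enumerates and locates candidate motifs; it says nothing about their frameshifting efficiency, which the study measured experimentally.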
Affiliation(s)
- Virag Sharma
  - School of Biochemistry and Cell biology, University College Cork, Cork, Ireland
- Marie-Françoise Prère
  - Laboratoire de Microbiologie et Génétique moléculaire, UMR5100, Centre National de la Recherche Scientifique, Université Paul Sabatier-Toulouse III, 118 route de Narbonne, Toulouse 31062-cedex, France
- Isabelle Canal
  - Laboratoire de Microbiologie et Génétique moléculaire, UMR5100, Centre National de la Recherche Scientifique, Université Paul Sabatier-Toulouse III, 118 route de Narbonne, Toulouse 31062-cedex, France
- Andrew E Firth
  - Department of Pathology, University of Cambridge, Cambridge CB2 1QP, UK
- John F Atkins
  - School of Biochemistry and Cell biology, University College Cork, Cork, Ireland
  - Department of Human Genetics, University of Utah, 15N 2030E, Rm7410, Salt Lake City, UT 84112-5330, USA
- Pavel V Baranov
  - School of Biochemistry and Cell biology, University College Cork, Cork, Ireland
- Olivier Fayet
  - Laboratoire de Microbiologie et Génétique moléculaire, UMR5100, Centre National de la Recherche Scientifique, Université Paul Sabatier-Toulouse III, 118 route de Narbonne, Toulouse 31062-cedex, France
28
Bypass of the pre-60S ribosomal quality control as a pathway to oncogenesis. Proc Natl Acad Sci U S A 2014; 111:5640-5. [PMID: 24706786 DOI: 10.1073/pnas.1400247111] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.9] [Indexed: 11/18/2022] Open
Abstract
Ribosomopathies are a class of diseases caused by mutations that affect the biosynthesis and/or functionality of the ribosome. Although they initially present as hypoproliferative disorders, such as anemia, patients have elevated risk of hyperproliferative disease (cancer) by midlife. Here, this paradox is explored using the rpL10-R98S (uL16-R98S) mutant yeast model of the most commonly identified ribosomal mutation in acute lymphoblastic T-cell leukemia. This mutation causes a late-stage 60S subunit maturation failure that targets mutant ribosomes for degradation. The resulting deficit in ribosomes causes the hypoproliferative phenotype. This 60S subunit shortage, in turn, exerts pressure on cells to select for suppressors of the ribosome biogenesis defect, allowing them to reestablish normal levels of ribosome production and cell proliferation. However, suppression at this step releases structurally and functionally defective ribosomes into the translationally active pool, and the translational fidelity defects of these mutants culminate in destabilization of selected mRNAs and shortened telomeres. We suggest that in exchange for resolving their short-term ribosome deficits through compensatory trans-acting suppressors, cells are penalized in the long term by changes in gene expression that ultimately undermine cellular homeostasis.
29
Siguier P, Gourbeyre E, Chandler M. Bacterial insertion sequences: their genomic impact and diversity. FEMS Microbiol Rev 2014; 38:865-91. [PMID: 24499397 PMCID: PMC7190074 DOI: 10.1111/1574-6976.12067] [Citation(s) in RCA: 428] [Impact Index Per Article: 38.9] [Received: 10/30/2013] [Revised: 01/19/2014] [Accepted: 01/22/2014] [Indexed: 01/06/2023] Open
Abstract
Insertion sequences (ISs), arguably the smallest and most numerous autonomous transposable elements (TEs), are important players in shaping their host genomes. This review focuses on prokaryotic ISs. We discuss IS distribution and impact on genome evolution. We also examine their effects on gene expression, especially their role in activating neighbouring genes, a phenomenon of particular importance in the recent upsurge of bacterial antibiotic resistance. We explain how ISs are identified and classified into families by a combination of characteristics including their transposases (Tpases), their overall genetic organisation and the accessory genes which some ISs carry. We then describe the organisation of autonomous and nonautonomous IS‐related elements. This is used to illustrate the growing recognition that the boundaries between different types of mobile element are becoming increasingly difficult to define as more are being identified. We review the known Tpase types, their different catalytic activities used in cleaving and rejoining DNA strands during transposition, their organisation into functional domains and the role of this in regulation. Finally, we consider examples of prokaryotic IS domestication. In a more speculative section, we discuss the necessity of constructing more quantitative dynamic models to fully appreciate the continuing impact of TEs on prokaryotic populations.
Affiliation(s)
- Patricia Siguier
  - Laboratoire de Microbiologie et Génétique Moléculaires, Unité Mixte de Recherche 5100, Centre National de Recherche Scientifique, Toulouse Cedex, France
30
Niu S, Cao S, Wong SM. An infectious RNA with a hepta-adenosine stretch responsible for programmed -1 ribosomal frameshift derived from a full-length cDNA clone of Hibiscus latent Singapore virus. Virology 2014; 449:229-34. [PMID: 24418557 PMCID: PMC7127180 DOI: 10.1016/j.virol.2013.11.021] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 10/12/2013] [Revised: 11/09/2013] [Accepted: 11/12/2013] [Indexed: 11/27/2022]
Abstract
Hibiscus latent Singapore virus (HLSV) is a member of Tobamovirus and its full-length cDNA clones were constructed. The in vitro transcripts from two HLSV full-length cDNA clones, which contain a hepta-adenosine stretch (pHLSV-7A) and an octo-adenosine stretch (pHLSV-8A), are both infectious. The replication level of HLSV-7A in Nicotiana benthamiana protoplasts was 5-fold lower, as compared to that of HLSV-8A. The replicase proteins of HLSV-7A were produced through programmed -1 ribosomal frameshift (-1 PRF) and the 7A stretch was a slippery sequence for -1 PRF. Mutations to the downstream pseudoknot of 7A stretch showed that the pseudoknot was not required for the frameshift in vitro. The stretch was found to be extended to 8A after subsequent replication cycles in vivo. It is envisaged that HLSV employs the monotonous runs of A and -1 PRF to convert its 7A to 8A to reach higher replication for its survival in plants.
Affiliation(s)
- Shengniao Niu
  - Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, Lower Kent Ridge Road, Singapore 117543, Singapore
- Shishu Cao
  - Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, Lower Kent Ridge Road, Singapore 117543, Singapore
- Sek-Man Wong
  - Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, Lower Kent Ridge Road, Singapore 117543, Singapore
  - Temasek Life Sciences Laboratory, 1 Research Link, Singapore 117604, Singapore
  - National University of Singapore Suzhou Research Institute, Suzhou, Jiangsu 215123, China
31
Peeters N, Carrère S, Anisimova M, Plener L, Cazalé AC, Genin S. Repertoire, unified nomenclature and evolution of the Type III effector gene set in the Ralstonia solanacearum species complex. BMC Genomics 2013; 14:859. [PMID: 24314259 PMCID: PMC3878972 DOI: 10.1186/1471-2164-14-859] [Citation(s) in RCA: 156] [Impact Index Per Article: 13.0] [Received: 07/24/2013] [Accepted: 11/29/2013] [Indexed: 12/21/2022] Open
Abstract
Background: Ralstonia solanacearum is a soil-borne beta-proteobacterium that causes bacterial wilt disease in many food crops and is a major problem for agriculture in intertropical regions. R. solanacearum is a heterogeneous species, both phenotypically and genetically, and is considered as a species complex. Pathogenicity of R. solanacearum relies on the Type III secretion system that injects Type III effector (T3E) proteins into plant cells. T3E collectively perturb host cell processes and modulate plant immunity to enable bacterial infection.

Results: We provide the catalogue of T3E in the R. solanacearum species complex, as well as candidates in newly sequenced strains. 94 T3E orthologous groups were defined on phylogenetic bases and ordered using a uniform nomenclature. This curated T3E catalog is available on a public website and a bioinformatic pipeline has been designed to rapidly predict T3E genes in newly sequenced strains. Systematical analyses were performed to detect lateral T3E gene transfer events and identify T3E genes under positive selection. Our analyses also pinpoint the RipF translocon proteins as major discriminating determinants among the phylogenetic lineages.

Conclusions: Establishment of T3E repertoires in strains representatives of the R. solanacearum biodiversity allowed determining a set of 22 T3E present in all the strains but provided no clues on host specificity determinants. The definition of a standardized nomenclature and the optimization of predictive tools will pave the way to understanding how variation of these repertoires is correlated to the diversification of this species complex and how they contribute to the different strain pathotypes.
Affiliation(s)
- Nemo Peeters
  - INRA, Laboratoire des Interactions Plantes-Microorganismes (LIPM), UMR441, F-31326 Castanet-Tolosan, France
32
Liang J, Blumenthal RM. Naturally-occurring, dually-functional fusions between restriction endonucleases and regulatory proteins. BMC Evol Biol 2013; 13:218. [PMID: 24083337 PMCID: PMC3850674 DOI: 10.1186/1471-2148-13-218] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 06/19/2013] [Accepted: 10/01/2013] [Indexed: 01/03/2023] Open
Abstract
Background: Restriction-modification (RM) systems appear to play key roles in modulating gene flow among bacteria and archaea. Because the restriction endonuclease (REase) is potentially lethal to unmethylated new host cells, regulation to ensure pre-expression of the protective DNA methyltransferase (MTase) is essential to the spread of RM genes. This is particularly true for Type IIP RM systems, in which the REase and MTase are separate, independently-active proteins. A substantial subset of Type IIP RM systems are controlled by an activator-repressor called C protein. In these systems, C controls the promoter for its own gene, and for the downstream REase gene that lacks its own promoter. Thus MTase is expressed immediately after the RM genes enter a new cell, while expression of REase is delayed until sufficient C protein accumulates. To study the variation in and evolution of this regulatory mechanism, we searched for RM systems closely related to the well-studied C protein-dependent PvuII RM system. Unexpectedly, among those found were several in which the C protein and REase genes were fused.

Results: The gene for CR.NsoJS138I fusion protein (nsoJS138ICR, from the bacterium Niabella soli) was cloned, and the fusion protein produced and partially purified. Western blots provided no evidence that, under the conditions tested, anything other than full-length fusion protein is produced. This protein had REase activity in vitro and, as expected from the sequence similarity, its specificity was indistinguishable from that for PvuII REase, though the optimal reaction conditions were different. Furthermore, the fusion was active as a C protein, as revealed by in vivo activation of a lacZ reporter fusion to the promoter region for the nsoJS138ICR gene.

Conclusions: Fusions between C proteins and REases have not previously been characterized, though other fusions have (such as between REases and MTases). These results reinforce the evidence for impressive modularity among RM system proteins, and raise important questions about the implications of the C-REase fusions on expression kinetics of these RM systems.
Affiliation(s)
- Jixiao Liang
  - Department of Medical Microbiology & Immunology, College of Medicine and Life Sciences, University of Toledo, 3100 Transverse Drive, Toledo, OH 43614, USA
33
Viral proteins originated de novo by overprinting can be identified by codon usage: application to the "gene nursery" of Deltaretroviruses. PLoS Comput Biol 2013; 9:e1003162. [PMID: 23966842 PMCID: PMC3744397 DOI: 10.1371/journal.pcbi.1003162] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Received: 12/26/2012] [Accepted: 06/13/2013] [Indexed: 12/24/2022] Open
Abstract
A well-known mechanism through which new protein-coding genes originate is by modification of pre-existing genes, e.g. by duplication or horizontal transfer. In contrast, many viruses generate protein-coding genes de novo, via the overprinting of a new reading frame onto an existing (“ancestral”) frame. This mechanism is thought to play an important role in viral pathogenicity, but has been poorly explored, perhaps because identifying the de novo frames is very challenging. Therefore, a new approach to detect them was needed. We assembled a reference set of overlapping genes for which we could reliably determine the ancestral frames, and found that their codon usage was significantly closer to that of the rest of the viral genome than the codon usage of de novo frames. Based on this observation, we designed a method that allowed the identification of de novo frames based on their codon usage with a very good specificity, but intermediate sensitivity. Using our method, we predicted that the Rex gene of deltaretroviruses has originated de novo by overprinting the Tax gene. Intriguingly, several genes in the same genomic region have also originated de novo and encode proteins that regulate the functions of Tax. Such “gene nurseries” may be common in viral genomes. Finally, our results confirm that the genomic GC content is not the only determinant of codon usage in viruses and suggest that a constraint linked to translation must influence codon usage.

How does novelty originate in nature? It is commonly thought that new genes are generated mainly by modifications of existing genes (the “tinkering” model). In contrast, we have shown recently that in viruses, numerous genes are generated entirely de novo (“from scratch”). The role of these genes remains underexplored, however, because they are difficult to identify. We have therefore developed a new method to detect genes originated de novo in viral genomes, based on the observation that each viral genome has a unique “signature”, which genes originated de novo do not share. We applied this method to analyze the genes of Human T-Lymphotropic Virus 1 (HTLV1), a relative of the HIV virus and also a major human pathogen that infects about twenty million people worldwide. The life cycle of HTLV1 is finely regulated – it can stay dormant for long periods and can provoke blood cancers (leukemias) after a very long incubation. We discovered that several of the genes of HTLV1 have originated de novo. These novel genes play a key role in regulating the life cycle of HTLV1, and presumably its pathogenicity. Our investigations suggest that such “gene nurseries” may be common in viruses.
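The idea of comparing the codon usage of a reading frame with that of the rest of the genome can be sketched as follows. This is not the authors' method, whose statistic the abstract does not specify: the Euclidean distance, the function names and the toy sequences are all assumptions chosen only to make the comparison concrete.

```python
import math
from collections import Counter


def codon_freqs(seq: str) -> dict:
    """Relative frequency of each codon when `seq` is read from position 0."""
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def usage_distance(freqs_a: dict, freqs_b: dict) -> float:
    """Euclidean distance between two codon-frequency profiles (assumed metric)."""
    codons = set(freqs_a) | set(freqs_b)
    return math.sqrt(sum((freqs_a.get(c, 0.0) - freqs_b.get(c, 0.0)) ** 2 for c in codons))


if __name__ == "__main__":
    # Invented toy sequences, far too short for any real inference.
    genome_like = codon_freqs("GCTGCTGCTGCA")
    frame_a = codon_freqs("GCTGCTGCA")   # usage resembling the toy genome
    frame_b = codon_freqs("CTGCTGCTG")   # usage unlike the toy genome
    print(usage_distance(frame_a, genome_like))
    print(usage_distance(frame_b, genome_like))
```

Under the observation reported in the abstract, the frame whose profile lies closer to the genome-wide profile would be the candidate ancestral frame, and the more distant one the candidate de novo frame; real genomes would need many more codons and a proper statistical test.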
34
Antonov I, Coakley A, Atkins JF, Baranov PV, Borodovsky M. Identification of the nature of reading frame transitions observed in prokaryotic genomes. Nucleic Acids Res 2013; 41:6514-30. [PMID: 23649834 PMCID: PMC3711429 DOI: 10.1093/nar/gkt274] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 11/12/2012] [Revised: 02/22/2013] [Accepted: 03/22/2013] [Indexed: 12/11/2022] Open
Abstract
Our goal was to identify evolutionary conserved frame transitions in protein coding regions and to uncover an underlying functional role of these structural aberrations. We used the ab initio frameshift prediction program, GeneTack, to detect reading frame transitions in 206 991 genes (fs-genes) from 1106 complete prokaryotic genomes. We grouped 102 731 fs-genes into 19 430 clusters based on sequence similarity between protein products (fs-proteins) as well as conservation of predicted position of the frameshift and its direction. We identified 4010 pseudogene clusters and 146 clusters of fs-genes apparently using recoding (local deviation from using standard genetic code) due to possessing specific sequence motifs near frameshift positions. Particularly interesting was finding of a novel type of organization of the dnaX gene, where recoding is required for synthesis of the longer subunit, τ. We selected 20 clusters of predicted recoding candidates and designed a series of genetic constructs with a reporter gene or affinity tag whose expression would require a frameshift event. Expression of the constructs in Escherichia coli demonstrated enrichment of the set of candidates with sequences that trigger genuine programmed ribosomal frameshifting; we have experimentally confirmed four new families of programmed frameshifts.
Collapse
Affiliation(s)
- Ivan Antonov, Arthur Coakley, John F. Atkins, Pavel V. Baranov, Mark Borodovsky
- School of Computational Science and Engineering at Georgia Tech, Atlanta, GA 30332, USA; Department of Biochemistry, University College Cork, Ireland; Department of Biological and Medical Physics, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141700, Russia; Center for Bioinformatics and Computational Genomics at Georgia Tech and Joint Georgia Tech and Emory Wallace H Coulter Department of Biomedical Engineering, Atlanta, GA 30332, USA
|
35
|
Allelic variation in a simple sequence repeat element of neisserial pglB2 and its consequences for protein expression and protein glycosylation. J Bacteriol 2013; 195:3476-85. [PMID: 23729645 DOI: 10.1128/jb.00276-13] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 01/13/2023]
Abstract
Neisseria species express an O-linked glycosylation system in which functionally distinct proteins are elaborated with variable glycans. A major source of glycan diversity in N. meningitidis results from two distinct pglB alleles responsible for the synthesis of either N,N'-diacetylbacillosamine or glyceramido-acetamido trideoxyhexose that occupy the reducing end of the oligosaccharides. Alternative modifications at C-4 of the precursor UDP-4-amino are attributable to distinct C-terminal domains that dictate either acetyltransferase or glyceramidotransferase activity, encoded by pglB and pglB2, respectively. Naturally occurring alleles of pglB2 have homopolymeric tracts of either 7 or 8 adenosines (As) bridging the C-terminal open reading frame (ORF) and the ORF encompassing the conserved N-terminal domain associated with phosphoglycosyltransferase activity. In the work presented here, we explored the consequences of such pglB2 allele variation and found that, although both alleles are functional vis-à-vis glycosylation, the 7A form results in the expression of a single, multidomain protein, while the 8A variant elicits two single-domain proteins. We also found that the glyceramidotransferase activity-encoding domain is essential to protein glycosylation, showing the critical role of the C-4 modification of the precursor UDP-4-amino in the pathway. These findings were further extended and confirmed by examining the phenotypic consequences of extended poly(A) tract length variation. Although ORFs related to those of pglB2 are broadly distributed in eubacteria, they are primarily found as two distinct, juxtaposed ORFs. Thus, the neisserial pglB2 system provides novel insights into the potential influence of hypermutability on modular evolution of proteins by providing a unique snapshot of the progression of ongoing gene fusion.
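The frame logic behind the 7A and 8A alleles can be illustrated with a minimal Python sketch. The sequences below are hypothetical toy constructs, not the real pglB2 locus, and the helper names are assumptions made for illustration. The sketch only shows that adding one adenosine to a homopolymeric tract shifts the downstream reading frame, so an ORF that reads through a 7A tract is interrupted by a stop codon shortly after an 8A tract.

```python
# Toy illustration only: UPSTREAM and DOWNSTREAM are invented sequences,
# not taken from pglB2.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built in the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate from position 0 until the first stop codon or the sequence end."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


UPSTREAM = "ATGGCTGGT"         # hypothetical N-terminal ORF fragment
DOWNSTREAM = "CTGACCTGGAATAA"  # hypothetical C-terminal ORF fragment


def build_allele(tract_length):
    """Join the two fragments with a poly(A) tract of the given length."""
    return UPSTREAM + "A" * tract_length + DOWNSTREAM


print(translate(build_allele(7)))  # MAGKKTDLE (reads through into the downstream fragment)
print(translate(build_allele(8)))  # MAGKKN (frame shifted, early stop codon)
```

The sketch models only the interruption of the reading frame; it does not model how the separate C-terminal protein of the 8A variant comes to be expressed.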
|
36
|
Guo FB, Xiong L, Teng JLL, Yuen KY, Lau SKP, Woo PCY. Re-annotation of protein-coding genes in 10 complete genomes of Neisseriaceae family by combining similarity-based and composition-based methods. DNA Res 2013; 20:273-86. [PMID: 23571676 PMCID: PMC3686433 DOI: 10.1093/dnares/dst009] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 12/31/2022]
Abstract
In this paper, we performed a comprehensive re-annotation of protein-coding genes by a systematic method combining composition- and similarity-based approaches in 10 complete bacterial genomes of the family Neisseriaceae. First, 418 hypothetical genes were predicted as non-coding using the composition-based method and 413 were eliminated from the gene list. Both the scatter plot and cluster of orthologous groups (COG) fraction analyses supported the result. Second, from 20 to 400 hypothetical proteins were assigned functions in each of the 10 strains based on the homology search. Among the newly assigned functions, 397 are detailed enough to have definite gene names. Third, 106 genes missed by the original annotations were picked up by an ab initio gene finder combined with similarity alignment. Transcriptional experiments validated the effectiveness of this method in Laribacter hongkongensis and Chromobacterium violaceum. Among the 106 newly found genes, some deserve particular interest. For example, 27 transposases were newly found in Neisseria meningitidis alpha14. In Neisseria gonorrhoeae NCCP11945, four new genes with putative functions and definite names (nusG, rpsN, rpmD and infA) were found, and their homologues are usually essential for survival in bacteria. The updated annotations for the 10 Neisseriaceae genomes provide a more accurate prediction of protein-coding genes and more detailed functional information on hypothetical proteins. They will benefit research into the lifestyle, metabolism, environmental adaptation and pathogenicity of the Neisseriaceae species. The re-annotation procedure could be used directly, or after adaptation of the detailed methods, for checking annotations of any other bacterial or archaeal genomes.
Affiliation(s)
- Feng-Biao Guo
- Department of Microbiology, The University of Hong Kong, Special Administrative Region, Hong Kong, People's Republic of China
|
37
|
Antonov I, Baranov P, Borodovsky M. GeneTack database: genes with frameshifts in prokaryotic genomes and eukaryotic mRNA sequences. Nucleic Acids Res 2012; 41:D152-6. [PMID: 23161689 PMCID: PMC3531167 DOI: 10.1093/nar/gks1062] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022]
Abstract
Database annotations of prokaryotic genomes and eukaryotic mRNA sequences pay relatively little attention to frame transitions that disrupt protein-coding genes. Frame transitions (frameshifts) could be caused by sequencing errors or indel mutations inside protein-coding regions. Other observed frameshifts are related to recoding events (that evolved to control expression of some genes). Earlier, we developed an algorithm and software program, GeneTack, for ab initio frameshift finding in intronless genes. Here, we describe a database (freely available at http://topaz.gatech.edu/GeneTack/db.html) containing genes with frameshifts (fs-genes) predicted by GeneTack. The database includes 206 991 fs-genes from 1106 complete prokaryotic genomes and 45 295 frameshifts predicted in mRNA sequences from 100 eukaryotic genomes. The whole set of fs-genes was grouped into clusters based on sequence similarity between fs-proteins (conceptually translated fs-genes), conservation of the frameshift position and frameshift direction (−1, +1). The fs-genes can be retrieved by similarity search to a given query sequence via a web interface, by fs-gene cluster browsing, etc. Clusters of fs-genes are characterized with respect to their likely origin, such as pseudogenization, phase variation, etc. The largest clusters contain fs-genes with programmed frameshifts (related to recoding events).
Affiliation(s)
- Ivan Antonov
- School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
|
38
|
Hsu MK, Chen FC. Selective constraint on the upstream open reading frames that overlap with coding sequences in animals. PLoS One 2012; 7:e48413. [PMID: 23133632 PMCID: PMC3486843 DOI: 10.1371/journal.pone.0048413] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 05/29/2012] [Accepted: 09/24/2012] [Indexed: 11/18/2022]
Abstract
Upstream open reading frames (uORFs) are translational regulatory elements located in 5′ untranslated regions. They can significantly repress the translation of the downstream coding sequences (CDS), and participate in the spatio-temporal regulation of protein translation. Notwithstanding this biological significance, the selective constraint on uORFs remains underexplored. In particular, the uORFs that partially overlap with CDS in a different reading frame (overlapping uORFs, or “VuORFs”) may lead to strong translational inhibition or N-terminal truncation of the peptides encoded by the affected CDS. By analyzing VuORF-containing transcripts (designated as “VuORF transcripts”) in human, mouse, and fruit fly, we demonstrate that VuORFs are in general slightly deleterious: the proportion of genes that encode at least one VuORF transcript is significantly smaller than expected in all three examined species. In addition, this proportion is significantly smaller in fruit fly than in mammals, indicating a higher efficiency of removing VuORFs in the former species because of its larger effective population size. Furthermore, the deleterious effect of a VuORF depends on the sequence context of its start codon (VuAUG). VuORFs with an optimal VuAUG context are more strongly disfavored than those with a suboptimal context in all three examined species. Moreover, the propensity to remove optimal-context VuAUGs is stronger in fruit fly than in mammals. Intriguingly, however, the currently observable optimal-context VuAUGs (but not suboptimal-context VuAUGs) are more conserved than expected. These observations suggest that the regulatory functions of VuORFs may have been gained fortuitously in organisms with a small effective population size because the slightly deleterious effect of these elements can be better tolerated in these organisms, thus allowing opportunities for the development of novel biological functions. Nevertheless, once the functions of VuORFs were established, they became subject to negative selection.
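The definition of a VuORF given in the abstract (a uORF that starts in the 5′ untranslated region and overlaps the CDS in a different reading frame) can be sketched as a simple scan. This is a minimal Python illustration under stated assumptions, not the authors' pipeline: the function name, the 0-based coordinates and the toy transcript are all hypothetical, and start-codon context is ignored.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_overlapping_uorfs(mrna, cds_start):
    """Return (start, end) pairs for uORFs that begin upstream of the CDS,
    are out of frame with it, and terminate after the CDS start.

    Coordinates are 0-based; end is exclusive and includes the stop codon.
    """
    hits = []
    for start in range(cds_start):
        if mrna[start:start + 3] != "ATG":
            continue
        if (cds_start - start) % 3 == 0:
            continue  # in frame with the CDS, so not a different-frame overlap
        # Walk codon by codon from the upstream ATG until the first stop.
        for pos in range(start + 3, len(mrna) - 2, 3):
            if mrna[pos:pos + 3] in STOP_CODONS:
                if pos + 3 > cds_start:
                    hits.append((start, pos + 3))
                break
    return hits


# Hypothetical toy transcript: an 8-nt 5' UTR followed by a short CDS.
toy_mrna = "GCCATGGC" + "ATGCTGAAAGGCTAA"
print(find_overlapping_uorfs(toy_mrna, cds_start=8))  # [(3, 15)]
```

In the toy transcript the upstream ATG at position 3 is two nucleotides out of frame with the CDS, and its reading frame runs into a TGA located inside the CDS, so it is reported as an overlapping uORF.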
Affiliation(s)
- Ming-Kung Hsu
- Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Miaoli County, Taiwan
- Feng-Chi Chen
- Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Miaoli County, Taiwan
- Department of Life Sciences, National Chiao-Tung University, Hsinchu, Taiwan
- Department of Dentistry, China Medical University, Taichung, Taiwan
|
39
|
Kiran AM, O'Mahony JJ, Sanjeev K, Baranov PV. Darned in 2013: inclusion of model organisms and linking with Wikipedia. Nucleic Acids Res 2012; 41:D258-61. [PMID: 23074185 PMCID: PMC3531090 DOI: 10.1093/nar/gks961] [Citation(s) in RCA: 85] [Impact Index Per Article: 6.5] [Indexed: 11/29/2022]
Abstract
DARNED (DAtabase of RNa EDiting, available at http://darned.ucc.ie) is a centralized repository of reference genome coordinates corresponding to RNA nucleotides having altered templated identities in the process of RNA editing. The data in DARNED are derived from published datasets of RNA editing events. RNA editing instances have been identified with various methods, such as bioinformatics screenings, deep sequencing and/or biochemical techniques. Here we report our current progress in the development and expansion of DARNED. In addition to novel database features, the DARNED update describes the inclusion of Drosophila melanogaster and Mus musculus RNA editing events and the launch of a community-based annotation in the RNA WikiProject.
Affiliation(s)
- Anmol M Kiran
- Biochemistry Department, University College Cork, Cork, Ireland
|
40
|
Translational recoding in archaea. Extremophiles 2012; 16:793-803. [DOI: 10.1007/s00792-012-0482-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 06/21/2012] [Accepted: 09/09/2012] [Indexed: 12/31/2022]
|
41
|
Hypomorphic glycosyltransferase alleles and recoding at contingency loci influence glycan microheterogeneity in the protein glycosylation system of Neisseria species. J Bacteriol 2012; 194:5034-43. [PMID: 22797763 DOI: 10.1128/jb.00950-12] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 12/19/2022]
Abstract
As more bacterial protein glycosylation systems are identified and characterized, a central question that arises is, what governs the prevalence of particular glycans associated with them? In addition, accumulating evidence shows that bacterial protein glycans can be subject to the phenomenon of microheterogeneity, in which variant glycan structures are found at specific attachment sites of a given glycoprotein. Although factors underlying microheterogeneity in reconstituted expression systems have been identified and modeled, those impacting natural systems largely remain enigmatic. On the basis of a sensitive and specific glycan serotyping system, microheterogeneity has been reported for the broad-spectrum, O-linked protein glycosylation system in species within the genus Neisseria. To elucidate the mechanisms involved, a genetic approach was used to identify a hypomorphic allele of pglA (encoding the PglA galactosyltransferase) as a significant contributor to simultaneous expression of multiple glycoforms. Moreover, this phenotype was mapped to a single amino acid polymorphism in PglA. Further analyses revealed that many pglA phase-off variants (containing out-of-frame configurations in simple nucleotide repeats within the open reading frame) were associated with disproportionally high levels of the N,N'-diacetylbacillosamine-Gal disaccharide glycoform generated by PglA. This phenotype is emblematic of nonstandard decoding involving programmed ribosomal frameshifting and/or programmed transcriptional realignment. Together, these findings provide new information regarding the mechanisms of neisserial protein glycan microheterogeneity and the anticipatory nature of contingency loci.
|
42
|
Sharma V, Murphy DP, Provan G, Baranov PV. CodonLogo: a sequence logo-based viewer for codon patterns. Bioinformatics 2012; 28:1935-6. [PMID: 22595210 PMCID: PMC3389775 DOI: 10.1093/bioinformatics/bts295] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/24/2022]
Abstract
Motivation: Conserved patterns across a multiple sequence alignment can be visualized by generating sequence logos. Sequence logos show each column in the alignment as a stack of symbols, where the height of a stack is proportional to its information content and the height of each symbol within the stack is proportional to its frequency in the column. Sequence logos use symbols of either nucleotide or amino acid alphabets. However, certain regulatory signals in messenger RNA (mRNA) act as combinations of codons, yet no tool is available for visualization of conserved codon patterns. Results: We present the first application that allows visualization of conserved regions in a multiple sequence alignment in the context of codons. CodonLogo is based on WebLogo3 and uses the same heuristics but treats codons as inseparable units of a 64-letter alphabet. CodonLogo can discriminate patterns of codon conservation from patterns of nucleotide conservation that appear indistinguishable in standard sequence logos. Availability: The CodonLogo source code and its implementation (in a local version of the Galaxy Browser) are available at http://recode.ucc.ie/CodonLogo and through the Galaxy Tool Shed at http://toolshed.g2.bx.psu.edu/. Contact: p.baranov@ucc.ie or brave.oval.pan@gmail.com
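The stack height described above can be sketched in a few lines of Python. This is a minimal illustration, not CodonLogo's or WebLogo's code: the function names and the toy alignment are assumptions, gaps are not handled, and any small-sample correction that logo tools may apply is omitted. The information content of a column is taken as log2 of the alphabet size minus the Shannon entropy of the column, with an alphabet size of 4 for nucleotides and 64 for codons.

```python
import math
from collections import Counter


def split_units(seq, unit_len):
    """Cut a sequence into consecutive units (1 nt for nucleotides, 3 nt for codons)."""
    return [seq[i:i + unit_len] for i in range(0, len(seq) - unit_len + 1, unit_len)]


def column_information(symbols, alphabet_size):
    """Information content in bits: log2(alphabet size) minus Shannon entropy."""
    total = len(symbols)
    entropy = -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(symbols).values()
    )
    return math.log2(alphabet_size) - entropy


def logo_heights(aligned_seqs, unit_len, alphabet_size):
    """Stack height for every alignment column, treating each unit as one symbol."""
    rows = [split_units(seq, unit_len) for seq in aligned_seqs]
    return [column_information(column, alphabet_size) for column in zip(*rows)]


# Hypothetical gap-free toy alignment of two codons per sequence.
alignment = ["ATGAAA", "ATGAAG", "ATGAAA", "ATGAAG"]
codon_bits = logo_heights(alignment, unit_len=3, alphabet_size=64)
nucleotide_bits = logo_heights(alignment, unit_len=1, alphabet_size=4)
print(codon_bits)       # [6.0, 5.0]
print(nucleotide_bits)  # [2.0, 2.0, 2.0, 2.0, 2.0, 1.0]
```

Within a stack, each symbol's height would then be its column frequency multiplied by the stack height.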
Affiliation(s)
- Virag Sharma
- Department of Biochemistry, University College Cork, Cork, Ireland
|