1
Nunes-Alves AK, Abrahão JS, de Farias ST. Yaravirus brasiliense genomic structure analysis and its possible influence on the metabolism. Genet Mol Biol 2025; 48:e20240139. [PMID: 39918235 PMCID: PMC11803573 DOI: 10.1590/1678-4685-gmb-2024-0139] [Received: 07/08/2024] [Accepted: 12/11/2024] [Indexed: 02/11/2025]
Abstract
Here we analyze Yaravirus brasiliense, an amoeba-infecting, 80-nm virus with a 45-kbp dsDNA genome, using structural molecular modeling. Almost all of its 74 genes were previously identified as ORFans. Considering its unprecedented genetic content, we analyzed the Yaravirus genome to understand its genetic organization, its proteome, and how it interacts with its host. We report possible functions for all Yaravirus proteins. Our results suggest the first report of a fragment proteome, in which the proteins are separated into modules and joined together at the protein level. Given the structural resemblance between some Yaravirus proteins and proteins related to the tricarboxylic acid (TCA) cycle, the glyoxylate cycle, and the respiratory complexes, our work also allows us to hypothesize that these viral proteins could be modulating cell metabolism by upregulation. The presence of these TCA cycle-related enzymes specifically could serve to overcome the cycle's control points, since they are strategic proteins that maintain malate and oxaloacetate levels. Therefore, we propose that Yaravirus proteins redirect energy and resources towards viral production and avoid TCA cycle control points, "unlocking" the cycle. Altogether, our data help to understand a previously almost completely unknown virus, and a little more of the incredible diversity of viruses.
Affiliation(s)
- Ana Karoline Nunes-Alves
  - Universidade Federal da Paraíba, Departamento de Biologia Molecular, Laboratório de Genética Evolutiva Paulo Leminski, João Pessoa, PB, Brazil
- Jônatas Santos Abrahão
  - Universidade Federal de Minas Gerais, Instituto de Ciências Biológicas, Departamento de Microbiologia, Laboratório de Vírus, Belo Horizonte, MG, Brazil
- Sávio Torres de Farias
  - Universidade Federal da Paraíba, Departamento de Biologia Molecular, Laboratório de Genética Evolutiva Paulo Leminski, João Pessoa, PB, Brazil
  - Network of Researchers on the Chemical Evolution of Life (NoRCEL), Leeds, United Kingdom
2
Liu X, Xia X, Martynowycz MW, Gonen T, Zhou ZH. Molecular sociology of virus-induced cellular condensates supporting reovirus assembly and replication. Nat Commun 2024; 15:10638. [PMID: 39639006 PMCID: PMC11621325 DOI: 10.1038/s41467-024-54968-7] [Received: 07/25/2024] [Accepted: 11/26/2024] [Indexed: 12/07/2024]
Abstract
Virus-induced cellular condensates, or viral factories, are poorly understood high-density phases where replication of many viruses occurs. Here, by cryogenic electron tomography (cryoET) of focused ion beam (FIB) milling-produced lamellae of mammalian reovirus (MRV)-infected cells, we visualized the molecular organization and interplay (i.e., "molecular sociology") of host and virus in 3D at two time points post-infection, enabling a detailed description of these condensates and a mechanistic understanding of MRV replication within them. Expanding over time, the condensate fashions host ribosomes at its periphery, and host microtubules, lipid membranes, and viral molecules in its interior, forming a 3D architecture that supports the dynamic processes of viral genome replication and capsid assembly. A total of six MRV assembly intermediates are identified inside the condensate: star core, empty and genome-containing cores, empty and full virions, and outer shell particle. Except for star core, these intermediates are visualized at atomic resolution by cryogenic electron microscopy (cryoEM) of cellular extracts. The temporal sequence and spatial rearrangement among these viral intermediates choreograph the viral life cycle within the condensates. Together, the molecular sociology of MRV-induced cellular condensate highlights the functional advantage of transient enrichment of molecules at the right location and time for viral replication.
Affiliation(s)
- Xiaoyu Liu
  - Department of Microbiology, Immunology and Molecular Genetics, University of California, Los Angeles, CA, USA
  - California NanoSystems Institute, University of California, Los Angeles, CA, USA
- Xian Xia
  - Department of Microbiology, Immunology and Molecular Genetics, University of California, Los Angeles, CA, USA
  - California NanoSystems Institute, University of California, Los Angeles, CA, USA
- Michael W Martynowycz
  - Howard Hughes Medical Institute, University of California, Los Angeles, CA, USA
  - Department of Biological Chemistry, University of California, Los Angeles, CA, USA
  - Hauptman-Woodward Medical Research Institute, Buffalo, NY, USA
- Tamir Gonen
  - Howard Hughes Medical Institute, University of California, Los Angeles, CA, USA
  - Department of Biological Chemistry, University of California, Los Angeles, CA, USA
- Z Hong Zhou
  - Department of Microbiology, Immunology and Molecular Genetics, University of California, Los Angeles, CA, USA
  - California NanoSystems Institute, University of California, Los Angeles, CA, USA
3
Valli AA, Domingo-Calap ML, González de Prádena A, García JA, Cui H, Desbiez C, López-Moya JJ. Reconceptualizing transcriptional slippage in plant RNA viruses. mBio 2024; 15:e0212024. [PMID: 39287447 PMCID: PMC11481541 DOI: 10.1128/mbio.02120-24] [Received: 07/18/2024] [Accepted: 08/19/2024] [Indexed: 09/19/2024]
Abstract
RNA viruses have evolved sophisticated strategies to exploit the limited encoded information within their typically compact genomes. One of them, named transcriptional slippage (TS), is characterized by the appearance of indels in nascent viral RNAs, leading to changes in the open reading frame (ORF). Although members of unrelated viral families express key proteins via TS, the available information about this phenomenon is still limited. In potyvirids (members of the Potyviridae family), TS has been defined by the insertion of an additional A at An motifs (n ≥ 6) in newly synthesized transcripts at a low frequency, modulated by nucleotides flanking the A-rich motif. Here, by using diverse experimental approaches and a collection of plant/virus combinations, we discover cases not following this definition. In summary, we observe (i) a high rate of single-nucleotide deletions at slippage motifs, (ii) overlapping ORFs accessed by slippage at a U8 stretch, and (iii) changes in slippage rates induced by factors not related to cognate viruses. Moreover, a survey of whole-genome sequences from potyvirids shows a widespread occurrence of species-specific An/Un (n ≥ 6) motifs. Even though many of them, but not all, lead to the production of truncated proteins rather than access to overlapping ORFs, these results suggest that slippage motifs appear more frequently than expected and play relevant roles during virus evolution. Considering the potential of this phenomenon to expand the viral proteome by accessing overlapping ORFs and/or producing truncated proteins, a re-evaluation of TS significance during infections of RNA viruses is required.
IMPORTANCE Transcriptional slippage (TS) is used by RNA viruses as another strategy to maximize the coding information in their genomes. This phenomenon is based on a peculiar feature of viral replicases: they may produce indels in a small fraction of newly synthesized viral RNAs when transcribing certain motifs and then produce alternative proteins due to a change of the reading frame or truncated products by premature termination. Here, using plant-infecting RNA viruses as models, we discover cases expanding on previously established features of plant virus TS, prompting us to reconsider and redefine this expression strategy. An interesting conclusion from our study is that TS might be more relevant during RNA virus evolution and infection processes than previously assumed.
Affiliation(s)
- María Luisa Domingo-Calap
  - Center for Research in Agricultural Genomics (CRAG-CSIC-IRTA-UAB-UB), Campus UAB, Bellaterra, Spain
  - Evolving Therapeutics SL., Parc Científic de la Universitat de València, Paterna, Spain
- Hongguang Cui
  - Key Laboratory of Green Prevention and Control of Tropical Plant Diseases and Pests, Ministry of Education and College of Plant Protection, Hainan University, Haikou, Hainan, China
- Juan José López-Moya
  - Center for Research in Agricultural Genomics (CRAG-CSIC-IRTA-UAB-UB), Campus UAB, Bellaterra, Spain
4
Pavesi A, Romerio F. Creation of the HIV-1 antisense gene asp coincided with the emergence of the pandemic group M and is associated with faster disease progression. Microbiol Spectr 2024; 12:e0380223. [PMID: 38230940 PMCID: PMC10846101 DOI: 10.1128/spectrum.03802-23] [Received: 10/31/2023] [Accepted: 12/19/2023] [Indexed: 01/18/2024]
Abstract
Despite being first identified more than three decades ago, the antisense gene asp of HIV-1 remains an enigma. asp is present uniquely in pandemic (group M) HIV-1 strains, and it is absent in all non-pandemic (out-of-M) HIV-1 strains and virtually all non-human primate lentiviruses. This suggests that the creation of asp may have contributed to HIV-1 fitness or worldwide spread. It also raises the question of which evolutionary processes were at play in the creation of asp. Here, we show that HIV-1 genomes containing an intact asp gene are associated with faster HIV-1 disease progression. Furthermore, we demonstrate that the creation of a full-length asp gene occurred via the evolution of codon usage in env overlapping asp on the opposite strand. This involved differential use of synonymous codons or conservative amino acid substitution in env that eliminated internal stop codons in asp, and redistribution of synonymous codons in env that minimized the likelihood of new premature stops arising in asp. Nevertheless, the creation of a full-length asp gene reduced the genetic diversity of env. The Luria-Delbrück fluctuation test suggests that the interrupted asp open reading frame (ORF) is the progenitor of the intact ORF, rather than a descendant under random genetic drift. Therefore, the existence of group-M isolates with a truncated asp ORF indicates an incomplete transition process. For the first time, our study links the presence of a full-length asp ORF to faster disease progression, thus warranting further investigation into the cellular processes and molecular mechanisms through which the ASP protein impacts HIV-1 replication, transmission, and pathogenesis.
IMPORTANCE Overlapping genes engage in a tug-of-war, constraining each other's evolution. The creation of a new gene overlapping an existing one comes at an evolutionary cost. Thus, its conservation must be advantageous, or it will be lost, especially if the pre-existing gene is essential for the viability of the virus or cell. We found that the creation and conservation of the HIV-1 antisense gene asp occurred through differential use of synonymous codons or conservative amino acid substitutions within the overlapping gene, env. This process did not involve amino acid changes in ENV that benefited its function, but rather it constrained the evolution of ENV. Nonetheless, the creation of asp brought a net selective advantage to HIV-1 because asp is conserved especially among high-prevalence strains. The association between the presence of an intact asp gene and faster HIV-1 disease progression supports that conclusion and warrants further investigation.
Affiliation(s)
- Angelo Pavesi
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
- Fabio Romerio
  - Department of Molecular and Comparative Pathobiology, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
5
Bukhnikashvili L. Overlaps Between CDS Regions of Protein-Coding Genes in the Human Genome: A Case Study on the NR1D1-THRA Gene Pair. J Mol Evol 2023; 91:963-975. [PMID: 38006429 DOI: 10.1007/s00239-023-10147-8] [Received: 05/28/2023] [Accepted: 11/12/2023] [Indexed: 11/27/2023]
Abstract
For several decades, it has been known that a substantial number of genes within human DNA exhibit overlap; however, the biological and evolutionary significance of these overlaps remain poorly understood. This study focused on investigating specific instances of overlap where the overlapping DNA region encompasses the coding DNA sequences (CDSs) of protein-coding genes. The results revealed that proteins encoded by overlapping CDSs exhibit greater disorder than those from nonoverlapping CDSs. Additionally, these DNA regions were identified as GC-rich. This could be partially attributed to the absence of stop codons from two distinct reading frames rather than one. Furthermore, these regions were found to harbour fewer single-nucleotide polymorphism (SNP) sites, possibly due to constraints arising from the overlapping state where mutations could affect two genes simultaneously.
While elucidating these properties, the NR1D1-THRA gene pair emerged as an exceptional case with highly structured proteins and a distinctly conserved sequence across eutherian mammals. Both NR1D1 and THRA are nuclear receptors lacking a ligand-binding domain at their C-terminus, which is the region where these gene pairs overlap. The NR1D1 gene is involved in the regulation of circadian rhythm, while the THRA gene encodes a thyroid hormone receptor, and both play crucial roles in various physiological processes. This study suggests that, in addition to their well-established functions, the specifically overlapping CDS regions of these genes may encode protein segments with additional, yet undiscovered, biological roles.
6
Goldberg TL, Blevins E, Leis EM, Standish IF, Richard JC, Lueder MR, Cer RZ, Bishop-Lilly KA. Plasticity, Paralogy, and Pseudogenization: Rhabdoviruses of Freshwater Mussels Elucidate Mechanisms of Viral Genome Diversification and the Evolution of the Finfish-Infecting Rhabdoviral Genera. J Virol 2023; 97:e0019623. [PMID: 37154732 PMCID: PMC10231222 DOI: 10.1128/jvi.00196-23] [Received: 02/05/2023] [Accepted: 04/07/2023] [Indexed: 05/10/2023]
Abstract
Viruses in the family Rhabdoviridae display remarkable genomic variation and ecological diversity. This plasticity occurs despite the fact that, as negative sense RNA viruses, rhabdoviruses rarely if ever recombine. Here, we describe nonrecombinatorial evolutionary processes leading to genomic diversification in the Rhabdoviridae inferred from two novel rhabdoviruses of freshwater mussels (Mollusca: Bivalvia: Unionida). Killamcar virus 1 (KILLV-1) from a plain pocketbook (Lampsilis cardium) is closely related phylogenetically and transcriptionally to finfish-infecting viruses in the subfamily Alpharhabdovirinae. KILLV-1 offers a novel example of glycoprotein gene duplication, differing from previous examples in that the paralogs overlap. Evolutionary analyses reveal a clear pattern of relaxed selection due to subfunctionalization in rhabdoviral glycoprotein paralogs, which has not previously been described in RNA viruses. Chemarfal virus 1 (CHMFV-1) from a western pearlshell (Margaritifera falcata) is closely related phylogenetically and transcriptionally to viruses in the genus Novirhabdovirus, the sole recognized genus in the subfamily Gammarhabdovirinae, representing the first known gammarhabdovirus of a host other than finfish. The CHMFV-1 G-L noncoding region contains a nontranscribed remnant gene of precisely the same length as the NV gene of most novirhabdoviruses, offering a compelling example of pseudogenization. The unique reproductive strategy of freshwater mussels involves an obligate parasitic stage in which larvae encyst in the tissues of finfish, offering a plausible ecological mechanism for viral host-switching.
IMPORTANCE Viruses in the family Rhabdoviridae infect a variety of hosts, including vertebrates, invertebrates, plants and fungi, with important consequences for health and agriculture. This study describes two newly discovered viruses of freshwater mussels from the United States. One virus from a plain pocketbook (Lampsilis cardium) is closely related to fish-infecting viruses in the subfamily Alpharhabdovirinae. The other virus from a western pearlshell (Margaritifera falcata) is closely related to viruses in the subfamily Gammarhabdovirinae, which until now were only known to infect finfish. Genome features of both viruses provide new evidence of how rhabdoviruses evolved their extraordinary variability. Freshwater mussel larvae attach to fish and feed on tissues and blood, which may explain how rhabdoviruses originally jumped between mussels and fish. The significance of this research is that it improves our understanding of rhabdovirus ecology and evolution, shedding new light on these important viruses and the diseases they cause.
Affiliation(s)
- Tony L. Goldberg
  - Department of Pathobiological Sciences, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Emilie Blevins
  - Xerces Society for Invertebrate Conservation, Portland, Oregon, USA
- Eric M. Leis
  - U.S. Fish and Wildlife Service, La Crosse Fish Health Center, Midwest Fisheries Center, Onalaska, Wisconsin, USA
- Isaac F. Standish
  - U.S. Fish and Wildlife Service, La Crosse Fish Health Center, Midwest Fisheries Center, Onalaska, Wisconsin, USA
- Jordan C. Richard
  - Department of Pathobiological Sciences, University of Wisconsin-Madison, Madison, Wisconsin, USA
  - U.S. Fish and Wildlife Service, Southwestern Virginia Field Office, Abingdon, Virginia, USA
- Matthew R. Lueder
  - Leidos, Reston, Virginia, USA
  - Biological Defense Research Directorate, Naval Medical Research Command–Frederick, Fort Detrick, Maryland, USA
- Regina Z. Cer
  - Biological Defense Research Directorate, Naval Medical Research Command–Frederick, Fort Detrick, Maryland, USA
- Kimberly A. Bishop-Lilly
  - Biological Defense Research Directorate, Naval Medical Research Command–Frederick, Fort Detrick, Maryland, USA
7
Zhang Y, Liang X, Zhao M, Qi T, Guo H, Zhao J, Zhao J, Zhan G, Kang Z, Zheng L. A novel ambigrammatic mycovirus, PsV5, works hand in glove with wheat stripe rust fungus to facilitate infection. Plant Commun 2023; 4:100505. [PMID: 36527233 DOI: 10.1016/j.xplc.2022.100505] [Received: 07/14/2022] [Revised: 11/16/2022] [Accepted: 12/14/2022] [Indexed: 05/11/2023]
Abstract
Here we describe a novel narnavirus, Puccinia striiformis virus 5 (PsV5), from the devastating wheat stripe rust fungus P. striiformis f. sp. tritici (Pst). The genome of PsV5 contains two predicted open reading frames (ORFs) that largely overlap on reverse strands: an RNA-dependent RNA polymerase (RdRp) and a reverse-frame ORF (rORF) with unknown function. Protein translations of both ORFs were demonstrated by immune technology. Transgenic wheat lines overexpressing PsV5 (RdRp-rORF), RdRp ORF, or rORF were more susceptible to Pst infection, whereas PsV5-RNA interference (RNAi) lines were more resistant. Overexpression of PsV5 (RdRp-rORF), RdRp ORF, or rORF in Fusarium graminearum also boosted fungal virulence. We thus report a novel ambigrammatic mycovirus that promotes the virulence of its fungal host. The results are a significant addition to our understanding of virosphere diversity and offer insights for sustainable wheat rust disease control.
Affiliation(s)
- Yanhui Zhang
  - State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China
- Xiaofei Liang
  - State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China
- Mengxin Zhao
  - State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China
- Tuo Qi
  - State Key Laboratory of Crop Gene Exploration and Utilization in Southwest China, State Key Laboratory of Hybrid Rice, Key Laboratory of Major Crop Diseases & Collaborative Innovation Center for Hybrid Rice in Yangtze River Basin, Rice Research Institute, Sichuan Agricultural University at Wenjiang, Chengdu, Sichuan 611130, China
- Hualong Guo
  - State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China
- Jing Zhao
  - State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China
- Jie Zhao
  - State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China
- Gangming Zhan
  - State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China
- Zhensheng Kang
  - State Key Laboratory of Crop Stress Biology for Arid Areas and College of Plant Protection, Northwest A&F University, Yangling, Shaanxi 712100, China
- Li Zheng
  - Sanya Nanfan Research Institute of Hainan University, Hainan Yazhou Bay Seed Laboratory, Sanya 572025, China
  - Key Laboratory of Green Prevention and Control of Tropical Plant Diseases and Pests, Ministry of Education and School of Plant Protection, Hainan University, Haikou, Hainan 570228, China
8
Çakır U, Gabed N, Brunet M, Roucou X, Kryvoruchko I. Mosaic translation hypothesis: chimeric polypeptides produced via multiple ribosomal frameshifting as a basis for adaptability. FEBS J 2023; 290:370-378. [PMID: 34743413 DOI: 10.1111/febs.16269] [Received: 06/07/2021] [Revised: 10/03/2021] [Accepted: 11/05/2021] [Indexed: 02/05/2023]
Abstract
How many different proteins can be produced from a single spliced transcript? Genome annotation projects overlook the coding potential of reading frames other than that of the reference open reading frames (refORFs). Recently, alternative open reading frames (altORFs) and their translational products, alternative proteins, have been shown to carry out important functions in various organisms. AltORFs overlapping refORFs or other altORFs in a different reading frame may be involved in one fundamental mechanism so far overlooked. A few years ago, it was proposed that altORFs may act as building blocks for chimeric (mosaic) polypeptides, which are produced via multiple ribosomal frameshifting events from a single mature transcript. We adopt terminology from that earlier discussion and call this mechanism mosaic translation. This way of extracting and combining genetic information may significantly increase proteome diversity. Thus, we hypothesize that this mechanism may have contributed to the flexibility and adaptability of organisms to a variety of environmental conditions. Specialized ribosomes acting as sensors probably played a central role in this process. Importantly, mosaic translation may be the main source of protein diversity in genomes that lack alternative splicing. The idea of mosaic translation is a testable hypothesis, although its direct demonstration is challenging. Should mosaic translation occur, we would currently highly underestimate the complexity of translation mechanisms and thus the proteome.
Affiliation(s)
- Umut Çakır
  - Molecular Biology and Genetics Department, Faculty of Arts and Sciences, Boğaziçi University, Istanbul, Turkey
- Noujoud Gabed
  - Cellular and Molecular Biology Department, Oran High School of Biological Sciences (ESSBO), Oran, Algeria
- Marie Brunet
  - Department of Pediatrics, Medical Genetics Service, Université de Sherbrooke, QC, Canada
  - Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), QC, Canada
- Xavier Roucou
  - Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), QC, Canada
  - Department of Biochemistry and Functional Genomics, Université de Sherbrooke, QC, Canada
- Igor Kryvoruchko
  - Molecular Biology and Genetics Department, Faculty of Arts and Sciences, Boğaziçi University, Istanbul, Turkey
9
Price AM, Steinbock RT, Lauman R, Charman M, Hayer KE, Kumar N, Halko E, Lum KK, Wei M, Wilson AC, Garcia BA, Depledge DP, Weitzman MD. Novel viral splicing events and open reading frames revealed by long-read direct RNA sequencing of adenovirus transcripts. PLoS Pathog 2022; 18:e1010797. [PMID: 36095031 PMCID: PMC9499273 DOI: 10.1371/journal.ppat.1010797] [Received: 04/12/2022] [Revised: 09/22/2022] [Accepted: 08/05/2022] [Indexed: 01/07/2023]
Abstract
Adenovirus is a common human pathogen that relies on host cell processes for transcription and processing of viral RNA and protein production. Although adenoviral promoters, splice junctions, and polyadenylation sites have been characterized using low-throughput biochemical techniques or short read cDNA-based sequencing, these technologies do not fully capture the complexity of the adenoviral transcriptome. By combining Illumina short-read and nanopore long-read direct RNA sequencing approaches, we mapped transcription start sites and RNA cleavage and polyadenylation sites across the adenovirus genome. In addition to confirming the known canonical viral early and late RNA cassettes, our analysis of splice junctions within long RNA reads revealed an additional 35 novel viral transcripts that meet stringent criteria for expression. These RNAs include fourteen new splice junctions which lead to expression of canonical open reading frames (ORFs), six novel ORF-containing transcripts, and 15 transcripts encoding for messages that could alter protein functions through truncation or fusion of canonical ORFs. In addition, we detect RNAs that bypass canonical cleavage sites and generate potential chimeric proteins by linking distinct gene transcription units. Among these chimeric proteins we detected an evolutionarily conserved protein containing the N-terminus of E4orf6 fused to the downstream DBP/E2A ORF. Loss of this novel protein, E4orf6/DBP, was associated with aberrant viral replication center morphology and poor viral spread. Our work highlights how long-read sequencing technologies combined with mass spectrometry can reveal further complexity within viral transcriptomes and resulting proteomes.
Affiliation(s)
- Alexander M. Price
  - Division of Protective Immunity, Department of Pathology and Laboratory Medicine, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America
- Robert T. Steinbock
  - Cell & Molecular Biology Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Richard Lauman
  - Division of Protective Immunity, Department of Pathology and Laboratory Medicine, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America
  - Graduate Group in Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Matthew Charman
  - Division of Protective Immunity, Department of Pathology and Laboratory Medicine, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America
- Katharina E. Hayer
  - Department of Biomedical and Health Informatics, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America
- Namrata Kumar
  - Division of Protective Immunity, Department of Pathology and Laboratory Medicine, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America
- Edwin Halko
  - Division of Protective Immunity, Department of Pathology and Laboratory Medicine, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America
- Krystal K. Lum
  - Division of Protective Immunity, Department of Pathology and Laboratory Medicine, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America
- Monica Wei
  - Cell & Molecular Biology Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Angus C. Wilson
  - Department of Microbiology, New York University School of Medicine, New York City, New York, United States of America
- Benjamin A. Garcia
  - Department of Biochemistry and Biophysics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, United States of America
- Daniel P. Depledge
  - Department of Microbiology, New York University School of Medicine, New York City, New York, United States of America
  - Institute of Virology, Hannover Medical School, Hannover, Germany
  - German Center for Infection Research (DZIF), partner site Hannover-Braunschweig, Hannover, Germany
- Matthew D. Weitzman
  - Division of Protective Immunity, Department of Pathology and Laboratory Medicine, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America
  - Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, United States of America
10
Pinto D, Gonçalo R, Louro M, Silva MS, Hernandez G, Cordeiro TN, Cordeiro C, São-José C. On the Occurrence and Multimerization of Two-Polypeptide Phage Endolysins Encoded in Single Genes. Microbiol Spectr 2022; 10:e0103722. [PMID: 35876588 PMCID: PMC9430671 DOI: 10.1128/spectrum.01037-22] [Received: 03/22/2022] [Accepted: 07/05/2022] [Indexed: 11/20/2022]
Abstract
Bacteriophages (phages) and other viruses are extremely efficient in packing their genetic information, with several described cases of overlapping genes encoded in different open reading frames (ORFs). While less frequently reported, specific cases exist in which two overlapping ORFs are in frame and share the stop codon. Here, we studied the occurrence of this genetic arrangement in endolysins, the phage enzymes that cut the bacterial cell wall peptidoglycan to release the virion progeny. After screening over 3,000 endolysin sequences of phages infecting Gram-positive bacteria, we found evidence that this coding strategy is frequent in endolysin genes. Our bioinformatics predictions were experimentally validated by demonstrating that two polypeptides are indeed produced from these genes. Additionally, we show that in some cases the two polypeptides need to interact and multimerize to generate the active endolysin. By studying in detail one selected example, we uncovered a heteromeric endolysin with a 1:5 subunit stoichiometry that has never been described before. Hence, we conclude that the occurrence of endolysin genes encoding two polypeptide isoforms by in-frame overlapping ORFs, as well as their organization as enzymatic complexes, appears more common than previously thought, therefore challenging the established view of endolysins being mostly formed by single, monomeric polypeptide chains. IMPORTANCE Bacteriophages use endolysins to cleave the host bacteria cell wall, a crucial event underlying cell lysis for virion progeny release. These bacteriolytic enzymes are generally thought to work as single, monomeric polypeptides, but a few examples have been described in which a single gene produces two endolysin isoforms. These are encoded by two in-frame overlapping ORFs, with a shorter ORF being defined by an internal translation start site. 
This work shows evidence that this endolysin coding strategy is frequent in phages infecting Gram-positive bacteria, and not just an eccentricity of a few phages. In one example studied in detail, we show that the two isoforms are inactive until they assemble to generate a multimeric active endolysin, with a 1:5 subunit stoichiometry never described before. This study challenges the established view of endolysins, with possible implications in their current exploration and design as alternative antibacterials.
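The coding arrangement described in this abstract, a shorter ORF defined by an internal translation start site that is in frame with the full-length ORF and shares its stop codon, can be illustrated with a minimal sketch. The function name, the set of start codons, and the toy sequence are assumptions made for illustration; they are not the screening pipeline used in the study, which would also need to consider signals such as ribosome-binding sites.

```python
# Illustrative sketch only: names and the toy sequence are hypothetical.
START_CODONS = {"ATG", "GTG", "TTG"}   # start codons commonly used in bacteria and phages
STOP_CODONS = {"TAA", "TAG", "TGA"}


def inframe_internal_starts(orf):
    """Return nucleotide offsets of in-frame start codons downstream of the first codon.

    Each offset marks a candidate shorter ORF that shares the stop codon
    of the full-length ORF.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0 or orf[-3:] not in STOP_CODONS:
        raise ValueError("expected a complete ORF ending in a stop codon")
    # Skip the first codon (offset 0) and the final stop codon.
    return [i for i in range(3, len(orf) - 3, 3) if orf[i:i + 3] in START_CODONS]


# Toy ORF: ATG AAA CCC GTG GGA TTT ATG CCC TAA
toy_orf = "ATGAAACCCGTGGGATTTATGCCCTAA"
candidates = inframe_internal_starts(toy_orf)
print(candidates)
```

A candidate start found this way is only a hypothesis; as the abstract notes, the production of two polypeptides had to be confirmed experimentally.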
Affiliation(s)
- Daniela Pinto
  - Research Institute for Medicines (iMed.ULisboa), Faculdade de Farmácia da Universidade de Lisboa, Lisbon, Portugal
- Raquel Gonçalo
  - Research Institute for Medicines (iMed.ULisboa), Faculdade de Farmácia da Universidade de Lisboa, Lisbon, Portugal
- Mariana Louro
  - Laboratório de FT-ICR e Espectrometria de Massa Estrutural, MARE – Marine and Environmental Sciences Centre, Faculdade de Ciências da Universidade de Lisboa, Lisbon, Portugal
- Marta Sousa Silva
  - Laboratório de FT-ICR e Espectrometria de Massa Estrutural, MARE – Marine and Environmental Sciences Centre, Faculdade de Ciências da Universidade de Lisboa, Lisbon, Portugal
- Guillem Hernandez
  - Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Oeiras, Portugal
- Tiago N. Cordeiro
  - Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Oeiras, Portugal
- Carlos Cordeiro
  - Laboratório de FT-ICR e Espectrometria de Massa Estrutural, MARE – Marine and Environmental Sciences Centre, Faculdade de Ciências da Universidade de Lisboa, Lisbon, Portugal
- Carlos São-José
  - Research Institute for Medicines (iMed.ULisboa), Faculdade de Farmácia da Universidade de Lisboa, Lisbon, Portugal
|
11
|
Shah SB, Hill AM, Wilke CO, Hockenberry AJ. Generating dynamic gene expression patterns without the need for regulatory circuits. PLoS One 2022; 17:e0268883. [PMID: 35617346 PMCID: PMC9135205 DOI: 10.1371/journal.pone.0268883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2021] [Accepted: 05/10/2022] [Indexed: 11/18/2022] Open
Abstract
Synthetic biology has successfully advanced our ability to design and implement complex, time-varying genetic circuits to control the expression of recombinant proteins. However, these circuits typically require the production of regulatory genes whose only purpose is to coordinate expression of other genes. When designing very small genetic constructs, such as viral genomes, we may want to avoid introducing such auxiliary gene products while nevertheless encoding complex expression dynamics. To this end, here we demonstrate that varying only the placement and strengths of promoters, terminators, and RNase cleavage sites in a computational model of a bacteriophage genome is sufficient to achieve solutions to a variety of basic gene expression patterns. We discover these genetic solutions by computationally evolving genomes to reproduce desired gene expression time-course data. Our approach shows that non-trivial patterns can be evolved, including patterns where the relative ordering of genes by abundance changes over time. We find that some patterns are easier to evolve than others, and comparable expression patterns can be achieved via different genetic architectures. Our work opens up a novel avenue to genome engineering via fine-tuning the balance of gene expression and gene degradation rates.
Affiliation(s)
- Sahil B. Shah
  - Department of Integrative Biology, The University of Texas at Austin, Austin, TX, United States of America
- Alexis M. Hill
  - Department of Integrative Biology, The University of Texas at Austin, Austin, TX, United States of America
- Claus O. Wilke
  - Department of Integrative Biology, The University of Texas at Austin, Austin, TX, United States of America
- Adam J. Hockenberry
  - Department of Integrative Biology, The University of Texas at Austin, Austin, TX, United States of America
|
12
|
Muñoz-Baena L, Poon AFY. Using networks to analyze and visualize the distribution of overlapping genes in virus genomes. PLoS Pathog 2022; 18:e1010331. [PMID: 35202429 PMCID: PMC8903798 DOI: 10.1371/journal.ppat.1010331] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/02/2021] [Revised: 03/08/2022] [Accepted: 02/02/2022] [Indexed: 11/19/2022] Open
Abstract
Gene overlap occurs when two or more genes are encoded by the same nucleotides. This phenomenon is found in all taxonomic domains, but is particularly common in viruses, where it may increase the information content of compact genomes or influence the creation of new genes. Here we report a global comparative study of overlapping open reading frames (OvRFs) of 12,609 virus reference genomes in the NCBI database. We retrieved metadata associated with all annotated open reading frames (ORFs) in each genome record to calculate the number, length, and frameshift of OvRFs. Our results show that while the number of OvRFs increases with genome length, they tend to be shorter in longer genomes. The majority of overlaps involve +2 frameshifts, predominantly found in dsDNA viruses. Antisense overlaps in which one of the ORFs was encoded in the same frame on the opposite strand (−0) tend to be longer. Next, we develop a new graph-based representation of the distribution of overlaps among the ORFs of genomes in a given virus family. In the absence of an unambiguous partition of ORFs by homology at this taxonomic level, we used an alignment-free k-mer based approach to cluster protein coding sequences by similarity. We connect these clusters with two types of directed edges to indicate (1) that constituent ORFs are adjacent in one or more genomes, and (2) that these ORFs overlap. These adjacency graphs not only provide a natural visualization scheme, but also a novel statistical framework for analyzing the effects of gene- and genome-level attributes on the frequencies of overlaps.
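The step of calculating the number, length, and frameshift of overlapping ORFs from annotation coordinates can be sketched as follows. This is a minimal illustration under stated assumptions: the `ORF` record, the function names, and the toy coordinates are hypothetical, the frame offset is reported only for same-strand pairs, and the study's own labelling convention for frameshifts (for example +2 or −0) may be defined differently.

```python
from itertools import combinations
from typing import NamedTuple


class ORF(NamedTuple):
    """Hypothetical annotation record with 1-based inclusive coordinates."""
    name: str
    start: int   # leftmost coordinate, regardless of strand
    end: int     # rightmost coordinate, regardless of strand
    strand: int  # +1 or -1


def overlap(a, b):
    """Return (length, frame_offset) for two ORFs, or None if they do not overlap.

    frame_offset is the reading-frame difference (0, 1 or 2) for same-strand
    pairs and None for opposite-strand pairs (assumption made for brevity).
    """
    length = min(a.end, b.end) - max(a.start, b.start) + 1
    if length <= 0:
        return None
    if a.strand != b.strand:
        return length, None
    if a.strand == 1:
        offset = (b.start - a.start) % 3
    else:
        # On the minus strand translation begins at the rightmost coordinate.
        offset = (a.end - b.end) % 3
    return length, offset


def all_overlaps(orfs):
    """Map each overlapping pair of ORF names to its (length, frame_offset)."""
    result = {}
    for a, b in combinations(sorted(orfs, key=lambda o: o.start), 2):
        hit = overlap(a, b)
        if hit is not None:
            result[(a.name, b.name)] = hit
    return result


# Toy genome annotation, not taken from any real record.
toy = [
    ORF("A", 1, 300, 1),
    ORF("B", 290, 500, 1),
    ORF("C", 480, 700, -1),
    ORF("D", 800, 900, 1),
]
print(all_overlaps(toy))
```

The all-pairs comparison is quadratic in the number of ORFs, which is acceptable for individual virus genomes; a sweep over sorted coordinates would be the natural refinement for larger inputs.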
Affiliation(s)
- Laura Muñoz-Baena
  - Department of Microbiology and Immunology, Western University, London, ON, Canada
- Art F. Y. Poon
  - Department of Microbiology and Immunology, Western University, London, ON, Canada
  - Department of Pathology and Laboratory Medicine, Western University, London, ON, Canada
|
13
|
Gene Overlapping as a Modulator of Begomovirus Evolution. Microorganisms 2022; 10:microorganisms10020366. [PMID: 35208820 PMCID: PMC8875319 DOI: 10.3390/microorganisms10020366] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/17/2021] [Revised: 02/01/2022] [Accepted: 02/01/2022] [Indexed: 02/06/2023] Open
Abstract
In RNA viruses, which have high mutation rates and fast evolutionary rates, gene overlapping (i.e., genomic regions that encode more than one protein) is a major factor controlling mutational load and therefore the virus evolvability. Although DNA viruses use host high-fidelity polymerases for their replication, and therefore should have lower mutation rates, it has been shown that some of them have evolutionary rates comparable to those of RNA viruses. Notably, these viruses have large proportions of their genes with at least one overlapping instance. Hence, gene overlapping could be a modulator of virus evolution beyond the RNA world. To test this hypothesis, we use the genus Begomovirus of plant viruses as a model. Through comparative genomic approaches, we show that non-terminal gene overlapping decreases the rate of virus evolution, which is associated with lower frequency of both synonymous and nonsynonymous mutations. In contrast, terminal overlapping has little effect on the pace of virus evolution. Overall, our analyses support a role for gene overlapping in the evolution of begomoviruses and provide novel information on the factors that shape their genetic diversity.
|
14
|
Predicting the capsid architecture of phages from metagenomic data. Comput Struct Biotechnol J 2022; 20:721-732. [PMID: 35140890 PMCID: PMC8814770 DOI: 10.1016/j.csbj.2021.12.032] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/01/2021] [Revised: 12/22/2021] [Accepted: 12/22/2021] [Indexed: 12/29/2022] Open
Abstract
Tailed phages are viruses that infect bacteria and are the most abundant biological entities on Earth. Their ecological, evolutionary, and biogeochemical roles in the planet stem from their genomic diversity. Known tailed phage genomes range from 10 to 735 kilobase pairs thanks to the size variability of the protective protein capsids that store them. However, the role of tailed phage capsids’ diversity in ecosystems is unclear. A fundamental gap is the difficulty of associating genomic information with viral capsids in the environment. To address this problem, here, we introduce a computational approach to predict the capsid architecture (T-number) of tailed phages using the sequence of a single gene—the major capsid protein. This approach relies on an allometric model that relates the genome length and capsid architecture of tailed phages. This allometric model was applied to isolated phage genomes to generate a library that associated major capsid proteins and putative capsid architectures. This library was used to train machine learning methods, and the most computationally scalable model investigated (random forest) was applied to human gut metagenomes. Compared to isolated phages, the analysis of gut data reveals a large abundance of mid-sized (T = 7) capsids, as expected, followed by a relatively large frequency of jumbo-like tailed phage capsids (T ≥ 25) and small capsids (T = 4) that have been under-sampled. We discussed how to increase the method’s accuracy and how to extend the approach to other viruses. The computational pipeline introduced here opens the doors to monitor the ongoing evolution and selection of viral capsids across ecosystems.
|
15
|
Abstract
Modern genome-scale methods that identify new genes, such as proteogenomics and ribosome profiling, have revealed, to the surprise of many, that overlap in genes, open reading frames and even coding sequences is widespread and functionally integrated into prokaryotic, eukaryotic and viral genomes. In parallel, the constraints that overlapping regions place on genome sequences and their evolution can be harnessed in bioengineering to build more robust synthetic strains and constructs. With a focus on overlapping protein-coding and RNA-coding genes, this Review examines their discovery, topology and biogenesis in the context of their genome biology. We highlight exciting new uses for sequence overlap to control translation, compress synthetic genetic constructs, and protect against mutation.
|
16
|
Wichmann S, Scherer S, Ardern Z. Biological factors in the synthetic construction of overlapping genes. BMC Genomics 2021; 22:888. [PMID: 34895142 PMCID: PMC8665328 DOI: 10.1186/s12864-021-08181-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2020] [Accepted: 11/17/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Overlapping genes (OLGs) with long protein-coding overlapping sequences are disallowed by standard genome annotation programs, outside of viruses. Recently however they have been discovered in Archaea, diverse Bacteria, and Mammals. The biological factors underlying life's ability to create overlapping genes require more study, and may have important applications in understanding evolution and in biotechnology. A previous study claimed that protein domains from viruses were much better suited to forming overlaps than those from other cellular organisms - in this study we assessed this claim, in order to discover what might underlie taxonomic differences in the creation of gene overlaps. RESULTS After overlapping arbitrary Pfam domain pairs and evaluating them with Hidden Markov Models we find OLG construction to be much less constrained than expected. For instance, close to 10% of the constructed sequences cannot be distinguished from typical sequences in their protein family. Most are also indistinguishable from natural protein sequences regarding identity and secondary structure. Surprisingly, contrary to a previous study, virus domains were much less suitable for designing OLGs than bacterial or eukaryotic domains were. In general, the amount of amino acid change required to force a domain to overlap is approximately equal to the variation observed within a typical domain family. The resulting high similarity between natural sequences and those altered so as to overlap is mostly due to the combination of high redundancy in the genetic code and the evolutionary exchangeability of many amino acids. CONCLUSIONS Synthetic overlapping genes which closely resemble natural gene sequences, as measured by HMM profiles, are remarkably easy to construct, and most arbitrary domain pairs can be altered so as to overlap while retaining high similarity to the original sequences. 
Future work however will need to assess important factors not considered such as intragenic interactions which affect protein folding. While the analysis here is not sufficient to guarantee functional folding proteins, further analysis of constructed OLGs will improve our understanding of the origin of these remarkable genetic elements across life and opens up exciting possibilities for synthetic biology.
Affiliation(s)
- Stefan Wichmann
  - Chair of Microbial Ecology, Department of Molecular Life Sciences, Technical University of Munich, Freising, Germany
- Siegfried Scherer
  - Chair of Microbial Ecology, Department of Molecular Life Sciences, Technical University of Munich, Freising, Germany
- Zachary Ardern
  - Chair of Microbial Ecology, Department of Molecular Life Sciences, Technical University of Munich, Freising, Germany
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
|
17
|
Mehravar M, Ghaemimanesh F, Poursani EM. Exon and intron sharing in opposite direction-an undocumented phenomenon in human genome-between Pou5f1 and Tcf19 genes. BMC Genomics 2021; 22:718. [PMID: 34610795 PMCID: PMC8493703 DOI: 10.1186/s12864-021-08039-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2021] [Accepted: 09/24/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Overlapping genes share the same genomic regions in parallel (sense) or anti-parallel (anti-sense) orientations. These gene pairs seem to occur in all domains of life and are best known from viruses. However, the advantage and biological significance of overlapping genes are still unclear. Expressed sequence tag (EST) analysis enabled us to uncover an overlapping gene pair in the human genome. RESULTS By using in silico analysis of previous experimental documentation, we reveal a new form of overlapping genes in the human genome, in which two genes found on opposite strands (Pou5f1 and Tcf19) share two exons and the one intron enclosed between them, at the same positions, between the OCT4B3 and TCF19-D splice variants. CONCLUSIONS This new form of overlapping gene expands our previous perception of splicing events and may shed more light on the complexity of gene regulation in higher organisms. Additional such genes might also be detected by EST analysis of other organisms.
Affiliation(s)
- Majid Mehravar
  - Department of Anatomy and Developmental Biology, Development and Stem Cells Program, Biomedicine Discovery Institute, Monash University, Melbourne, Australia
- Fatemeh Ghaemimanesh
  - Monoclonal Antibody Research Center, Avicenna Research Institute, ACECR, Tehran, Iran
- Ensieh M Poursani
  - Hematology, Oncology and Stem Cell Transplantation Research Center, Tehran University of Medical Sciences, Tehran, Iran
|
18
|
Abstract
Narnaviruses are RNA viruses detected in diverse fungi, plants, protists, arthropods, and nematodes. Though initially described as simple single-gene nonsegmented viruses encoding RNA-dependent RNA polymerase (RdRp), a subset of narnaviruses referred to as "ambigrammatic" harbor a unique genomic configuration consisting of overlapping open reading frames (ORFs) encoded on opposite strands. Phylogenetic analysis supports selection to maintain this unusual genome organization, but functional investigations are lacking. Here, we establish the mosquito-infecting Culex narnavirus 1 (CxNV1) as a model to investigate the functional role of overlapping ORFs in narnavirus replication. In CxNV1, a reverse ORF without homology to known proteins covers nearly the entire 3.2-kb segment encoding the RdRp. Additionally, two opposing and nearly completely overlapping novel ORFs are found on the second putative CxNV1 segment, the 0.8-kb "Robin" RNA. We developed a system to launch CxNV1 in a naive mosquito cell line and then showed that functional RdRp is required for persistence of both segments, and an intact reverse ORF is required on the RdRp segment for persistence. Mass spectrometry of persistently CxNV1-infected cells provided evidence for translation of this reverse ORF. Finally, ribosome profiling yielded a striking pattern of footprints for all four CxNV1 RNA strands that was distinct from actively translating ribosomes on host mRNA or coinfecting RNA viruses. Taken together, these data raise the possibility that the process of translation itself is important for persistence of ambigrammatic narnaviruses, potentially by protecting viral RNA with ribosomes, thus suggesting a heretofore undescribed viral tactic for replication and transmission. IMPORTANCE Fundamental to our understanding of RNA viruses is a description of which strand(s) of RNA are transmitted as the viral genome relative to which encode the viral proteins. Ambigrammatic narnaviruses break the mold. 
These viruses, found broadly in fungi, plants, and insects, have the unique feature of two overlapping genes encoded on opposite strands, comprising nearly the full length of the viral genome. Such extensive overlap is not seen in other RNA viruses and comes at the cost of reduced evolutionary flexibility in the sequence. The present study is motivated by investigating the benefits which balance that cost. We show for the first time a functional requirement for the ambigrammatic genome configuration in Culex narnavirus 1, which suggests a model for how translation of both strands might benefit this virus. Our work highlights a new blueprint for viral persistence, distinct from strategies defined by canonical definitions of the coding strand.
|
19
|
Pavesi A. Origin, Evolution and Stability of Overlapping Genes in Viruses: A Systematic Review. Genes (Basel) 2021; 12:genes12060809. [PMID: 34073395 PMCID: PMC8227390 DOI: 10.3390/genes12060809] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 05/06/2021] [Revised: 05/22/2021] [Accepted: 05/24/2021] [Indexed: 12/11/2022] Open
Abstract
During their long evolutionary history viruses generated many proteins de novo by a mechanism called “overprinting”. Overprinting is a process in which critical nucleotide substitutions in a pre-existing gene can induce the expression of a novel protein by translation of an alternative open reading frame (ORF). Overlapping genes represent an intriguing example of adaptive conflict, because they simultaneously encode two proteins whose freedom to change is constrained by each other. However, overlapping genes are also a source of genetic novelties, as the constraints under which alternative ORFs evolve can give rise to proteins with unusual sequence properties, most importantly the potential for novel functions. Starting with the discovery of overlapping genes in phages infecting Escherichia coli, this review covers a range of studies dealing with detection of overlapping genes in small eukaryotic viruses (genomic length below 30 kb) and recognition of their critical role in the evolution of pathogenicity. Origin of overlapping genes, what factors favor their birth and retention, and how they manage their inherent adaptive conflict are extensively reviewed. Special attention is paid to the assembly of overlapping genes into ad hoc databases, suitable for future studies, and to the development of statistical methods for exploring viral genome sequences in search of undiscovered overlaps.
Affiliation(s)
- Angelo Pavesi
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area delle Scienze 23/A, I-43124 Parma, Italy
|
20
|
Nelson CW, Ardern Z, Wei X. OLGenie: Estimating Natural Selection to Predict Functional Overlapping Genes. Mol Biol Evol 2021; 37:2440-2449. [PMID: 32243542 PMCID: PMC7531306 DOI: 10.1093/molbev/msaa087] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/20/2022] Open
Abstract
Purifying (negative) natural selection is a hallmark of functional biological sequences, and can be detected in protein-coding genes using the ratio of nonsynonymous to synonymous substitutions per site (dN/dS). However, when two genes overlap the same nucleotide sites in different frames, synonymous changes in one gene may be nonsynonymous in the other, perturbing dN/dS. Thus, scalable methods are needed to estimate functional constraint specifically for overlapping genes (OLGs). We propose OLGenie, which implements a modification of the Wei–Zhang method. Assessment with simulations and controls from viral genomes (58 OLGs and 176 non-OLGs) demonstrates low false-positive rates and good discriminatory ability in differentiating true OLGs from non-OLGs. We also apply OLGenie to the unresolved case of HIV-1’s putative antisense protein gene, showing significant purifying selection. OLGenie can be used to study known OLGs and to predict new OLGs in genome annotation. Software and example data are freely available at https://github.com/chasewnelson/OLGenie (last accessed April 10, 2020).
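The central difficulty stated in this abstract, that a change which is synonymous in one gene may be nonsynonymous in an overlapping gene read in a different frame, can be shown with a small worked example. This sketch is not OLGenie and does not implement the Wei–Zhang method; the function name and the toy sequence are assumptions, it uses the standard genetic code, and it covers only same-strand overlaps.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def substitution_effect(seq, pos, new_base, frame):
    """Classify a single-base change at 0-based `pos` for the given reading frame (0, 1 or 2)."""
    start = frame + 3 * ((pos - frame) // 3)   # first base of the codon containing pos
    if pos < frame or start + 3 > len(seq):
        raise ValueError("position is not inside a complete codon in this frame")
    old = seq[start:start + 3]
    new = old[:pos - start] + new_base + old[pos - start + 1:]
    return "synonymous" if CODON_TABLE[old] == CODON_TABLE[new] else "nonsynonymous"


# Toy sequence: frame 0 reads ATG CTT GCA GAA, frame 1 reads TGC TTG CAG.
seq = "ATGCTTGCAGAA"

# T -> C at position 5: CTT -> CTC (Leu -> Leu) in frame 0,
# but TTG -> TCG (Leu -> Ser) in frame 1.
print(substitution_effect(seq, 5, "C", frame=0))
print(substitution_effect(seq, 5, "C", frame=1))
```

Because the same substitution is counted differently in the two frames, a dN/dS estimate computed for one gene in isolation does not isolate the selective constraint acting on that gene, which is the motivation the abstract gives for a dedicated method.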
Affiliation(s)
- Chase W Nelson
  - Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY
  - Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Zachary Ardern
  - Microbial Ecology, ZIEL-Institute for Food & Health, Technische Universität München, Freising, Germany
- Xinzhu Wei
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI
  - Department of Integrative Biology and Statistics, University of California, Berkeley, CA
|
21
|
Carter CW. Simultaneous codon usage, the origin of the proteome, and the emergence of de-novo proteins. Curr Opin Struct Biol 2021; 68:142-148. [PMID: 33529785 DOI: 10.1016/j.sbi.2021.01.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 11/04/2020] [Accepted: 01/05/2021] [Indexed: 12/21/2022]
Abstract
Genetic coding generally uses only one of a gene's two strands; its complement serving as template for replication. Aminoacyl-tRNA synthetases, aaRS, apparently first emerged as pairs on bidirectional genes, in which anticodons in the template strand served as codons for an entirely different protein. Interpreting both strands in frame constrained such genes sufficiently that it was rapidly superseded, leaving only traces in the elevated pairing between codon middle bases in antiparallel alignments. Codon assignments actually promote using information from both strands in multiple reading frames. Related phenomena, known as overprinting, are widely associated with viruses. In-frame bidirectional coding and overprinting nevertheless imply different structural and functional relationships, and different roles in generating folded proteins throughout the evolution of the proteome.
Affiliation(s)
- Charles W Carter
  - Department of Biochemistry, Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-7260, United States
|
22
|
Douglas J, Drummond AJ, Kingston RL. Evolutionary history of cotranscriptional editing in the paramyxoviral phosphoprotein gene. Virus Evol 2021; 7:veab028. [PMID: 34141448 PMCID: PMC8204654 DOI: 10.1093/ve/veab028] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 12/31/2022] Open
Abstract
The phosphoprotein gene of the paramyxoviruses encodes multiple protein products. The P, V, and W proteins are generated by transcriptional slippage. This process results in the insertion of non-templated guanosine nucleosides into the mRNA at a conserved edit site. The P protein is an essential component of the viral RNA polymerase and is encoded by a faithful copy of the gene in the majority of paramyxoviruses. However, in some cases, the non-essential V protein is encoded by default and guanosines must be inserted into the mRNA in order to encode P. The number of guanosines inserted into the P gene can be described by a probability distribution, which varies between viruses. In this article, we review the nature of these distributions, which can be inferred from mRNA sequencing data, and reconstruct the evolutionary history of cotranscriptional editing in the paramyxovirus family. Our model suggests that, throughout known history of the family, the system has switched from a P default to a V default mode four times; complete loss of the editing system has occurred twice, the canonical zinc finger domain of the V protein has been deleted or heavily mutated a further two times, and the W protein has independently evolved a novel function three times. Finally, we review the physical mechanisms of cotranscriptional editing via slippage of the viral RNA polymerase.
Affiliation(s)
- Jordan Douglas
  - Centre for Computational Evolution, University of Auckland, Auckland 1010, New Zealand
  - School of Computer Science, University of Auckland, Auckland 1010, New Zealand
- Alexei J Drummond
  - Centre for Computational Evolution, University of Auckland, Auckland 1010, New Zealand
  - School of Biological Sciences, University of Auckland, Auckland 1010, New Zealand
- Richard L Kingston
  - School of Biological Sciences, University of Auckland, Auckland 1010, New Zealand
|
23
|
Wright BW, Ruan J, Molloy MP, Jaschke PR. Genome Modularization Reveals Overlapped Gene Topology Is Necessary for Efficient Viral Reproduction. ACS Synth Biol 2020; 9:3079-3090. [PMID: 33044064 DOI: 10.1021/acssynbio.0c00323] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 02/08/2023]
Abstract
Sequence overlap between two genes is common across all genomes, with viruses having high proportions of these gene overlaps. Genome modularization and refactoring is the process of disrupting natural gene overlaps to separate coding sequences to enable their individual manipulation. The biological function and fitness effects of gene overlaps are not fully understood, and their effects on gene cluster and genome-level refactoring are unknown. The bacteriophage φX174 genome has ∼26% of nucleotides involved in encoding more than one gene. In this study we use an engineered φX174 phage containing a genome with all gene overlaps removed to show that gene overlap is critical to maintaining optimal viral fecundity. Through detailed phenotypic measurements we reveal that genome modularization in φX174 causes virion replication, stability, and attachment deficiencies. Quantitation of the complete phage proteome across an infection cycle reveals 30% of proteins display abnormal expression patterns. Taken together, we have for the first time comprehensively demonstrated that gene modularization severely perturbs the coordinated functioning of a bacteriophage replication cycle. This work highlights the biological importance of gene overlap in natural genomes and that reducing gene overlap disruption should be an integral part of future genome engineering projects.
Affiliation(s)
- Bradley W. Wright
  - Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia
- Juanfang Ruan
  - Electron Microscope Unit, Mark Wainwright Analytical Centre, The University of New South Wales, Sydney, NSW 2052, Australia
  - School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, NSW 2052, Australia
- Mark P. Molloy
  - Kolling Institute, Northern Clinical School, The University of Sydney, Sydney, NSW 2006, Australia
- Paul R. Jaschke
  - Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia
|
24
|
Stervbo U, Rahmann S, Roch T, Westhoff TH, Babel N. Epitope similarity cannot explain the pre-formed T cell immunity towards structural SARS-CoV-2 proteins. Sci Rep 2020; 10:18995. [PMID: 33149224 PMCID: PMC7642385 DOI: 10.1038/s41598-020-75972-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 07/06/2020] [Accepted: 10/18/2020] [Indexed: 01/08/2023] Open
Abstract
The current pandemic is caused by the SARS-CoV-2 virus, and substantial progress in understanding the pathology of the virus has been made since its emergence in late 2019. Several reports indicate short-lasting immunity against endemic coronaviruses, which contrasts with studies showing that biobanked venous blood contains T cells reactive to SARS-CoV-2 S-protein even before the outbreak in Wuhan. This suggests a preformed T cell memory towards structural proteins in individuals not exposed to SARS-CoV-2. Given the similarity of SARS-CoV-2 to other members of the Coronaviridae family, the endemic coronaviruses appear likely candidates to generate this T cell memory. However, given the apparent poor immunological memory created by the endemic coronaviruses, immunity against other common pathogens might offer an alternative explanation. Here, we utilize a combination of epitope prediction and similarity to common human pathogens to identify potential sources of the SARS-CoV-2 T cell memory. Although beta-coronaviruses are the most likely candidates to explain the pre-existing SARS-CoV-2 reactive T cells in uninfected individuals, the SARS-CoV-2 epitopes with the highest similarity to those from beta-coronaviruses are confined to replication-associated proteins, not the host-interacting S-protein. Thus, our study suggests that the observed SARS-CoV-2 pre-formed immunity to structural proteins is not driven by near-identical epitopes.
Affiliation(s)
- Ulrik Stervbo
- Center for Translational Medicine, University Hospital Marien Hospital Herne, Ruhr-University, Bochum, Germany.
- Berlin-Brandenburg Center for Regenerative Therapies, and Institute of Medical Immunology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität Zu Berlin, Berlin Institute of Health, Berlin, Germany.
- Sven Rahmann
- Genome Informatics, Institute of Human Genetics, University of Duisburg-Essen, Duisburg, Germany.
- Toralf Roch
- Center for Translational Medicine, University Hospital Marien Hospital Herne, Ruhr-University, Bochum, Germany
- Berlin-Brandenburg Center for Regenerative Therapies, and Institute of Medical Immunology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität Zu Berlin, Berlin Institute of Health, Berlin, Germany
- Timm H Westhoff
- Center for Translational Medicine, University Hospital Marien Hospital Herne, Ruhr-University, Bochum, Germany
- Nina Babel
- Center for Translational Medicine, University Hospital Marien Hospital Herne, Ruhr-University, Bochum, Germany
- Berlin-Brandenburg Center for Regenerative Therapies, and Institute of Medical Immunology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität Zu Berlin, Berlin Institute of Health, Berlin, Germany
|
25
|
Friedersdorff JCA, Kingston-Smith AH, Pachebat JA, Cookson AR, Rooke D, Creevey CJ. The Isolation and Genome Sequencing of Five Novel Bacteriophages From the Rumen Active Against Butyrivibrio fibrisolvens. Front Microbiol 2020; 11:1588. [PMID: 32760371 PMCID: PMC7372960 DOI: 10.3389/fmicb.2020.01588] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 12/21/2019] [Accepted: 06/17/2020] [Indexed: 01/21/2023] Open
Abstract
Although the prokaryotic communities of the rumen microbiome are being uncovered through genome sequencing, little is known about the resident viral populations. Whilst temperate phages can be predicted as integrated prophages when analyzing bacterial and archaeal genomes, the genetics underpinning lytic phages remain poorly characterized. To the five genomes of bacteriophages isolated from rumen-associated samples sequenced and analyzed previously, this study adds a further five novel genomes and predictions gleaned from them to further the understanding of the rumen phage population. Lytic bacteriophages isolated from fresh ovine and bovine fecal and rumen fluid samples were active against the predominant fibrolytic ruminal bacterium Butyrivibrio fibrisolvens. The double stranded DNA genomes were sequenced and reconstructed into single circular complete contigs. Based on sequence similarity and genome distances, the five phages represent four species from three separate genera, consisting of: (1) Butyrivibrio phages Arian and Bo-Finn; (2) Butyrivibrio phages Idris and Arawn; and (3) Butyrivibrio phage Ceridwen. They were predicted to all belong to the Siphoviridae family, based on evidence in the genomes such as size, the presence of the tail morphogenesis module, genes that share similarity to those in other siphovirus isolates and phylogenetic analysis using phage proteomes. Yet, phylogenomic analysis and sequence similarity of the entire phage genomes revealed that these five phages are unique and novel. These phages have only been observed undergoing the lytic lifecycle, but there is evidence in the genomes of phages Arawn and Idris for the potential to be temperate. However, there is no evidence in the genome of the bacterial host Butyrivibrio fibrisolvens of prophage genes or genes that share similarity with the phage genomes.
Affiliation(s)
- Jessica C A Friedersdorff
- Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Aberystwyth, United Kingdom
- Institute for Global Food Security (IGFS), Queen's University, Belfast, United Kingdom
- Alison H Kingston-Smith
- Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Aberystwyth, United Kingdom
- Justin A Pachebat
- Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Aberystwyth, United Kingdom
- Alan R Cookson
- Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Aberystwyth, United Kingdom
- David Rooke
- Dynamic Extractions Ltd., Tredegar, United Kingdom
- Christopher J Creevey
- Institute for Global Food Security (IGFS), Queen's University, Belfast, United Kingdom
|
26
|
Pavesi A. New insights into the evolutionary features of viral overlapping genes by discriminant analysis. Virology 2020; 546:51-66. [PMID: 32452417 PMCID: PMC7157939 DOI: 10.1016/j.virol.2020.03.007] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 03/20/2020] [Accepted: 03/29/2020] [Indexed: 12/18/2022]
Abstract
Overlapping genes originate by a mechanism of overprinting, in which nucleotide substitutions in a pre-existing frame induce the expression of a de novo protein from an alternative frame. In this study, I assembled a dataset of 319 viral overlapping genes, which included 82 overlaps whose expression is experimentally known and the respective 237 homologs. Principal component analysis revealed that overlapping genes have a common pattern of nucleotide and amino acid composition. Discriminant analysis separated overlapping from non-overlapping genes with an accuracy of 97%. When applied to overlapping genes with known genealogy, it separated ancestral from de novo frames with an accuracy close to 100%. This high discriminant power was crucial to computationally design variants of de novo viral proteins known to possess selective anticancer toxicity (apoptin) or protection against neurodegeneration (X protein), as well as to detect two new potential overlapping genes in the genome of the new coronavirus SARS-CoV-2.
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area Delle Scienze 23/A, I-43124, Parma, Italy.
|
27
|
Zehentner B, Ardern Z, Kreitmeier M, Scherer S, Neuhaus K. A Novel pH-Regulated, Unusual 603 bp Overlapping Protein Coding Gene pop Is Encoded Antisense to ompA in Escherichia coli O157:H7 (EHEC). Front Microbiol 2020; 11:377. [PMID: 32265854 PMCID: PMC7103648 DOI: 10.3389/fmicb.2020.00377] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 05/13/2019] [Accepted: 02/20/2020] [Indexed: 12/23/2022] Open
Abstract
Antisense transcription is well known in bacteria. However, translation of antisense RNAs is typically not considered, as the implied overlapping coding at a DNA locus is assumed to be highly improbable. Therefore, such overlapping genes are systematically excluded in prokaryotic genome annotation. Here we report an exceptional 603 bp long open reading frame completely embedded in antisense to the gene of the outer membrane protein ompA. An active σ70 promoter, transcription start site (TSS), Shine-Dalgarno motif and rho-independent terminator were experimentally validated, providing evidence that this open reading frame has all the structural features of a functional gene. Furthermore, ribosomal profiling revealed translation of the mRNA, the protein was detected in Western blots and a pH-dependent phenotype conferred by the protein was shown in competitive overexpression growth experiments of a translationally arrested mutant versus wild type. We designate this novel gene pop (pH-regulated overlapping protein-coding gene), thus adding another example to the growing list of overlapping, protein coding genes in bacteria.
Affiliation(s)
- Barbara Zehentner
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- Zachary Ardern
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- Michaela Kreitmeier
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- Siegfried Scherer
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- ZIEL – Institute for Food & Health, Technical University of Munich, Freising, Germany
- Klaus Neuhaus
- ZIEL – Institute for Food & Health, Technical University of Munich, Freising, Germany
- Core Facility Microbiome, ZIEL – Institute for Food & Health, Technical University of Munich, Freising, Germany
|
28
|
Abstract
Overlapping genes are commonplace in viruses and play an important role in their function and evolution. However, aside from studies on specific groups of viruses, relatively little is known about the extent and nature of gene overlap and its determinants in viruses as a whole. Here, we present an extensive characterisation of gene overlap in viruses through an analysis of reference genomes present in the NCBI virus genome database. We find that over half the instances of gene overlap are very small, covering <10 nt, and 84 per cent are <50 nt in length. Despite this, 53 per cent of all viruses still contained a gene overlap of 50 nt or larger. We also investigate several predictors of gene overlap such as genome structure (single- and double-stranded RNA and DNA), virus family, genome length, and genome segmentation. This revealed that gene overlap occurs more frequently in DNA viruses than in RNA viruses, and more frequently in single-stranded viruses than in double-stranded viruses. Genome segmentation is also associated with gene overlap, particularly in single-stranded DNA viruses. Notably, we observed a large range of overlap frequencies across families of all genome types, suggesting that it is a common evolutionary trait that provides flexible genome structures in all virus families.
Affiliation(s)
- Timothy E Schlub
- Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, NSW, 2006, Australia
- Edward C Holmes
- School of Life and Environmental Sciences and School of Medical Sciences, Marie Bashir Institute for Infectious Diseases and Biosecurity, The University of Sydney, Sydney, NSW 2006, Australia
|
29
|
Dinan AM, Lukhovitskaya NI, Olendraite I, Firth AE. A case for a negative-strand coding sequence in a group of positive-sense RNA viruses. Virus Evol 2020; 6:veaa007. [PMID: 32064120 PMCID: PMC7010960 DOI: 10.1093/ve/veaa007] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 12/13/2022] Open
Abstract
Positive-sense single-stranded RNA viruses form the largest and most diverse group of eukaryote-infecting viruses. Their genomes comprise one or more segments of coding-sense RNA that function directly as messenger RNAs upon release into the cytoplasm of infected cells. Positive-sense RNA viruses are generally accepted to encode proteins solely on the positive strand. However, we previously identified a surprisingly long (∼1,000-codon) open reading frame (ORF) on the negative strand of some members of the family Narnaviridae which, together with RNA bacteriophages of the family Leviviridae, form a sister group to all other positive-sense RNA viruses. Here, we completed the genomes of three mosquito-associated narnaviruses, all of which have the long reverse-frame ORF. We systematically identified narnaviral sequences in public data sets from a wide range of sources, including arthropod, fungal, and plant transcriptomic data sets. Long reverse-frame ORFs are widespread in one clade of narnaviruses, where they frequently occupy >95 per cent of the genome. The reverse-frame ORFs correspond to a specific avoidance of CUA, UUA, and UCA codons (i.e. stop codon reverse complements) in the forward-frame RNA-dependent RNA polymerase ORF. However, absence of these codons cannot be explained by other factors such as inability to decode these codons or GC3 bias. Together with other analyses, we provide the strongest evidence yet of coding capacity on the negative strand of a positive-sense RNA virus. As these ORFs comprise some of the longest known overlapping genes, their study may be of broad relevance to understanding overlapping gene evolution and de novo origin of genes.
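The codon avoidance mentioned in this abstract follows from base pairing: a stop codon read on the negative strand appears on the positive strand as its reverse complement. The following is a minimal Python sketch of that relationship, not code from the cited study; the function names `reverse_complement` and `reverse_frame_is_open` are hypothetical, and the check assumes the two reading frames are codon-aligned.

```python
# Illustrative sketch only; names are hypothetical, not from the cited paper.
STOP_CODONS = ("UAG", "UAA", "UGA")
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A<->U, C<->G)."""
    return rna.translate(COMPLEMENT)[::-1]


def reverse_frame_is_open(forward_orf: str) -> bool:
    """True if the codon-aligned reverse frame contains no stop codon.

    Assumes len(forward_orf) is a multiple of 3, so that reverse-strand
    codons line up exactly with forward-strand codons.
    """
    rc = reverse_complement(forward_orf)
    codons = [rc[i:i + 3] for i in range(0, len(rc) - 2, 3)]
    return not any(codon in STOP_CODONS for codon in codons)


# Forward-strand codons that would place a stop on the reverse strand.
forbidden = [reverse_complement(stop) for stop in STOP_CODONS]
print(forbidden)  # ['CUA', 'UUA', 'UCA']
```

Under this assumption, a forward ORF keeps the reverse frame open exactly when it contains none of these three codons in frame.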
Affiliation(s)
- Adam M Dinan
- Division of Virology, Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QP, UK
- Nina I Lukhovitskaya
- Division of Virology, Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QP, UK
- Ingrida Olendraite
- Division of Virology, Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QP, UK
- Andrew E Firth
- Division of Virology, Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QP, UK
|
30
|
DeRisi JL, Huber G, Kistler A, Retallack H, Wilkinson M, Yllanes D. An exploration of ambigrammatic sequences in narnaviruses. Sci Rep 2019; 9:17982. [PMID: 31784609 PMCID: PMC6884476 DOI: 10.1038/s41598-019-54181-3] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 09/26/2019] [Accepted: 11/11/2019] [Indexed: 11/09/2022] Open
Abstract
Narnaviruses have been described as positive-sense RNA viruses with a remarkably simple genome of ~3 kb, encoding only a highly conserved RNA-dependent RNA polymerase (RdRp). Many narnaviruses, however, are 'ambigrammatic' and harbour an additional uninterrupted open reading frame (ORF) covering almost the entire length of the reverse complement strand. No function has been described for this ORF, yet the absence of stops is conserved across diverse narnaviruses, and in every case the codons in the reverse ORF and the RdRp are aligned. The >3 kb ORF overlap on opposite strands, unprecedented among RNA viruses, motivates an exploration of the constraints imposed or alleviated by the codon alignment. Here, we show that only when the codon frames are aligned can all stop codons be eliminated from the reverse strand by synonymous single-nucleotide substitutions in the RdRp gene, suggesting a mechanism for de novo gene creation within a strongly conserved amino-acid sequence. It will be fascinating to explore what implications this coding strategy has for other aspects of narnavirus biology. Beyond narnaviruses, our rapidly expanding catalogue of viral diversity may yet reveal additional examples of this broadly-extensible principle for ambigrammatic-sequence development.
Affiliation(s)
- Joseph L DeRisi
- Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, CA, 94158, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, California, USA
- Greg Huber
- Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, CA, 94158, USA
- Amy Kistler
- Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, CA, 94158, USA
- Hanna Retallack
- Department of Biochemistry and Biophysics, University of California, San Francisco, California, USA
- Michael Wilkinson
- Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, CA, 94158, USA
- School of Mathematics and Statistics, The Open University, Walton Hall, Milton Keynes, MK7 6AA, England
- David Yllanes
- Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, CA, 94158, USA.
|
31
|
Cervera L, Gòdia F, Tarrés-Freixas F, Aguilar-Gurrieri C, Carrillo J, Blanco J, Gutiérrez-Granados S. Production of HIV-1-based virus-like particles for vaccination: achievements and limits. Appl Microbiol Biotechnol 2019; 103:7367-7384. [DOI: 10.1007/s00253-019-10038-3] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 04/15/2019] [Revised: 07/15/2019] [Accepted: 07/16/2019] [Indexed: 12/20/2022]
|
32
|
Schlub TE, Buchmann JP, Holmes EC. A Simple Method to Detect Candidate Overlapping Genes in Viruses Using Single Genome Sequences. Mol Biol Evol 2019; 35:2572-2581. [PMID: 30099499 PMCID: PMC6188560 DOI: 10.1093/molbev/msy155] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 12/13/2022] Open
Abstract
Overlapping genes in viruses maximize the coding capacity of their genomes and allow the generation of new genes without major increases in genome size. Despite their importance, the evolution and function of overlapping genes are often not well understood, in part due to difficulties in their detection. In addition, most bioinformatic approaches for the detection of overlapping genes require the comparison of multiple genome sequences that may not be available in metagenomic surveys of virus biodiversity. We introduce a simple new method for identifying candidate functional overlapping genes using single virus genome sequences. Our method uses randomization tests to estimate the expected length of open reading frames and then identifies overlapping open reading frames that significantly exceed this length and are thus predicted to be functional. We applied this method to 2548 reference RNA virus genomes and find that it has both high sensitivity and low false discovery for genes that overlap by at least 50 nucleotides. Notably, this analysis provided evidence for 29 previously undiscovered functional overlapping genes, some of which are coded in the antisense direction suggesting there are limitations in our current understanding of RNA virus replication.
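The randomization idea in this abstract can be illustrated with a short Python sketch. It is an assumption-laden toy, not the authors' implementation: the null model here simply shuffles nucleotides, whereas the published method may permute the sequence differently, and the names `longest_open_run` and `orf_length_p_value` are hypothetical.

```python
import random

# Standard-code stop codons on a DNA-alphabet sequence.
STOPS = {"TAA", "TAG", "TGA"}


def longest_open_run(seq: str, frame: int) -> int:
    """Length, in codons, of the longest stop-free run in the given frame."""
    best = run = 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOPS:
            run = 0
        else:
            run += 1
            best = max(best, run)
    return best


def orf_length_p_value(seq: str, frame: int, n_shuffles: int = 200, seed: int = 0) -> float:
    """Estimate how often a shuffled sequence matches the observed open run.

    Assumption: plain nucleotide shuffling is used as the null model
    purely for illustration.
    """
    rng = random.Random(seed)
    observed = longest_open_run(seq, frame)
    bases = list(seq)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(bases)
        if longest_open_run("".join(bases), frame) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_shuffles + 1)
```

A small p-value for an alternative frame would flag an open reading frame that is longer than expected by chance, which is the sense in which the abstract calls such frames candidates for functional overlapping genes.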
Affiliation(s)
- Timothy E Schlub
- Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia
- Jan P Buchmann
- Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre, School of Life and Environmental Sciences and Sydney Medical School, The University of Sydney, Sydney, NSW, Australia
- Edward C Holmes
- Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre, School of Life and Environmental Sciences and Sydney Medical School, The University of Sydney, Sydney, NSW, Australia
|
33
|
Brandes N, Linial M. Giant Viruses-Big Surprises. Viruses 2019; 11:v11050404. [PMID: 31052218 PMCID: PMC6563228 DOI: 10.3390/v11050404] [Citation(s) in RCA: 51] [Impact Index Per Article: 8.5] [Received: 03/22/2019] [Revised: 04/17/2019] [Accepted: 04/23/2019] [Indexed: 12/21/2022] Open
Abstract
Viruses are the most prevalent infectious agents, populating almost every ecosystem on earth. Most viruses carry only a handful of genes supporting their replication and the production of capsids. It came as a great surprise in 2003 when the first giant virus was discovered and found to have a >1 Mbp genome encoding almost a thousand proteins. Following this first discovery, dozens of giant virus strains across several viral families have been reported. Here, we provide an updated quantitative and qualitative view on giant viruses and elaborate on their shared and variable features. We review the complexity of giant viral proteomes, which include functions traditionally associated only with cellular organisms. These unprecedented functions include components of the translation machinery, DNA maintenance, and metabolic enzymes. We discuss the possible underlying evolutionary processes and mechanisms that might have shaped the diversity of giant viruses and their genomes, highlighting their remarkable capacity to hijack genes and genomic sequences from their hosts and environments. This leads us to examine prominent theories regarding the origin of giant viruses. Finally, we present the emerging ecological view of giant viruses, found across widespread habitats and ecological systems, with respect to the environment and human health.
Affiliation(s)
- Nadav Brandes
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem 91904, Israel.
- Michal Linial
- Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem 91904, Israel.
|
34
|
Pavesi A. Asymmetric evolution in viral overlapping genes is a source of selective protein adaptation. Virology 2019; 532:39-47. [PMID: 31004987 PMCID: PMC7125799 DOI: 10.1016/j.virol.2019.03.017] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 01/03/2019] [Revised: 03/25/2019] [Accepted: 03/26/2019] [Indexed: 12/29/2022]
Abstract
Overlapping genes represent an intriguing puzzle, as they encode two proteins whose ability to evolve is constrained by each other. Overlapping genes can undergo “symmetric evolution” (similar selection pressures on the two proteins) or “asymmetric evolution” (significantly different selection pressures on the two proteins). By sequence analysis of 75 pairs of homologous viral overlapping genes, I evaluated their accordance with one or the other model. Analysis of nucleotide and amino acid sequences revealed that half of the overlaps undergo asymmetric evolution, as the protein from one frame shows a significantly higher number of substitutions than the protein from the other frame. Interestingly, the most variable protein (often known to interact with the host proteins) appeared to be encoded by the de novo frame in all cases examined. These findings suggest that overlapping genes, besides increasing the coding ability of viruses, are also a source of selective protein adaptation. A dataset of 80 pairs of homologous overlapping genes from viruses is examined. Its analysis reveals that half of overlapping genes undergo asymmetric evolution. The most variable gene product is that encoded by the de novo overlapping gene. Overlapping genes evolving asymmetrically are a source of selective protein adaptation.
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area delle Scienze 11/A, I-43124, Parma, Italy.
|
35
|
Depledge DP, Srinivas KP, Sadaoka T, Bready D, Mori Y, Placantonakis DG, Mohr I, Wilson AC. Direct RNA sequencing on nanopore arrays redefines the transcriptional complexity of a viral pathogen. Nat Commun 2019; 10:754. [PMID: 30765700 PMCID: PMC6376126 DOI: 10.1038/s41467-019-08734-9] [Citation(s) in RCA: 152] [Impact Index Per Article: 25.3] [Received: 07/31/2018] [Accepted: 01/25/2019] [Indexed: 12/18/2022] Open
Abstract
Characterizing complex viral transcriptomes by conventional RNA sequencing approaches is complicated by high gene density, overlapping reading frames, and complex splicing patterns. Direct RNA sequencing (direct RNA-seq) using nanopore arrays offers an exciting alternative whereby individual polyadenylated RNAs are sequenced directly, without the recoding and amplification biases inherent to other sequencing methodologies. Here we use direct RNA-seq to profile the herpes simplex virus type 1 (HSV-1) transcriptome during productive infection of primary cells. We show how direct RNA-seq data can be used to define transcription initiation and RNA cleavage sites associated with all polyadenylated viral RNAs and demonstrate that low level read-through transcription produces a novel class of chimeric HSV-1 transcripts, including a functional mRNA encoding a fusion of the viral E3 ubiquitin ligase ICP0 and viral membrane glycoprotein L. Thus, direct RNA-seq offers a powerful method to characterize the changing transcriptional landscape of viruses with complex genomes.
Affiliation(s)
- Daniel P Depledge
- Department of Microbiology, New York University School of Medicine, New York, NY, 10016, USA.
- Tomohiko Sadaoka
- Division of Clinical Virology, Center for Infectious Diseases, Kobe University Graduate School of Medicine, 7-5-1 Kusunoki-cho, Chuo-ku, Kobe, 650-0017, Japan
- Devin Bready
- Department of Neurosurgery, New York University School of Medicine, New York, NY, 10016, USA
- Yasuko Mori
- Division of Clinical Virology, Center for Infectious Diseases, Kobe University Graduate School of Medicine, 7-5-1 Kusunoki-cho, Chuo-ku, Kobe, 650-0017, Japan
- Dimitris G Placantonakis
- Department of Neurosurgery, New York University School of Medicine, New York, NY, 10016, USA
- Kimmel Center for Stem Cell Biology, New York University School of Medicine, New York, NY, 10016, USA
- Laura and Isaac Perlmutter Cancer Center, New York University School of Medicine, New York, NY, 10016, USA
- Brain Tumor Center, New York University School of Medicine, New York, NY, 10016, USA
- Neuroscience Institute, New York University School of Medicine, New York, NY, 10016, USA
- Ian Mohr
- Department of Microbiology, New York University School of Medicine, New York, NY, 10016, USA
- Laura and Isaac Perlmutter Cancer Center, New York University School of Medicine, New York, NY, 10016, USA
- Angus C Wilson
- Department of Microbiology, New York University School of Medicine, New York, NY, 10016, USA.
- Laura and Isaac Perlmutter Cancer Center, New York University School of Medicine, New York, NY, 10016, USA.
|
36
|
Pavesi A, Vianelli A, Chirico N, Bao Y, Blinkova O, Belshaw R, Firth A, Karlin D. Overlapping genes and the proteins they encode differ significantly in their sequence composition from non-overlapping genes. PLoS One 2018; 13:e0202513. [PMID: 30339683 PMCID: PMC6195259 DOI: 10.1371/journal.pone.0202513] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 01/22/2018] [Accepted: 08/03/2018] [Indexed: 11/19/2022] Open
Abstract
Overlapping genes represent a fascinating evolutionary puzzle, since they encode two functionally unrelated proteins from the same DNA sequence. They originate by a mechanism of overprinting, in which point mutations in an existing frame allow the expression (the "birth") of a completely new protein from a second frame. In viruses, in which overlapping genes are abundant, these new proteins often play a critical role in infection, yet they are frequently overlooked during genome annotation. This results in erroneous interpretation of mutational studies and in a significant waste of resources. Therefore, overlapping genes need to be correctly detected, especially since they are now thought to be abundant also in eukaryotes. Developing better detection methods and conducting systematic evolutionary studies require a large, reliable benchmark dataset of known cases. We thus assembled a high-quality dataset of 80 viral overlapping genes whose expression is experimentally proven. Many of them were not present in databases. We found that overall, overlapping genes differ significantly from non-overlapping genes in their nucleotide and amino acid composition. In particular, the proteins they encode are enriched in high-degeneracy amino acids and depleted in low-degeneracy ones, which may alleviate the evolutionary constraints acting on overlapping genes. Principal component analysis revealed that the vast majority of overlapping genes follow a similar composition bias, despite their heterogeneity in length and function. Six proven mammalian overlapping genes also followed this bias. We propose that this apparently near-universal composition bias may either favour the birth of overlapping genes, or/and result from selection pressure acting on them.
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
- Alberto Vianelli
- Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
- Nicola Chirico
- Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
- Yiming Bao
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Olga Blinkova
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States of America
- Robert Belshaw
- School of Biomedical & Healthcare Sciences, Plymouth University Peninsula Schools of Medicine and Dentistry (PUPSMD), Plymouth, United Kingdom
- Andrew Firth
- Department of Pathology, Division of Virology, University of Cambridge, Cambridge, United Kingdom
- David Karlin
- Department of Zoology, University of Oxford, Oxford, United Kingdom
- Division of Structural Biology, University of Oxford, Oxford, United Kingdom
|
37
|
Aherfi S, Andreani J, Baptiste E, Oumessoum A, Dornas FP, Andrade ACDSP, Chabriere E, Abrahao J, Levasseur A, Raoult D, La Scola B, Colson P. A Large Open Pangenome and a Small Core Genome for Giant Pandoraviruses. Front Microbiol 2018; 9:1486. [PMID: 30042742 PMCID: PMC6048876 DOI: 10.3389/fmicb.2018.01486] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 02/21/2018] [Accepted: 06/14/2018] [Indexed: 01/09/2023] Open
Abstract
Giant viruses of amoebae differ from classical viruses in the giant size of their virions and genomes. Pandoraviruses are the record holders in size of genomes and number of predicted genes. Three strains, P. salinus, P. dulcis, and P. inopinatum, have been described to date. We isolated three new ones, namely P. massiliensis, P. braziliensis, and P. pampulha, from environmental samples collected in Brazil. We describe here their genomes, the transcriptome and proteome of P. massiliensis, and the pangenome of the group encompassing the six pandoravirus isolates. Genome sequencing was performed with an Illumina MiSeq instrument. Genome annotation was performed using the GeneMarkS and Prodigal software and comparative genomic analyses. The core genome and pangenome were determined using, notably, the ProteinOrtho and CD-HIT programs. Transcriptomics was performed for P. massiliensis with the Illumina MiSeq instrument; proteomics was also performed for this virus using 1D/2D gel electrophoresis and mass spectrometry on a Synapt G2Si Q-TOF traveling wave mobility spectrometer. The genomes of the three new pandoraviruses range between 1.6 and 1.8 Mbp in size. The genomes of P. massiliensis, P. pampulha, and P. braziliensis were predicted to harbor 1,414, 2,368, and 2,696 genes, respectively. Up to 67% of these genes are ORFans. Phylogenomic analyses showed that P. massiliensis and P. braziliensis were more closely related to each other than to the other pandoraviruses. The core genome of pandoraviruses comprises 352 clusters of genes, and the ratio core genome/pangenome is less than 0.05. The extinction curve shows clearly that the pangenome is still open. A quarter of the gene content of P. massiliensis was detected by transcriptomics. In addition, the products of a total of 162 open reading frames were found by proteomic analysis of P. massiliensis virions, including notably the products of 28 ORFans, 99 hypothetical proteins, and 90 core genes.
Further analyses should provide a better understanding of the evolution and origin of these giant pandoraviruses, and of their relationships with viruses and cellular microorganisms.
Affiliation(s)
- Sarah Aherfi
- Microbes Evolution Phylogenie et Infections (MEϕI), Institut Hospitalo-Universitaire Méditerranée Infection, Assistance Publique - Hôpitaux de Marseille, Institut de Recherche pour le Développement, Aix-Marseille Université, Marseille, France
- Julien Andreani
- Microbes Evolution Phylogenie et Infections (MEϕI), Institut Hospitalo-Universitaire Méditerranée Infection, Assistance Publique - Hôpitaux de Marseille, Institut de Recherche pour le Développement, Aix-Marseille Université, Marseille, France
- Emeline Baptiste
- Microbes Evolution Phylogenie et Infections (MEϕI), Institut Hospitalo-Universitaire Méditerranée Infection, Assistance Publique - Hôpitaux de Marseille, Institut de Recherche pour le Développement, Aix-Marseille Université, Marseille, France
- Amina Oumessoum
- Microbes Evolution Phylogenie et Infections (MEϕI), Institut Hospitalo-Universitaire Méditerranée Infection, Assistance Publique - Hôpitaux de Marseille, Institut de Recherche pour le Développement, Aix-Marseille Université, Marseille, France
- Fábio P Dornas
- Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Ana Claudia Dos S P Andrade
- Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Eric Chabriere
- Microbes Evolution Phylogenie et Infections (MEϕI), Institut Hospitalo-Universitaire Méditerranée Infection, Assistance Publique - Hôpitaux de Marseille, Institut de Recherche pour le Développement, Aix-Marseille Université, Marseille, France
- Jonatas Abrahao
- Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Anthony Levasseur
- Microbes Evolution Phylogenie et Infections (MEϕI), Institut Hospitalo-Universitaire Méditerranée Infection, Assistance Publique - Hôpitaux de Marseille, Institut de Recherche pour le Développement, Aix-Marseille Université, Marseille, France
- Didier Raoult
- Microbes Evolution Phylogenie et Infections (MEϕI), Institut Hospitalo-Universitaire Méditerranée Infection, Assistance Publique - Hôpitaux de Marseille, Institut de Recherche pour le Développement, Aix-Marseille Université, Marseille, France
- Bernard La Scola
- Microbes Evolution Phylogenie et Infections (MEϕI), Institut Hospitalo-Universitaire Méditerranée Infection, Assistance Publique - Hôpitaux de Marseille, Institut de Recherche pour le Développement, Aix-Marseille Université, Marseille, France
- Philippe Colson
- Microbes Evolution Phylogenie et Infections (MEϕI), Institut Hospitalo-Universitaire Méditerranée Infection, Assistance Publique - Hôpitaux de Marseille, Institut de Recherche pour le Développement, Aix-Marseille Université, Marseille, France
38
Mahmoudabadi G, Phillips R. A comprehensive and quantitative exploration of thousands of viral genomes. eLife 2018; 7:31955. [PMID: 29624169 PMCID: PMC5908442 DOI: 10.7554/elife.31955] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 09/17/2017] [Accepted: 03/30/2018] [Indexed: 01/27/2023] Open
Abstract
The complete assembly of viral genomes from metagenomic datasets (short genomic sequences gathered from environmental samples) has proven to be challenging, so there are significant blind spots when we view viral genomes through the lens of metagenomics. One approach to overcoming this problem is to leverage the thousands of complete viral genomes that are publicly available. Here we describe our efforts to assemble a comprehensive resource that provides a quantitative snapshot of viral genomic trends – such as gene density, noncoding percentage, and abundances of functional gene categories – across thousands of viral genomes. We have also developed a coarse-grained method for visualizing viral genome organization for hundreds of genomes at once, and have explored the extent of the overlap between bacterial and bacteriophage gene pools. Existing viral classification systems were developed prior to the sequencing era, so we present our analysis in a way that allows us to assess the utility of the different classification systems for capturing genomic trends.
Affiliation(s)
- Gita Mahmoudabadi
- Department of Bioengineering, California Institute of Technology, Pasadena, United States
- Rob Phillips
- Department of Bioengineering, California Institute of Technology, Pasadena, United States; Department of Applied Physics, California Institute of Technology, Pasadena, United States
39
Kirby LE, Koslowsky D. Mitochondrial dual-coding genes in Trypanosoma brucei. PLoS Negl Trop Dis 2017; 11:e0005989. [PMID: 28991908 PMCID: PMC5650466 DOI: 10.1371/journal.pntd.0005989] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 05/15/2017] [Revised: 10/20/2017] [Accepted: 09/23/2017] [Indexed: 12/31/2022] Open
Abstract
Trypanosoma brucei is transmitted between mammalian hosts by the tsetse fly. In the mammal, the parasites are exclusively extracellular, continuously replicating within the bloodstream. During this stage, the mitochondrion lacks a functional electron transport chain (ETC). Successful transition to the fly requires activation of the ETC and ATP synthesis via oxidative phosphorylation. This life cycle leads to a major problem: in the bloodstream, the mitochondrial genes are not under selection and are subject to genetic drift that endangers their integrity. Exacerbating this, T. brucei undergoes repeated population bottlenecks as it evades the host immune system, which would create additional forces of genetic drift. These parasites possess several unique genetic features, including RNA editing of mitochondrial transcripts. RNA editing creates open reading frames by the guided insertion and deletion of U-residues within the mRNA. A major question in the field has been why this metabolically expensive system of RNA editing would evolve and persist. Here, we show that many of the edited mRNAs can alter the choice of start codon and the open reading frame by alternative editing of the 5’ end. Analyses of mutational bias indicate that six of the mitochondrial genes may be dual-coding and that RNA editing allows access to both reading frames. We hypothesize that dual-coding genes can protect genetic information by essentially hiding a non-selected gene within one that remains under selection. Thus, the complex RNA editing system found in the mitochondria of trypanosomes provides a unique molecular strategy to combat genetic drift in non-selective conditions.
In African trypanosomes, many of the mitochondrial mRNAs require extensive RNA editing before they can be translated. During this process, each edited transcript can undergo hundreds of cleavage/ligation events as U-residues are inserted or deleted to generate a translatable open reading frame. A major paradox has been why this incredibly metabolically expensive process would evolve and persist. In this work, we show that many of the mitochondrial genes in trypanosomes are dual-coding, utilizing different reading frames to potentially produce two very different proteins. Access to both reading frames is made possible by alternative editing of the 5’ end of the transcript. We hypothesize that dual-coding genes may work to protect the mitochondrial genes from mutations during growth in the mammalian host, when many of the mitochondrial genes are not being used. Thus, the complex RNA editing system may be maintained because it provides a unique molecular strategy to combat genetic drift.
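The idea that one nucleotide sequence can encode two different proteins depending on the reading frame can be illustrated with a minimal sketch. The sequence below is a hypothetical toy example, not taken from the paper, and the sketch assumes the standard genetic code for simplicity, which may differ from the code actually used in trypanosome mitochondria.

```python
from itertools import product

BASES = "TCAG"
# Assumption: standard genetic code, with codons enumerated in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate a nucleotide sequence starting at offset `frame` (0, 1, or 2).

    RNA input is accepted (U is treated as T); trailing bases that do not
    form a complete codon are ignored. Stop codons are shown as '*'.
    """
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Hypothetical toy sequence, for illustration only.
toy = "ATGGCATGA"
for frame in range(3):
    print(frame, translate(toy, frame))
```

Shifting the start position by one or two nucleotides, which is the effect an inserted or deleted U-residue near the 5’ end would have on everything downstream, changes every codon and therefore the whole predicted protein.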
Affiliation(s)
- Laura E. Kirby
- Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, Michigan, United States of America
- Donna Koslowsky
- Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, Michigan, United States of America
40
Dorokhov YL, Sheshukova EV, Komarova TV. Tobamovirus 3'-Terminal Gene Overlap May be a Mechanism for within-Host Fitness Improvement. Front Microbiol 2017; 8:851. [PMID: 28553276 PMCID: PMC5425575 DOI: 10.3389/fmicb.2017.00851] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 01/27/2017] [Accepted: 04/25/2017] [Indexed: 12/13/2022] Open
Abstract
Overlapping genes (OGs) are a universal phenomenon in all kingdoms, and viruses display a high content of OGs combined with a high rate of evolution. It is believed that the mechanism of gene overlap is based on overprinting of an existing gene. OGs help virus genes compress a maximum amount of information into short sequences, conferring viral proteins with novel features and thereby increasing their within-host fitness. Analysis of tobamovirus 3′-terminal genes reveals at least two modes of OG organization and mechanisms of interaction with the host. Originally isolated from Solanaceae species, viruses (referred to as Solanaceae-infecting) such as tobacco mosaic virus do not show 3′-terminal overlap between movement protein (MP) and coat protein (CP) genes but do contain open reading frame 6 (ORF6), which overlaps with both genes. Conversely, tobamoviruses, originally isolated from Brassicaceae species (referred to as Brassicaceae-infecting) and also able to infect Solanaceae plants, have no ORF6 but are characterized by overlapping MP and CP genes. Our analysis showed that the MP/CP overlap of Brassicaceae-infecting tobamoviruses results in the following: (i) genome compression and strengthening of subgenomic promoters; (ii) CP gene early expression directly from genomic and dicistronic MP subgenomic mRNA using an internal ribosome entry site (IRES) and a stable hairpin structure in the overlapping region; (iii) loss of ORF6, which influences the symptomatology of Solanaceae-infecting tobamoviruses; and (iv) acquisition of an IRES polypurine-rich region encoding an MP nuclear localization signal. We believe that MP/CP gene overlap may constitute a mechanism for host range expansion and virus adjustment to Brassicaceae plants.
Affiliation(s)
- Yuri L Dorokhov
- N.I. Vavilov Institute of General Genetics, Russian Academy of Science, Moscow, Russia; A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia
- Tatiana V Komarova
- N.I. Vavilov Institute of General Genetics, Russian Academy of Science, Moscow, Russia; A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia
41
Fernandes JD, Faust TB, Strauli NB, Smith C, Crosby DC, Nakamura RL, Hernandez RD, Frankel AD. Functional Segregation of Overlapping Genes in HIV. Cell 2017; 167:1762-1773.e12. [PMID: 27984726 DOI: 10.1016/j.cell.2016.11.031] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 05/22/2016] [Revised: 09/29/2016] [Accepted: 11/15/2016] [Indexed: 11/28/2022]
Abstract
Overlapping genes pose an evolutionary dilemma as one DNA sequence evolves under the selection pressures of multiple proteins. Here, we perform systematic statistical and mutational analyses of the overlapping HIV-1 genes tat and rev and engineer exhaustive libraries of non-overlapped viruses to perform deep mutational scanning of each gene independently. We find a "segregated" organization in which overlapped sites encode functional residues of one gene or the other, but never both. Furthermore, this organization eliminates unfit genotypes, providing a fitness advantage to the population. Our comprehensive analysis reveals the extraordinary manner in which HIV minimizes the constraint of overlapping genes and repurposes that constraint to its own advantage. Thus, overlaps are not just consequences of evolutionary constraints, but rather can provide population fitness advantages.
Affiliation(s)
- Jason D Fernandes
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA; Program in Pharmaceutical Sciences and Pharmacogenomics, University of California San Francisco, San Francisco, CA 94158, USA
- Tyler B Faust
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA; Tetrad Program, Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA
- Nicolas B Strauli
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA; Biomedical Sciences Graduate Program, University of California San Francisco, San Francisco, CA 94158, USA
- Cynthia Smith
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA
- David C Crosby
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA
- Robert L Nakamura
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA
- Ryan D Hernandez
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Alan D Frankel
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA.
42
The ins and outs of eukaryotic viruses: Knowledge base and ontology of a viral infection. PLoS One 2017; 12:e0171746. [PMID: 28207819 PMCID: PMC5313201 DOI: 10.1371/journal.pone.0171746] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 10/20/2016] [Accepted: 01/25/2017] [Indexed: 12/19/2022] Open
Abstract
Viruses are genetically diverse, infect a wide range of tissues and host cells, and follow unique processes for replicating themselves. All these processes were investigated and indexed in the ViralZone knowledge base. To facilitate standardizing data, a simple ontology of viral life-cycle terms was developed to provide a common vocabulary for annotating data sets. New terminology was developed to address unique viral replication cycle processes, and existing terminology was modified and adapted. The virus life-cycle is classically described by schematic pictures. Using this ontology, it can be represented by a combination of successive terms: “entry”, “latency”, “transcription”, “replication” and “exit”. Each of these parts is broken down into discrete steps. For example, Zika virus “entry” is broken down into successive steps: “Attachment”, “Apoptotic mimicry”, “Viral endocytosis/macropinocytosis”, “Fusion with host endosomal membrane”, “Viral factory”. To demonstrate the utility of a standard ontology for virus biology, this work was completed by annotating virus data in the ViralZone, UniProtKB and Gene Ontology databases.
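As a rough illustration of how a life cycle can be written as successive ontology terms, the sketch below stores the five stages and the Zika virus “entry” steps quoted in the abstract in plain Python containers. The names `LIFE_CYCLE_STAGES`, `ZIKA_STEPS`, and `ordered_steps` are hypothetical and are not part of ViralZone or the Gene Ontology; only the term labels come from the abstract.

```python
# Stage order as listed in the abstract.
LIFE_CYCLE_STAGES = ["entry", "latency", "transcription", "replication", "exit"]

# Steps quoted in the abstract for Zika virus "entry"; the other stages are
# left empty here because the abstract does not list their steps.
ZIKA_STEPS = {
    "entry": [
        "Attachment",
        "Apoptotic mimicry",
        "Viral endocytosis/macropinocytosis",
        "Fusion with host endosomal membrane",
        "Viral factory",
    ],
}


def ordered_steps(steps_by_stage):
    """Flatten a stage-to-steps mapping into (stage, step) pairs in cycle order."""
    return [
        (stage, step)
        for stage in LIFE_CYCLE_STAGES
        for step in steps_by_stage.get(stage, [])
    ]


for stage, step in ordered_steps(ZIKA_STEPS):
    print(f"{stage}: {step}")
```

Keeping the stage order in one list and the per-virus steps in a separate mapping means that different viruses can share the same vocabulary while listing only the steps that apply to them.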