1. Cao M, Qiu Q, Zhang X, Zhang W, Shen Z, Ma C, Zhu M, Pan J, Tong X, Cao G, Gong C, Hu X. Identification and characterization of a novel small viral peptide (VSP59) encoded by Bombyx mori cypovirus (BmCPV) that negatively regulates viral replication. Microbiol Spectr 2024; 12:e0082624. [PMID: 39382281 PMCID: PMC11537000 DOI: 10.1128/spectrum.00826-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2024] [Accepted: 08/16/2024] [Indexed: 10/10/2024] [Open Access]
Abstract
Bombyx mori cypovirus (BmCPV), a member of the Reoviridae family, is a well-established research model for double-stranded RNA (dsRNA) viruses with segmented genomes. Despite its small genome size, the coding potential of BmCPV remains largely unexplored. In this study, we identified a novel small open reading frame within the S10 dsRNA genome, encoding a small viral peptide (VSP59) with 59 amino acid residues. Functional characterization revealed that VSP59 acts as a negative regulator of viral replication. VSP59 predominantly localizes to the cytoplasm, where it interacts with prohibitin 2 (PHB2), an inner membrane mitophagy receptor. This interaction targets mitochondria and triggers caspase 3-dependent apoptosis. Transient expression of vsp59 in BmN cells suppressed viral replication, an effect that was reversed by silencing PHB2 expression. Moreover, recombinant BmCPV with a mutated vsp59 exhibited reduced replication. Our findings demonstrate that VSP59 interacts with PHB2 on mitochondria, inducing apoptosis and thereby diminishing viral replication. This study expands our understanding of the genetic information encoded by the BmCPV genome and highlights the role of novel small peptides in host-virus interactions. IMPORTANCE A novel small open reading frame (sORF) from the viral genome was identified and characterized. The sORF could encode a small viral peptide (VSP59) that targeted mitochondria and induced prohibitin 2-related apoptosis, further attenuating Bombyx mori cypovirus replication.
Affiliation(s)
- Manman Cao
  - School of Life Science, Soochow University, Suzhou, China
- Qunnan Qiu
  - School of Life Science, Soochow University, Suzhou, China
- Xing Zhang
  - School of Chemistry and Life Science, Suzhou University of Science and Technology, Suzhou, China
- Wenxue Zhang
  - School of Life Science, Soochow University, Suzhou, China
- Zeen Shen
  - School of Life Science, Soochow University, Suzhou, China
- Chang Ma
  - School of Life Science, Soochow University, Suzhou, China
- Min Zhu
  - School of Life Science, Soochow University, Suzhou, China
- Jun Pan
  - School of Life Science, Soochow University, Suzhou, China
- Xingyu Tong
  - School of Life Science, Soochow University, Suzhou, China
- Guangli Cao
  - School of Life Science, Soochow University, Suzhou, China
- Chengliang Gong
  - School of Life Science, Soochow University, Suzhou, China
  - Institute of Agricultural Biotechnology and Ecological Research, Soochow University, Suzhou, China
- Xiaolong Hu
  - School of Life Science, Soochow University, Suzhou, China
  - Institute of Agricultural Biotechnology and Ecological Research, Soochow University, Suzhou, China
2. Legarda EG, Elena SF, Mushegian AR. Emergence of two distinct spatial folds in a pair of plant virus proteins encoded by nested genes. J Biol Chem 2024; 300:107218. [PMID: 38522515 PMCID: PMC11044054 DOI: 10.1016/j.jbc.2024.107218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 03/15/2024] [Accepted: 03/19/2024] [Indexed: 03/26/2024] [Open Access]
Abstract
Virus genomes may encode overlapping or nested open reading frames that increase their coding capacity. It is not known whether the constraints on spatial structures of the two encoded proteins limit the evolvability of nested genes. We examine the evolution of a pair of proteins, p22 and p19, encoded by nested genes in plant viruses from the genus Tombusvirus. The known structure of p19, a suppressor of RNA silencing, belongs to the RAGNYA fold from the alpha+beta class. The structure of p22, the cell-to-cell movement protein from the 30K family widespread in plant viruses, is predicted with the AlphaFold approach, suggesting a single jelly-roll fold core from the all-beta class, structurally similar to capsid proteins from plant and animal viruses. The nucleotide and codon preferences impose modest constraints on the types of secondary structures encoded in the alternative reading frames, nonetheless allowing for compact, well-ordered folds from different structural classes in two similarly-sized nested proteins. Tombusvirus p22 emerged through radiation of the widespread 30K family, which evolved by duplication of a virus capsid protein early in the evolution of plant viruses, whereas lineage-specific p19 may have emerged by a stepwise increase in the length of the overprinted gene and incremental acquisition of functionally active secondary structure elements by the protein product. This evolution of p19 toward the RAGNYA fold represents one of the first documented examples of protein structure convergence in naturally occurring proteins.
Affiliation(s)
- Esmeralda G Legarda
  - Instituto de Biología Integrativa de Sistemas (I2SysBio), CSIC-Universitat de València, Paterna, València, Spain
- Santiago F Elena
  - Instituto de Biología Integrativa de Sistemas (I2SysBio), CSIC-Universitat de València, Paterna, València, Spain; The Santa Fe Institute, Santa Fe, New Mexico, USA
- Arcady R Mushegian
  - Division of Molecular and Cellular Biosciences, National Science Foundation, Arlington, Virginia, USA
3. Pavesi A, Romerio F. Creation of the HIV-1 antisense gene asp coincided with the emergence of the pandemic group M and is associated with faster disease progression. Microbiol Spectr 2024; 12:e0380223. [PMID: 38230940 PMCID: PMC10846101 DOI: 10.1128/spectrum.03802-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 12/19/2023] [Indexed: 01/18/2024] [Open Access]
Abstract
Despite being first identified more than three decades ago, the antisense gene asp of HIV-1 remains an enigma. asp is present uniquely in pandemic (group M) HIV-1 strains, and it is absent in all non-pandemic (out-of-M) HIV-1 strains and virtually all non-human primate lentiviruses. This suggests that the creation of asp may have contributed to HIV-1 fitness or worldwide spread. It also raises the question of which evolutionary processes were at play in the creation of asp. Here, we show that HIV-1 genomes containing an intact asp gene are associated with faster HIV-1 disease progression. Furthermore, we demonstrate that the creation of a full-length asp gene occurred via the evolution of codon usage in env overlapping asp on the opposite strand. This involved differential use of synonymous codons or conservative amino acid substitution in env that eliminated internal stop codons in asp, and redistribution of synonymous codons in env that minimized the likelihood of new premature stops arising in asp. Nevertheless, the creation of a full-length asp gene reduced the genetic diversity of env. The Luria-Delbruck fluctuation test suggests that the interrupted asp open reading frame (ORF) is the progenitor of the intact ORF, rather than a descendant under random genetic drift. Therefore, the existence of group-M isolates with a truncated asp ORF indicates an incomplete transition process. For the first time, our study links the presence of a full-length asp ORF to faster disease progression, thus warranting further investigation into the cellular processes and molecular mechanisms through which the ASP protein impacts HIV-1 replication, transmission, and pathogenesis. IMPORTANCE Overlapping genes engage in a tug-of-war, constraining each other's evolution. The creation of a new gene overlapping an existing one comes at an evolutionary cost. Thus, its conservation must be advantageous, or it will be lost, especially if the pre-existing gene is essential for the viability of the virus or cell. We found that the creation and conservation of the HIV-1 antisense gene asp occurred through differential use of synonymous codons or conservative amino acid substitutions within the overlapping gene, env. This process did not involve amino acid changes in ENV that benefited its function, but rather it constrained the evolution of ENV. Nonetheless, the creation of asp brought a net selective advantage to HIV-1 because asp is conserved especially among high-prevalence strains. The association between the presence of an intact asp gene and faster HIV-1 disease progression supports that conclusion and warrants further investigation.
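The mechanism named in this abstract (a synonymous codon change on the sense strand removing a stop codon from the antisense frame) can be illustrated with a toy sketch. The 9-nt sequences below are made up and are not env or asp sequence. The sketch also assumes that antisense codons are aligned with the sense codons, which is a simplification for clarity and not a statement about the real env/asp frame relationship. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons of `seq`, starting at position 0."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def antisense_stops(sense):
    """Positions (on the reverse complement) of codon-aligned stop codons."""
    rc = reverse_complement(sense)
    return [i for i in range(0, len(rc) - 2, 3) if rc[i:i + 3] in STOPS]


# Toy sense-strand sequences (assumed for illustration only).
WILD = "ATGCTAGGA"     # Met-Leu-Gly; CTA reads as TAG on the opposite strand
MUTANT = "ATGCTGGGA"   # Met-Leu-Gly; CTG reads as CAG on the opposite strand

print(translate(WILD), translate(MUTANT))
print(antisense_stops(WILD), antisense_stops(MUTANT))
```

In this toy case the sense-strand protein is unchanged while the antisense stop disappears, which is the kind of change the abstract attributes to synonymous codon choice in env.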
Affiliation(s)
- Angelo Pavesi
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
- Fabio Romerio
  - Department of Molecular and Comparative Pathobiology, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
4. Bukhnikashvili L. Overlaps Between CDS Regions of Protein-Coding Genes in the Human Genome: A Case Study on the NR1D1-THRA Gene Pair. J Mol Evol 2023; 91:963-975. [PMID: 38006429 DOI: 10.1007/s00239-023-10147-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2023] [Accepted: 11/12/2023] [Indexed: 11/27/2023]
Abstract
For several decades, it has been known that a substantial number of genes within human DNA exhibit overlap; however, the biological and evolutionary significance of these overlaps remains poorly understood. This study focused on investigating specific instances of overlap where the overlapping DNA region encompasses the coding DNA sequences (CDSs) of protein-coding genes. The results revealed that proteins encoded by overlapping CDSs exhibit greater disorder than those from nonoverlapping CDSs. Additionally, these DNA regions were identified as GC-rich. This could be partially attributed to the absence of stop codons from two distinct reading frames rather than one. Furthermore, these regions were found to harbour fewer single-nucleotide polymorphism (SNP) sites, possibly due to constraints arising from the overlapping state where mutations could affect two genes simultaneously. While elucidating these properties, the NR1D1-THRA gene pair emerged as an exceptional case with highly structured proteins and a distinctly conserved sequence across eutherian mammals. Both NR1D1 and THRA are nuclear receptors lacking a ligand-binding domain at their C-terminus, which is the region where the two genes overlap. The NR1D1 gene is involved in the regulation of circadian rhythm, while the THRA gene encodes a thyroid hormone receptor, and both play crucial roles in various physiological processes. This study suggests that, in addition to their well-established functions, the specifically overlapping CDS regions of these genes may encode protein segments with additional, yet undiscovered, biological roles.
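The link this abstract draws between GC-richness and the absence of stop codons in two reading frames can be made concrete with a back-of-the-envelope sketch. The three standard stop codons (TAA, TAG, TGA) are AT-rich, so a random sequence is more likely to stay stop-free as GC content rises. The model below is an assumption and not the study's method: bases are drawn independently with A = T and G = C, and the two frames are treated as independent, which is only an approximation. The function names are hypothetical.

```python
def stop_probability(gc):
    """Chance that a random codon is TAA, TAG or TGA at a given GC content.

    Assumes independent bases with A = T = (1 - gc) / 2 and G = C = gc / 2.
    """
    at = (1.0 - gc) / 2.0   # frequency of A (and of T)
    g = gc / 2.0            # frequency of G (and of C)
    # P(TAA) + P(TAG) + P(TGA) = t*a*a + t*a*g + t*g*a
    return at * at * (at + 2.0 * g)


def stop_free_probability(gc, codons, frames=1):
    """Chance that `codons` consecutive codons carry no stop in each of
    `frames` reading frames (frames treated as independent: an approximation)."""
    return (1.0 - stop_probability(gc)) ** (codons * frames)


# Compare one open frame with two open frames over 100 codons.
for gc in (0.4, 0.5, 0.6):
    one = stop_free_probability(gc, 100, frames=1)
    two = stop_free_probability(gc, 100, frames=2)
    print(f"GC={gc:.1f}  one frame: {one:.4f}  two frames: {two:.6f}")
```

Under these assumptions, requiring two open frames penalizes AT-rich sequence more strongly than requiring one, which is consistent with the partial explanation offered in the abstract.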
5. Ardern Z. Alternative Reading Frames are an Underappreciated Source of Protein Sequence Novelty. J Mol Evol 2023; 91:570-580. [PMID: 37326679 DOI: 10.1007/s00239-023-10122-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/19/2022] [Accepted: 05/31/2023] [Indexed: 06/17/2023]
Abstract
Protein-coding DNA sequences can be translated into completely different amino acid sequences if the nucleotide triplets used are shifted by a non-triplet amount on the same DNA strand or by translating codons from the opposite strand. Such "alternative reading frames" of protein-coding genes are a major contributor to the evolution of novel protein products. Recent studies demonstrating this include examples across the three domains of cellular life and in viruses. These sequences increase the number of trials potentially available for the evolutionary invention of new genes and also have unusual properties which may facilitate gene origin. There is evidence that the structure of the standard genetic code contributes to the features and gene-likeness of some alternative frame sequences. These findings have important implications across diverse areas of molecular biology, including for genome annotation, structural biology, and evolutionary genomics.
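The idea of alternative reading frames described in this abstract can be sketched as a six-frame translation: three frames on the given strand and three on the reverse complement. This is a minimal illustration that assumes the standard genetic code and an upper-case ACGT input. The frame labels (`+1` to `-3`) and function names are hypothetical choices, and other labeling conventions exist.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq, offset=0):
    """Translate complete codons of `seq`, starting `offset` bases in."""
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def six_frames(seq):
    """Map a frame label to the peptide read in that frame ('*' marks a stop)."""
    rc = reverse_complement(seq)
    frames = {f"+{k + 1}": translate(seq, k) for k in range(3)}
    frames.update({f"-{k + 1}": translate(rc, k) for k in range(3)})
    return frames


for label, peptide in six_frames("ATGGCCTAA").items():
    print(label, peptide)
```

Each frame of the same nucleotide string yields a different amino acid sequence, which is why shifted and antisense frames can act as a source of sequence novelty.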
6. Kienzle L, Bettinazzi S, Choquette T, Brunet M, Khorami HH, Jacques JF, Moreau M, Roucou X, Landry CR, Angers A, Breton S. A small protein coded within the mitochondrial canonical gene nd4 regulates mitochondrial bioenergetics. BMC Biol 2023; 21:111. [PMID: 37198654 DOI: 10.1186/s12915-023-01609-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/17/2022] [Accepted: 05/03/2023] [Indexed: 05/19/2023] [Open Access]
Abstract
BACKGROUND Mitochondria have a central role in cellular functions, aging, and in certain diseases. They possess their own genome, a vestige of their bacterial ancestor. Over the course of evolution, most of the genes of the ancestor have been lost or transferred to the nucleus. In humans, the mtDNA is a very small circular molecule with a functional repertoire limited to only 37 genes. Its extremely compact nature with genes arranged one after the other and separated by short non-coding regions suggests that there is little room for evolutionary novelties. This is radically different from bacterial genomes, which are also circular but much larger, and in which we can find genes inside other genes. These sequences, different from the reference coding sequences, are called alternative open reading frames or altORFs, and they are involved in key biological functions. However, whether altORFs exist in mitochondrial protein-coding genes or elsewhere in the human mitogenome has not been fully addressed. RESULTS We found a downstream alternative ATG initiation codon in the +3 reading frame of the human mitochondrial nd4 gene. This newly characterized altORF encodes a 99-amino-acid-long polypeptide, MTALTND4, which is conserved in primates. Our custom antibody, but not the pre-immune serum, was able to immunoprecipitate MTALTND4 from HeLa cell lysates, confirming the existence of an endogenous MTALTND4 peptide. The protein is localized in mitochondria and cytoplasm and is also found in the plasma, and it impacts cell and mitochondrial physiology. CONCLUSIONS Many human mitochondrial translated ORFs might have so far gone unnoticed. By ignoring mtaltORFs, we have underestimated the coding potential of the mitogenome. Alternative mitochondrial peptides such as MTALTND4 may offer a new framework for the investigation of mitochondrial functions and diseases.
Affiliation(s)
- Laura Kienzle
  - Département de sciences biologiques, Université de Montréal, Montréal, Canada
- Stefano Bettinazzi
  - Département de sciences biologiques, Université de Montréal, Montréal, Canada
- Thierry Choquette
  - Département de sciences biologiques, Université de Montréal, Montréal, Canada
- Marie Brunet
  - Service de génétique médicale, Département de pédiatrie, Université de Sherbrooke, Sherbrooke, Canada
  - Centre de recherche du Centre hospitalier universitaire de Sherbrooke (CRCHUS), Sherbrooke, Canada
- Jean-François Jacques
  - Département de biochimie et génomique fonctionnelle, Université de Sherbrooke, Sherbrooke, Canada
- Mathilde Moreau
  - Département de biochimie et génomique fonctionnelle, Université de Sherbrooke, Sherbrooke, Canada
- Xavier Roucou
  - Centre de recherche du Centre hospitalier universitaire de Sherbrooke (CRCHUS), Sherbrooke, Canada
  - Département de biochimie et génomique fonctionnelle, Université de Sherbrooke, Sherbrooke, Canada
- Christian R Landry
  - Département de biochimie, de microbiologie et de bio-informatique, Faculté des sciences et de génie, Université Laval, Québec, Canada
  - Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
  - PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, Canada
  - Centre de recherche sur les données massives, Université Laval, Québec, Canada
  - Département de biologie, Faculté des sciences et de génie, Université Laval, Québec, Canada
- Annie Angers
  - Département de sciences biologiques, Université de Montréal, Montréal, Canada
- Sophie Breton
  - Département de sciences biologiques, Université de Montréal, Montréal, Canada
7. Graf F, Zehentner B, Fellner L, Scherer S, Neuhaus K. Three Novel Antisense Overlapping Genes in E. coli O157:H7 EDL933. Microbiol Spectr 2023; 11:e0235122. [PMID: 36533921 PMCID: PMC9927249 DOI: 10.1128/spectrum.02351-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Accepted: 12/03/2022] [Indexed: 12/23/2022] [Open Access]
Abstract
The abundance of long overlapping genes in prokaryotic genomes is likely to be significantly underestimated. To date, only a few examples of such genes are fully established. Using RNA sequencing and ribosome profiling, we found expression of novel overlapping open reading frames in Escherichia coli O157:H7 EDL933 (EHEC). Indeed, the overlapping candidate genes are equipped with typical structural elements required for transcription and translation, i.e., promoters, transcription start sites, as well as terminators, all of which were experimentally verified. Translationally arrested mutants, unable to produce the overlapping encoded protein, were found to have a growth disadvantage when grown competitively against the wild type. Thus, the phenotypes found imply biological functionality of the genes at the level of proteins produced. The addition of 3 more examples of prokaryotic overlapping genes to the currently limited, yet constantly growing pool of such genes emphasizes the underestimated coding capacity of bacterial genomes. IMPORTANCE The abundance of long overlapping genes in prokaryotic genomes is likely to be significantly underestimated, since such genes are not allowed in genome annotations. However, ribosome profiling captures mRNA at the moment it serves as a template for protein production. Using this technique and subsequent experiments, we verified 3 novel overlapping genes encoded antisense to known genes. This adds more examples of prokaryotic overlapping genes to the currently limited, yet constantly growing pool of such genes.
Affiliation(s)
- Franziska Graf
  - Core Facility Microbiome, ZIEL – Institute for Food & Health, Technische Universität München, Freising, Germany
  - Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Freising, Germany
- Barbara Zehentner
  - Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Freising, Germany
- Lea Fellner
  - Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Freising, Germany
- Siegfried Scherer
  - Core Facility Microbiome, ZIEL – Institute for Food & Health, Technische Universität München, Freising, Germany
  - Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Freising, Germany
- Klaus Neuhaus
  - Core Facility Microbiome, ZIEL – Institute for Food & Health, Technische Universität München, Freising, Germany
  - Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Freising, Germany
8. Functional benefit of structural disorder for the replication of measles, Nipah and Hendra viruses. Essays Biochem 2022; 66:915-934. [PMID: 36148633 DOI: 10.1042/ebc20220045] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/30/2022] [Revised: 08/18/2022] [Accepted: 08/25/2022] [Indexed: 12/24/2022]
Abstract
Measles, Nipah and Hendra viruses are severe human pathogens within the Paramyxoviridae family. Their non-segmented, single-stranded, negative-sense RNA genome is encapsidated by the nucleoprotein (N) within a helical nucleocapsid that is the substrate used by the viral RNA-dependent RNA polymerase (RdRp) for transcription and replication. The RdRp is a complex made of the large protein (L) and of the phosphoprotein (P), the latter serving as an obligate polymerase cofactor and as a chaperone for N. Both the N and P proteins are enriched in intrinsically disordered regions (IDRs), i.e. regions devoid of stable secondary and tertiary structure. N possesses a C-terminal IDR (NTAIL), while P consists of a large, intrinsically disordered N-terminal domain (NTD) and a C-terminal domain (CTD) encompassing alternating disordered and ordered regions. The V and W proteins, two non-structural proteins that are encoded by the P gene via a mechanism of co-transcriptional editing of the P mRNA, are prevalently disordered too, sharing with P the disordered NTD. They are key players in the evasion of the host antiviral response and were shown to phase separate and to form amyloid-like fibrils in vitro. In this review, we summarize the available information on IDRs within the N, P, V and W proteins from these three model paramyxoviruses and describe their molecular partnership. We discuss the functional benefit of disorder to virus replication in light of the critical role of IDRs in affording promiscuity, multifunctionality, fine regulation of interaction strength, scaffolding functions and in promoting liquid-liquid phase separation and fibrillation.
9. Aylward FO, Moniruzzaman M. Viral Complexity. Biomolecules 2022; 12:1061. [PMID: 36008955 PMCID: PMC9405923 DOI: 10.3390/biom12081061] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/18/2022] [Revised: 07/25/2022] [Accepted: 07/27/2022] [Indexed: 12/18/2022] [Open Access]
Abstract
Although viruses are traditionally viewed as streamlined and simple, discoveries over the last century have revealed that they can exhibit surprisingly complex physical structures, genomic organization, ecological interactions, and evolutionary histories. Viruses can have physical dimensions and genome lengths that exceed those of many cellular lineages, and their infection strategies can involve a remarkable level of physiological remodeling of their host cells. Virus-virus communication and widespread forms of hyperparasitism have been shown to be common in the virosphere, demonstrating that dynamic ecological interactions often shape their success. And the evolutionary histories of viruses are often fraught with complexities, with chimeric genomes including genes derived from numerous distinct sources or evolved de novo. Here we will discuss many aspects of this viral complexity, with particular emphasis on large DNA viruses, and provide an outlook for future research.
Affiliation(s)
- Frank O. Aylward
  - Department of Biological Sciences, Virginia Tech, Blacksburg, VA 24061, USA
  - Center for Emerging, Zoonotic, and Arthropod-Borne Pathogens, Virginia Tech, Blacksburg, VA 24061, USA
- Mohammad Moniruzzaman
  - Rosenstiel School of Marine and Atmospheric Science, University of Miami, Coral Gables, FL 33149, USA
10. Santos TCB, Dingjan T, Futerman AH. The sphingolipid anteome: implications for evolution of the sphingolipid metabolic pathway. FEBS Lett 2022; 596:2345-2363. [PMID: 35899376 DOI: 10.1002/1873-3468.14457] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/16/2022] [Revised: 07/10/2022] [Accepted: 07/19/2022] [Indexed: 11/09/2022]
Abstract
Modern cell membranes contain a bewildering complexity of lipids, among them sphingolipids (SLs). Advances in mass spectrometry have led to the realization that the number and combinatorial complexity of lipids, including SLs, is much greater than previously appreciated. SLs are generated de novo by four enzymes, namely serine palmitoyltransferase, 3-ketodihydrosphingosine reductase, ceramide synthase and dihydroceramide Δ4-desaturase 1. Some of these enzymes depend on the availability of specific substrates and cofactors, which are themselves supplied by other complex metabolic pathways. The evolution of these four enzymes is poorly understood and likely depends on the co-evolution of the metabolic pathways that supply the other essential reaction components. Here, we introduce the concept of the 'anteome', from the Latin ante ('before') to describe the network of metabolic ('omic') pathways that must have converged in order for these pathways to co-evolve and permit SL synthesis. We also suggest that current origin of life and evolutionary models lack appropriate experimental support to explain the appearance of this complex metabolic pathway and its anteome.
Affiliation(s)
- Tania C B Santos
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, 76100, Israel
- Tamir Dingjan
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, 76100, Israel
- Anthony H Futerman
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, 76100, Israel
11. Pley C, Lourenço J, McNaughton AL, Matthews PC. Spacer Domain in Hepatitis B Virus Polymerase: Plugging a Hole or Performing a Role? J Virol 2022; 96:e0005122. [PMID: 35412348 PMCID: PMC9093120 DOI: 10.1128/jvi.00051-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2022] [Accepted: 03/14/2022] [Indexed: 11/25/2022] [Open Access]
Abstract
Hepatitis B virus (HBV) polymerase is divided into terminal protein, spacer, reverse transcriptase, and RNase domains. Spacer has previously been considered dispensable, merely acting as a tether between other domains or providing plasticity to accommodate deletions and mutations. We explore evidence for the role of spacer sequence, structure, and function in HBV evolution and lineage, consider its associations with escape from drugs, vaccines, and immune responses, and review its potential impacts on disease outcomes.
Affiliation(s)
- Caitlin Pley
  - School of Clinical Medicine, University of Cambridge, Cambridge, United Kingdom
  - Guy’s and St Thomas’ NHS Foundation Trust, London, United Kingdom
- José Lourenço
  - Department of Zoology, University of Oxford, Oxford, United Kingdom
  - Biosystems and Integrative Sciences Institute, University of Lisbon, Lisbon, Portugal
- Anna L. McNaughton
  - Population Health Science, Bristol Medical School, University of Bristol, Bristol, United Kingdom
  - Nuffield Department of Medicine, University of Oxford Medawar Building, Oxford, United Kingdom
- Philippa C. Matthews
  - Nuffield Department of Medicine, University of Oxford Medawar Building, Oxford, United Kingdom
  - The Francis Crick Institute, London, United Kingdom
  - Division of Infection and Immunity, University College London, London, United Kingdom
12. Safari M, Jayaraman B, Zommer H, Yang S, Smith C, Fernandes JD, Frankel AD. Functional and structural segregation of overlapping helices in HIV-1. eLife 2022; 11:e72482. [PMID: 35511220 PMCID: PMC9119678 DOI: 10.7554/elife.72482] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/26/2021] [Accepted: 04/19/2022] [Indexed: 11/13/2022] [Open Access]
Abstract
Overlapping coding regions balance selective forces between multiple genes. One possible division of nucleotide sequence is that the predominant selective force on a particular nucleotide can be attributed to just one gene. While this arrangement has been observed in regions in which one gene is structured and the other is disordered, we sought to explore how overlapping genes balance constraints when both protein products are structured over the same sequence. We use a combination of sequence analysis, functional assays, and selection experiments to examine an overlapped region in HIV-1 that encodes helical regions in both Env and Rev. We find that functional segregation occurs even in this overlap, with each protein spacing its functional residues in a manner that allows a mutable non-binding face of one helix to encode important functional residues on a charged face in the other helix. Additionally, our experiments reveal novel and critical functional residues in Env and have implications for the therapeutic targeting of HIV-1.
Affiliation(s)
- Maliheh Safari
  - Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, United States
- Bhargavi Jayaraman
  - Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, United States
- Henni Zommer
  - Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, United States
- Shumin Yang
  - Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, United States
  - School of Medicine, Tsinghua University, Beijing, China
- Cynthia Smith
  - Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, United States
- Jason D Fernandes
  - Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, United States
- Alan D Frankel
  - Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, United States
13. Muñoz-Baena L, Poon AFY. Using networks to analyze and visualize the distribution of overlapping genes in virus genomes. PLoS Pathog 2022; 18:e1010331. [PMID: 35202429 PMCID: PMC8903798 DOI: 10.1371/journal.ppat.1010331] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/02/2021] [Revised: 03/08/2022] [Accepted: 02/02/2022] [Indexed: 11/19/2022] [Open Access]
Abstract
Gene overlap occurs when two or more genes are encoded by the same nucleotides. This phenomenon is found in all taxonomic domains, but is particularly common in viruses, where it may increase the information content of compact genomes or influence the creation of new genes. Here we report a global comparative study of overlapping open reading frames (OvRFs) of 12,609 virus reference genomes in the NCBI database. We retrieved metadata associated with all annotated open reading frames (ORFs) in each genome record to calculate the number, length, and frameshift of OvRFs. Our results show that while the number of OvRFs increases with genome length, they tend to be shorter in longer genomes. The majority of overlaps involve +2 frameshifts, predominantly found in dsDNA viruses. Antisense overlaps in which one of the ORFs was encoded in the same frame on the opposite strand (−0) tend to be longer. Next, we develop a new graph-based representation of the distribution of overlaps among the ORFs of genomes in a given virus family. In the absence of an unambiguous partition of ORFs by homology at this taxonomic level, we used an alignment-free k-mer based approach to cluster protein coding sequences by similarity. We connect these clusters with two types of directed edges to indicate (1) that constituent ORFs are adjacent in one or more genomes, and (2) that these ORFs overlap. These adjacency graphs not only provide a natural visualization scheme, but also a novel statistical framework for analyzing the effects of gene- and genome-level attributes on the frequencies of overlaps.
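The per-genome bookkeeping this abstract describes (number, length, and frameshift of overlapping ORFs, derived from annotation coordinates) can be sketched roughly as follows. This is not the authors' code. The `ORF` record, the 0-based half-open coordinates, and the labels are assumptions made for illustration. Same-strand pairs get a `+0`/`+1`/`+2` offset, while opposite-strand pairs are only tagged `antisense`, because the phase convention for antisense overlaps used in the study is not reproduced here.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class ORF:
    name: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    strand: int  # +1 or -1


def overlap(a, b):
    """Return (overlap length in nt, frame label), or None if no overlap."""
    length = min(a.end, b.end) - max(a.start, b.start)
    if length <= 0:
        return None
    if a.strand == b.strand:
        # Offset between the two codon grids, measured along the forward
        # strand; assumes both ORF lengths are multiples of three.
        label = f"+{(b.start - a.start) % 3}"
    else:
        label = "antisense"
    return length, label


def all_overlaps(orfs):
    """Collect every overlapping pair among the ORFs of one genome."""
    found = {}
    for a, b in combinations(sorted(orfs, key=lambda o: o.start), 2):
        hit = overlap(a, b)
        if hit is not None:
            found[(a.name, b.name)] = hit
    return found


# Hypothetical annotation for a toy genome.
genome = [
    ORF("A", 0, 300, +1),
    ORF("B", 250, 400, +1),
    ORF("C", 350, 500, -1),
    ORF("D", 600, 700, +1),
]
for pair, (length, label) in all_overlaps(genome).items():
    print(pair, length, label)
```

Summing such records over many genomes would give the counts and length distributions that the study relates to genome length and to frame type.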
Affiliation(s)
- Laura Muñoz-Baena
- Department of Microbiology and Immunology, Western University, London, ON, Canada
- Art F. Y. Poon
- Department of Microbiology and Immunology, Western University, London, ON, Canada
- Department of Pathology and Laboratory Medicine, Western University, London, ON, Canada
|
14
|
Gene Overlapping as a Modulator of Begomovirus Evolution. Microorganisms 2022; 10:microorganisms10020366. [PMID: 35208820 PMCID: PMC8875319 DOI: 10.3390/microorganisms10020366] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/17/2021] [Revised: 02/01/2022] [Accepted: 02/01/2022] [Indexed: 02/06/2023] Open
Abstract
In RNA viruses, which have high mutation rates and fast evolutionary rates, gene overlapping (i.e., genomic regions that encode more than one protein) is a major factor controlling mutational load and therefore virus evolvability. Although DNA viruses use host high-fidelity polymerases for their replication, and therefore should have lower mutation rates, it has been shown that some of them have evolutionary rates comparable to those of RNA viruses. Notably, these viruses have large proportions of their genes with at least one overlapping instance. Hence, gene overlapping could be a modulator of virus evolution beyond the RNA world. To test this hypothesis, we use the genus Begomovirus of plant viruses as a model. Through comparative genomic approaches, we show that non-terminal gene overlapping decreases the rate of virus evolution, which is associated with lower frequency of both synonymous and nonsynonymous mutations. In contrast, terminal overlapping has little effect on the pace of virus evolution. Overall, our analyses support a role for gene overlapping in the evolution of begomoviruses and provide novel information on the factors that shape their genetic diversity.
|
15
|
New Genomic Signals Underlying the Emergence of Human Proto-Genes. Genes (Basel) 2022; 13:genes13020284. [PMID: 35205330 PMCID: PMC8871994 DOI: 10.3390/genes13020284] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/17/2021] [Revised: 01/20/2022] [Accepted: 01/24/2022] [Indexed: 12/04/2022] Open
Abstract
De novo genes are novel genes which emerge from non-coding DNA. So far, little is known about the properties of de novo genes in relation to their age and mechanisms of emergence. In this study, we investigate four related properties: introns, upstream regulatory motifs, 5′ untranslated regions (UTRs) and protein domains, in 23,135 human proto-genes. We found that proto-genes contain introns, whose number and position correlate with the genomic position of proto-gene emergence. The origin of these introns is debated, as our results suggest that 41% of proto-genes might have captured existing introns, and 13.7% of them do not splice the ORF. We show that proto-genes which emerged via overprinting tend to be more enriched in core promoter motifs, while intergenic and intronic genes are more enriched in enhancers, even if the TATA motif is most commonly found upstream in these genes. Intergenic and intronic 5′ UTRs of proto-genes have a lower potential to stabilise mRNA structures than exonic proto-genes and established human genes. Finally, we confirm that proteins expressed by proto-genes gain new putative domains with age. Overall, we find that regulatory motifs inducing transcription and translation of previously non-coding sequences may facilitate proto-gene emergence. Our study demonstrates that introns, 5′ UTRs, and domains have specific properties in proto-genes. We also emphasize that the genomic positions of de novo genes strongly impact these properties.
|
16
|
Pavesi A, Romerio F. Extending the Coding Potential of Viral Genomes with Overlapping Antisense ORFs: A Case for the De Novo Creation of the Gene Encoding the Antisense Protein ASP of HIV-1. Viruses 2022; 14:v14010146. [PMID: 35062351 PMCID: PMC8781085 DOI: 10.3390/v14010146] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/13/2021] [Revised: 01/11/2022] [Accepted: 01/12/2022] [Indexed: 02/04/2023] Open
Abstract
Gene overprinting occurs when point mutations within a genomic region with an existing coding sequence create a new one in another reading frame. This process is quite frequent in viral genomes either to maximize the amount of information that they encode or in response to strong selective pressure. The most frequent scenario involves two different reading frames in the same DNA strand (sense overlap). Much less frequent are cases of overlapping genes that are encoded on opposite DNA strands (antisense overlap). One such example is the antisense ORF, asp in the minus strand of the HIV-1 genome overlapping the env gene. The asp gene is highly conserved in pandemic HIV-1 strains of group M, and it is absent in non-pandemic HIV-1 groups, HIV-2, and lentiviruses infecting non-human primates, suggesting that the ~190-amino acid protein that is expressed from this gene (ASP) may play a role in virus spread. While the function of ASP in the virus life cycle remains to be elucidated, mounting evidence from several research groups indicates that ASP is expressed in vivo. There are two alternative hypotheses that could be envisioned to explain the origin of the asp ORF. On one hand, asp may have originally been present in the ancestor of contemporary lentiviruses, and subsequently lost in all descendants except for most HIV-1 strains of group M due to selective advantage. Alternatively, the asp ORF may have originated very recently with the emergence of group M HIV-1 strains from SIVcpz. Here, we used a combination of computational and statistical approaches to study the genomic region of env in primate lentiviruses to shed light on the origin, structure, and sequence evolution of the asp ORF. 
The results emerging from our studies support the hypothesis of a recent de novo addition of the antisense ORF to the HIV-1 genome through a process that entailed progressive removal of existing internal stop codons from SIV strains to HIV-1 strains of group M, and fine tuning of the codon sequence in env that reduced the chances of new stop codons occurring in asp. Altogether, the study supports the notion that the HIV-1 asp gene encodes an accessory protein, providing a selective advantage to the virus.
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, 43124 Parma, Italy
- Fabio Romerio
- Department of Molecular and Comparative Pathobiology, Johns Hopkins University School of Medicine, Baltimore, MD 21205-2196, USA
|
17
|
Heinen T, Xie C, Keshavarz M, Stappert D, Künzel S, Tautz D. Evolution of a New Testis-Specific Functional Promoter Within the Highly Conserved Map2k7 Gene of the Mouse. Front Genet 2022; 12:812139. [PMID: 35069705 PMCID: PMC8766832 DOI: 10.3389/fgene.2021.812139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2021] [Accepted: 12/08/2021] [Indexed: 12/03/2022] Open
Abstract
Map2k7 (synonym Mkk7) is a conserved regulatory kinase gene and a central component of the JNK signaling cascade with key functions during cellular differentiation. It shows complex transcription patterns, and different transcript isoforms are known in the mouse (Mus musculus). We have previously identified a newly evolved testis-specific transcript for the Map2k7 gene in the subspecies M. m. domesticus. Here, we identify the new promoter that drives this transcript and find that it codes for an open reading frame (ORF) of 50 amino acids. The new promoter was gained in the stem lineage of closely related mouse species but was secondarily lost in the subspecies M. m. musculus and M. m. castaneus. A single mutation can be correlated with its transcriptional activity in M. m. domesticus, and cell culture assays demonstrate the capability of this mutation to drive expression. A mouse knockout line in which the promoter region of the new transcript is deleted reveals a functional contribution of the newly evolved promoter to sperm motility and the spermatid transcriptome. Our data show that a new functional transcript (and possibly protein) can evolve within an otherwise highly conserved gene, supporting the notion of regulatory changes contributing to the emergence of evolutionary novelties.
Affiliation(s)
- Chen Xie
- Max Planck Institute for Evolutionary Biology, Plön, Germany
- Maryam Keshavarz
- Max Planck Institute for Evolutionary Biology, Plön, Germany
- Deutsches Zentrum für Neurodegenerative Erkrankungen e. V. (DZNE), Bonn, Germany
- Dominik Stappert
- Deutsches Zentrum für Neurodegenerative Erkrankungen e. V. (DZNE), Bonn, Germany
- Sven Künzel
- Max Planck Institute for Evolutionary Biology, Plön, Germany
- Diethard Tautz
- Max Planck Institute for Evolutionary Biology, Plön, Germany
|
18
|
Cherezov RO, Vorontsova JE, Simonova OB. The Phenomenon of Evolutionary “De Novo Generation” of Genes. Russ J Dev Biol 2021. [DOI: 10.1134/s1062360421060035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
19
|
Watson AK, Lopez P, Bapteste E. Hundreds of out-of-frame remodelled gene families in the E. coli pangenome. Mol Biol Evol 2021; 39:6430988. [PMID: 34792602 PMCID: PMC8788219 DOI: 10.1093/molbev/msab329] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/01/2023] Open
Abstract
All genomes include gene families with very limited taxonomic distributions that potentially represent new genes and innovations in protein-coding sequence, raising questions on the origins of such genes. Some of these genes are hypothesized to have formed de novo, from noncoding sequences, and recent work has begun to elucidate the processes by which de novo gene formation can occur. A special case of de novo gene formation, overprinting, describes the origin of new genes from noncoding alternative reading frames of existing open reading frames (ORFs). We argue that additionally, out-of-frame gene fission/fusion events of alternative reading frames of ORFs and out-of-frame lateral gene transfers could contribute to the origin of new gene families. To demonstrate this, we developed an original pattern-search in sequence similarity networks, enhancing the use of these graphs, commonly used to detect in-frame remodeled genes. We applied this approach to gene families in 524 complete genomes of Escherichia coli. We identified 767 gene families whose evolutionary history likely included at least one out-of-frame remodeling event. These genes with out-of-frame components represent ∼2.5% of all genes in the E. coli pangenome, suggesting that alternative reading frames of existing ORFs can contribute to a significant proportion of de novo genes in bacteria.
Affiliation(s)
- Andrew K Watson
- Institut de Systématique, Evolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d'Histoire Naturelle, EPHE, Université des Antilles, 7, quai Saint Bernard, Paris, 75005, France
- Philippe Lopez
- Institut de Systématique, Evolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d'Histoire Naturelle, EPHE, Université des Antilles, 7, quai Saint Bernard, Paris, 75005, France
- Eric Bapteste
- Institut de Systématique, Evolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d'Histoire Naturelle, EPHE, Université des Antilles, 7, quai Saint Bernard, Paris, 75005, France
|
20
|
Computational methods for inferring location and genealogy of overlapping genes in virus genomes: approaches and applications. Curr Opin Virol 2021; 52:1-8. [PMID: 34798370 PMCID: PMC8594276 DOI: 10.1016/j.coviro.2021.10.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/04/2021] [Revised: 10/21/2021] [Accepted: 10/22/2021] [Indexed: 12/02/2022]
Abstract
Viruses may evolve to increase the amount of encoded genetic information by means of overlapping genes, which utilize several reading frames. Such overlapping genes may be especially impactful for genomes of small size, often serving as a source of novel accessory proteins, some of which play a crucial role in viral pathogenicity or in promoting the systemic spread of the virus. Diverse genome-based metrics were proposed to facilitate recognition of overlapping genes that otherwise may be overlooked during genome annotation. They can detect the atypical codon bias associated with the overlap (e.g. a statistically significant reduction in variability at synonymous sites) or other sequence-composition features peculiar to overlapping genes. In this review, I compare nine computational methods, discuss their strengths and limitations, and survey how they were applied to detect candidate overlapping genes in the genome of SARS-CoV-2, the etiological agent of the COVID-19 pandemic.
|
21
|
Pavesi A. Prediction of two novel overlapping ORFs in the genome of SARS-CoV-2. Virology 2021; 562:149-157. [PMID: 34339929 PMCID: PMC8317007 DOI: 10.1016/j.virol.2021.07.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/15/2021] [Revised: 07/21/2021] [Accepted: 07/21/2021] [Indexed: 10/25/2022]
Abstract
Six candidate overlapping genes have been detected in SARS-CoV-2, yet current methods struggle to detect overlapping genes that recently originated. However, such genes might encode proteins beneficial to the virus, and provide a model system to understand gene birth. To complement existing detection methods, I first demonstrated that selection pressure to avoid stop codons in alternative reading frames is a driving force in the origin and retention of overlapping genes. I then built a detection method, CodScr, based on this selection pressure. Finally, I combined CodScr with methods that detect other properties of overlapping genes, such as a biased nucleotide and amino acid composition. I detected two novel ORFs (ORF-Sh and ORF-Mh), overlapping the spike and membrane genes respectively, which are under selection pressure and may be beneficial to SARS-CoV-2. ORF-Sh and ORF-Mh are present, as ORFs uninterrupted by stop codons, in 100% and 95% of the SARS-CoV-2 genomes, respectively.
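The notion of an ORF being "uninterrupted by stop codons" in a given share of genomes can be illustrated with a short Python sketch. This is not CodScr and does not reproduce the paper's method; the function names, the use of the standard-code stop codons (TAA, TAG, TGA), and the assumption that each input string is already the aligned candidate region from one genome are all choices made for this example.

```python
from typing import Iterable

# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_uninterrupted(seq: str, frame: int = 0) -> bool:
    """True if no stop codon occurs when reading `seq` in the given frame (0, 1 or 2)."""
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    # Step through complete codons only; a trailing partial codon is ignored.
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return not any(codon in STOP_CODONS for codon in codons)


def fraction_uninterrupted(regions: Iterable[str], frame: int = 0) -> float:
    """Share of sequences in which the candidate frame contains no stop codon."""
    regions = list(regions)
    if not regions:
        raise ValueError("at least one sequence is required")
    return sum(is_uninterrupted(r, frame) for r in regions) / len(regions)


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy_regions = ["ATGAAATTT", "ATGTAATTT"]
    print(fraction_uninterrupted(toy_regions, frame=0))
```

A real analysis would also need to extract the same region from every genome, for example from a multiple sequence alignment, which this sketch leaves out.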
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area Delle Scienze 23/A, I-43124, Parma, Italy
|
22
|
Dong H, Zhu Y, Shen Y, Xie S, He Y, Lu L. High prevalence of tryptophan-truncated S quasispecies in treatment-naïve chronic hepatitis B patients. J Gen Virol 2021; 102. [PMID: 34292864 DOI: 10.1099/jgv.0.001623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
Abstract
Hepatitis B virus surface antigen (HBsAg) encoded by the S gene is highly expressed during the replication cycle of hepatitis B virus (HBV). However, the frequent usage of tryptophan in HBsAg, which leads to a high cost of biosynthesis, is inconsistent with the high expression level of this protein. Tryptophan-truncated mutation of HBsAg, that is, a tryptophan to stop codon mutation resulting in truncated HBsAg, might help to maintain its high expression with lower biosynthetic cost. We aimed to investigate the prevalence of tryptophan-truncated S quasispecies in treatment-naïve patients with chronic hepatitis B (CHB) by applying CirSeq as well as a site-by-site algorithm developed by us to identify variants at extremely low frequencies in the carboxyl terminus of HBsAg. A total of 730 mutations were identified in 27 patients with CHB, varying from seven to 56 mutations per sample. The number of synonymous mutations was much higher than that of nonsynonymous mutations in the reverse transcriptase (RT) coding region and vice versa in the S coding region, implying that the evolutionary constraints on the RT and S genes might be different. We showed that 25 (92.6 %) of 27 patients had at least one S-truncated mutation, most of which were derived from tryptophan, indicating a high prevalence of tryptophan-truncated S mutations in treatment-naïve patients with CHB. In terms of the RT gene, 21 (77.8 %) patients had pre-existing drug-resistant mutations, while no truncated mutations were detected. Our findings that tryptophan-truncated S quasispecies and drug-resistant RT mutants were highly prevalent in treatment-naïve patients with CHB provide new insights into the composition of the HBV population, which might help optimize the treatment and management of patients with CHB.
Affiliation(s)
- Hui Dong
- Department of Gastroenterology, Shanghai Key Laboratory of Pancreatic Diseases, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200080, PR China
- Yongqiang Zhu
- Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai 201203, PR China
- Yan Shen
- Nanjing Shenyou Institute of Genome Research, Nanjing, 210048, PR China
- Shaoqing Xie
- Nanjing Shenyou Institute of Genome Research, Nanjing, 210048, PR China
- Yungang He
- Shanghai Fifth People's Hospital, Shanghai Key Laboratory of Medical Epigenetics, the International Co-laboratory of Medical Epigenetics and Metabolism, Institutes of Biomedical Sciences, Fudan University, Shanghai 200032, PR China
- Lungen Lu
- Department of Gastroenterology, Shanghai Key Laboratory of Pancreatic Diseases, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200080, PR China
|
23
|
Chazal N. Coronavirus, the King Who Wanted More Than a Crown: From Common to the Highly Pathogenic SARS-CoV-2, Is the Key in the Accessory Genes? Front Microbiol 2021; 12:682603. [PMID: 34335504 PMCID: PMC8317507 DOI: 10.3389/fmicb.2021.682603] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 03/18/2021] [Accepted: 06/22/2021] [Indexed: 12/14/2022] Open
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which emerged in late 2019, is the etiologic agent of the current "coronavirus disease 2019" (COVID-19) pandemic, which has serious health implications and a significant global economic impact. Of the seven human coronaviruses, all of which have a zoonotic origin, the pandemic SARS-CoV-2 is the third emerging coronavirus of the 21st century that is highly pathogenic to the human population. Previous human coronavirus outbreaks (SARS-CoV-1 and MERS-CoV) have already provided valuable information on some of the common molecular and cellular mechanisms of coronavirus infections as well as their origin. However, to meet the new challenge posed by SARS-CoV-2, a detailed understanding of its biological specificities, as well as knowledge of its origin, is crucial to provide information on viral pathogenicity, transmission and epidemiology, and to enable strategies for therapeutic interventions and drug discovery. Therefore, in this review, we summarize the current advances in knowledge of SARS-CoV-2, in light of pre-existing information on other recently emerging coronaviruses. We depict the specificity of the immune response of wild bats and discuss current knowledge of the genetic diversity of bat-hosted coronaviruses that promotes viral genome expansion (accessory gene acquisition). In addition, we describe the basic virology of coronaviruses with a special focus on SARS-CoV-2. Finally, we highlight, in detail, the current knowledge of genes and accessory proteins which we postulate to be the major keys to promote virus adaptation to specific hosts (bat and human), to contribute to the suppression of immune responses, as well as to pathogenicity.
Affiliation(s)
- Nathalie Chazal
- Institut de Recherche en Infectiologie de Montpellier (IRIM), Université de Montpellier, CNRS, Montpellier, France
|
24
|
Guerra-Almeida D, Tschoeke DA, da-Fonseca RN. Understanding small ORF diversity through a comprehensive transcription feature classification. DNA Res 2021; 28:6317669. [PMID: 34240112 PMCID: PMC8435553 DOI: 10.1093/dnares/dsab007] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 12/22/2020] [Indexed: 11/13/2022] Open
Abstract
Small open reading frames (small ORFs/sORFs/smORFs) are potentially coding sequences smaller than 100 codons that have historically been considered junk DNA by gene prediction software and in annotation screening; however, the advent of next-generation sequencing has contributed to the deeper investigation of junk DNA regions and their transcription products, resulting in the emergence of smORFs as a new focus of interest in systems biology. Several smORF peptides were recently reported in noncanonical mRNAs as new players in numerous biological contexts; however, their relevance is still overlooked in coding potential analysis. Hence, this review proposes a smORF classification based on transcriptional features, discussing the most promising approaches to investigate smORFs based on their different characteristics. First, smORFs were divided into nonexpressed (intergenic) and expressed (genic) smORFs. Second, genic smORFs were classified as smORFs located in noncoding RNAs (ncRNAs) or canonical mRNAs. Finally, smORFs in ncRNAs were further subdivided into sequences located in small or long RNAs, whereas smORFs located in canonical mRNAs were subdivided into several specific classes depending on their localization along the gene. We hope that this review provides new insights into large-scale annotations and reinforces the role of smORFs as essential components of a hidden coding DNA world.
Affiliation(s)
- Diego Guerra-Almeida
- Institute of Biodiversity and Sustainability, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Diogo Antonio Tschoeke
- Alberto Luiz Coimbra Institute of Graduate Studies and Engineering Research (COPPE), Biomedical Engineering Program, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Rodrigo Nunes-da-Fonseca
- Institute of Biodiversity and Sustainability, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil; National Institute of Science and Technology in Molecular Entomology, Rio de Janeiro, Brazil
|
25
|
Positive selection and intrinsic disorder are associated with multifunctional C4(AC4) proteins and geminivirus diversification. Sci Rep 2021; 11:11150. [PMID: 34045539 PMCID: PMC8160170 DOI: 10.1038/s41598-021-90557-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/11/2020] [Accepted: 05/13/2021] [Indexed: 02/06/2023] Open
Abstract
Viruses within the Geminiviridae family cause extensive agricultural losses. Members of four genera of geminiviruses contain a C4 gene (AC4 in geminiviruses with bipartite genomes). C4(AC4) genes are entirely overprinted on the C1(AC1) genes, which encode the replication-associated proteins. The C4(AC4) proteins exhibit diverse functions that may be important for geminivirus diversification. In this study, the influence of natural selection on the evolutionary diversity of 211 C4(AC4) genes relative to the C1(AC1) sequences they overlap was determined from isolates of the Begomovirus and Curtovirus genera. The ratio of nonsynonymous (dN) to synonymous (dS) nucleotide substitutions indicated that C4(AC4) genes are under positive selection, while the overlapped C1(AC1) sequences are under purifying selection. Ninety-one of 200 Begomovirus C4(AC4) genes encode elongated proteins with the extended regions being under neutral selection. C4(AC4) genes from begomoviruses isolated from tomato from native versus exotic regions were under similar levels of positive selection. Analysis of protein structure suggests that C4(AC4) proteins are entirely intrinsically disordered. Our data suggest that non-synonymous mutations and mutations that increase the length of C4(AC4) drive protein diversity that is intrinsically disordered, which could explain C4/AC4 functional variation and contribute to both geminivirus diversification and host jumping.
|
26
|
Pavesi A. Origin, Evolution and Stability of Overlapping Genes in Viruses: A Systematic Review. Genes (Basel) 2021; 12:genes12060809. [PMID: 34073395 PMCID: PMC8227390 DOI: 10.3390/genes12060809] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 05/06/2021] [Revised: 05/22/2021] [Accepted: 05/24/2021] [Indexed: 12/11/2022] Open
Abstract
During their long evolutionary history viruses generated many proteins de novo by a mechanism called “overprinting”. Overprinting is a process in which critical nucleotide substitutions in a pre-existing gene can induce the expression of a novel protein by translation of an alternative open reading frame (ORF). Overlapping genes represent an intriguing example of adaptive conflict, because they simultaneously encode two proteins whose freedom to change is constrained by each other. However, overlapping genes are also a source of genetic novelties, as the constraints under which alternative ORFs evolve can give rise to proteins with unusual sequence properties, most importantly the potential for novel functions. Starting with the discovery of overlapping genes in phages infecting Escherichia coli, this review covers a range of studies dealing with detection of overlapping genes in small eukaryotic viruses (genomic length below 30 kb) and recognition of their critical role in the evolution of pathogenicity. Origin of overlapping genes, what factors favor their birth and retention, and how they manage their inherent adaptive conflict are extensively reviewed. Special attention is paid to the assembly of overlapping genes into ad hoc databases, suitable for future studies, and to the development of statistical methods for exploring viral genome sequences in search of undiscovered overlaps.
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area delle Scienze 23/A, I-43124 Parma, Italy
|
27
|
Unconventional viral gene expression mechanisms as therapeutic targets. Nature 2021; 593:362-371. [PMID: 34012080 DOI: 10.1038/s41586-021-03511-5] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 06/08/2020] [Accepted: 03/22/2021] [Indexed: 12/14/2022]
Abstract
Unlike the human genome that comprises mostly noncoding and regulatory sequences, viruses have evolved under the constraints of maintaining a small genome size while expanding the efficiency of their coding and regulatory sequences. As a result, viruses use strategies of transcription and translation in which one or more of the steps in the conventional gene-protein production line are altered. These alternative strategies of viral gene expression (also known as gene recoding) can be uniquely brought about by dedicated viral enzymes or by co-opting host factors (known as host dependencies). Targeting these unique enzymatic activities and host factors exposes vulnerabilities of a virus and provides a paradigm for the design of novel antiviral therapies. In this Review, we describe the types and mechanisms of unconventional gene and protein expression in viruses, and provide a perspective on how future basic mechanistic work could inform translational efforts that are aimed at viral eradication.
|
28
|
Gholizadeh Z, Iqbal MS, Li R, Romerio F. The HIV-1 Antisense Gene ASP: The New Kid on the Block. Vaccines (Basel) 2021; 9:vaccines9050513. [PMID: 34067514 PMCID: PMC8156140 DOI: 10.3390/vaccines9050513] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 04/20/2021] [Revised: 05/04/2021] [Accepted: 05/13/2021] [Indexed: 01/14/2023] Open
Abstract
Viruses have developed incredibly creative ways of making a virtue out of necessity, including taking full advantage of their small genomes. Indeed, viruses often encode multiple proteins within the same genomic region by using two or more reading frames in both orientations through a process called overprinting. Complex retroviruses provide compelling examples of that. The human immunodeficiency virus type 1 (HIV-1) genome expresses sixteen proteins from nine genes that are encoded in the three positive-sense reading frames. In addition, the genome of some HIV-1 strains contains a tenth gene in one of the negative-sense reading frames. The so-called Antisense Protein (ASP) gene overlaps the HIV-1 Rev Response Element (RRE) and the envelope glycoprotein gene, and encodes a highly hydrophobic protein of ~190 amino acids. Despite being identified over thirty years ago, relatively few studies have investigated the role that ASP may play in the virus lifecycle, and its expression in vivo is still questioned. Here we review the current knowledge about ASP, and we discuss some of the many unanswered questions.
|
29
|
Li R, Sklutuis R, Groebner JL, Romerio F. HIV-1 Natural Antisense Transcription and Its Role in Viral Persistence. Viruses 2021; 13:v13050795. [PMID: 33946840 PMCID: PMC8145503 DOI: 10.3390/v13050795] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 03/16/2021] [Revised: 04/26/2021] [Accepted: 04/27/2021] [Indexed: 12/11/2022] Open
Abstract
Natural antisense transcripts (NATs) represent a class of RNA molecules that are transcribed from the opposite strand of a protein-coding gene, and that have the ability to regulate the expression of their cognate protein-coding gene via multiple mechanisms. NATs have been described in many prokaryotic and eukaryotic systems, as well as in the viruses that infect them. The human immunodeficiency virus type 1 (HIV-1) is no exception, and produces one or more NATs from a promoter within the 3’ long terminal repeat. HIV-1 antisense transcripts have been the focus of several studies spanning over 30 years. However, a complete appreciation of the role that these transcripts play in the virus lifecycle is still lacking. In this review, we cover the current knowledge about HIV-1 NATs, discuss some of the questions that are still open and identify possible areas of future research.
Affiliation(s)
- Rui Li
- Department of Molecular and Comparative Pathobiology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Rachel Sklutuis
- HIV Dynamics and Replication Program, Host-Virus Interaction Branch, National Cancer Institute, National Institutes of Health, Frederick, MD 21702, USA
- Jennifer L. Groebner
- HIV Dynamics and Replication Program, Host-Virus Interaction Branch, National Cancer Institute, National Institutes of Health, Frederick, MD 21702, USA
- Fabio Romerio
- Department of Molecular and Comparative Pathobiology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
|
30
|
Carter CW. Simultaneous codon usage, the origin of the proteome, and the emergence of de-novo proteins. Curr Opin Struct Biol 2021; 68:142-148. [PMID: 33529785 DOI: 10.1016/j.sbi.2021.01.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 11/04/2020] [Accepted: 01/05/2021] [Indexed: 12/21/2022]
Abstract
Genetic coding generally uses only one of a gene's two strands; its complement serving as template for replication. Aminoacyl-tRNA synthetases, aaRS, apparently first emerged as pairs on bidirectional genes, in which anticodons in the template strand served as codons for an entirely different protein. Interpreting both strands in frame constrained such genes sufficiently that it was rapidly superseded, leaving only traces in the elevated pairing between codon middle bases in antiparallel alignments. Codon assignments actually promote using information from both strands in multiple reading frames. Related phenomena, known as overprinting, are widely associated with viruses. In-frame bidirectional coding and overprinting nevertheless imply different structural and functional relationships, and different roles in generating folded proteins throughout the evolution of the proteome.
Affiliation(s)
- Charles W Carter
- Department of Biochemistry, Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-7260, United States.
|
31
|
Abstract
Many virus-encoded proteins have intrinsically disordered regions that lack a stable, folded three-dimensional structure. These disordered proteins often play important functional roles in virus replication, such as down-regulating host defense mechanisms. With the widespread availability of next-generation sequencing, the number of new virus genomes with predicted open reading frames is rapidly outpacing our capacity for directly characterizing protein structures through crystallography. Hence, computational methods for structural prediction play an important role. A large number of predictors focus on the problem of classifying residues into ordered and disordered regions, and these methods tend to be validated on a diverse training set of proteins from eukaryotes, prokaryotes, and viruses. In this study, we investigate whether some predictors outperform others in the context of virus proteins and compare our findings with data from non-viral proteins. We evaluate the prediction accuracy of 21 methods, many of which are only available as web applications, on a curated set of 126 proteins encoded by viruses. Furthermore, we apply a random forest classifier to these predictor outputs. Based on cross-validation experiments, this ensemble approach confers a substantial improvement in accuracy, e.g., a mean 36 per cent gain in Matthews correlation coefficient. Lastly, we apply the random forest predictor to severe acute respiratory syndrome coronavirus 2 ORF6, an accessory gene that encodes a short (61 AA) and moderately disordered protein that inhibits the host innate immune response. We show that disorder prediction methods perform differently for viral and non-viral proteins, and that an ensemble approach can yield more robust and accurate predictions.
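For readers unfamiliar with the metric named in this abstract, the Matthews correlation coefficient can be computed from per-residue ordered/disordered labels. The following is a minimal Python sketch and not the authors' code; the function name and the toy label vectors are hypothetical.

```python
import math

def matthews_corrcoef(truth, predicted):
    """MCC for binary labels (1 = disordered residue, 0 = ordered residue)."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when a marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical labels for an 8-residue peptide.
truth     = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
print(matthews_corrcoef(truth, predicted))
```

Unlike plain accuracy, the coefficient stays near zero for a predictor that labels every residue the same way, which matters when disordered residues are a minority.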
Affiliation(s)
- Gal Almog
- Department of Pathology & Laboratory Medicine, Western University, Dental Sciences Building, Rm. 4044 London, Ontario, Canada, N6A 5C1
- Abayomi S Olabode
- Department of Pathology & Laboratory Medicine, Western University, Dental Sciences Building, Rm. 4044 London, Ontario, Canada, N6A 5C1
- Art F Y Poon
- Department of Pathology & Laboratory Medicine, Western University, Dental Sciences Building, Rm. 4044 London, Ontario, Canada, N6A 5C1
- Department of Applied Mathematics, Western University, Middlesex College Room 255, 1151 Richmond Street London, Ontario, Canada, N6A 5B7
- Department of Microbiology & Immunology, Western University, 1151 Richmond Street London, Ontario, Canada, N6A 3K
|
32
|
Savoret J, Mesnard JM, Gross A, Chazal N. Antisense Transcripts and Antisense Protein: A New Perspective on Human Immunodeficiency Virus Type 1. Front Microbiol 2021; 11:625941. [PMID: 33510738 PMCID: PMC7835632 DOI: 10.3389/fmicb.2020.625941] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/04/2020] [Accepted: 12/14/2020] [Indexed: 12/13/2022]
Abstract
It was first predicted in 1988 that there may be an Open Reading Frame (ORF) on the negative strand of the Human Immunodeficiency Virus type 1 (HIV-1) genome that could encode a protein named AntiSense Protein (ASP). In spite of some controversy, reports began to emerge some years later describing the detection of HIV-1 antisense transcripts, the presence of ASP in transfected and infected cells, and the existence of an immune response targeting ASP. Recently, it was established that the asp gene is exclusively conserved within the pandemic group M of HIV-1. In this review, we summarize the latest findings on HIV-1 antisense transcripts and ASP, and we discuss their potential functions in HIV-1 infection together with the role played by antisense transcripts and ASPs in some other viruses. Finally, we suggest pathways raised by the study of antisense transcripts and ASPs that may warrant exploration in the future.
Affiliation(s)
- Juliette Savoret
- Institut de Recherche en Infectiologie de Montpellier (IRIM), CNRS, Université de Montpellier, Montpellier, France
- Jean-Michel Mesnard
- Institut de Recherche en Infectiologie de Montpellier (IRIM), CNRS, Université de Montpellier, Montpellier, France
- Antoine Gross
- Institut de Recherche en Infectiologie de Montpellier (IRIM), CNRS, Université de Montpellier, Montpellier, France
- Nathalie Chazal
- Institut de Recherche en Infectiologie de Montpellier (IRIM), CNRS, Université de Montpellier, Montpellier, France
|
33
|
Douglas J, Drummond AJ, Kingston RL. Evolutionary history of cotranscriptional editing in the paramyxoviral phosphoprotein gene. Virus Evol 2021; 7:veab028. [PMID: 34141448 PMCID: PMC8204654 DOI: 10.1093/ve/veab028] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 12/31/2022]
Abstract
The phosphoprotein gene of the paramyxoviruses encodes multiple protein products. The P, V, and W proteins are generated by transcriptional slippage. This process results in the insertion of non-templated guanosine nucleosides into the mRNA at a conserved edit site. The P protein is an essential component of the viral RNA polymerase and is encoded by a faithful copy of the gene in the majority of paramyxoviruses. However, in some cases, the non-essential V protein is encoded by default and guanosines must be inserted into the mRNA in order to encode P. The number of guanosines inserted into the P gene can be described by a probability distribution, which varies between viruses. In this article, we review the nature of these distributions, which can be inferred from mRNA sequencing data, and reconstruct the evolutionary history of cotranscriptional editing in the paramyxovirus family. Our model suggests that, throughout known history of the family, the system has switched from a P default to a V default mode four times; complete loss of the editing system has occurred twice, the canonical zinc finger domain of the V protein has been deleted or heavily mutated a further two times, and the W protein has independently evolved a novel function three times. Finally, we review the physical mechanisms of cotranscriptional editing via slippage of the viral RNA polymerase.
Affiliation(s)
- Jordan Douglas
- Centre for Computational Evolution, University of Auckland, Auckland 1010, New Zealand
- School of Computer Science, University of Auckland, Auckland 1010, New Zealand
- Alexei J Drummond
- Centre for Computational Evolution, University of Auckland, Auckland 1010, New Zealand
- School of Biological Sciences, University of Auckland, Auckland 1010, New Zealand
- Richard L Kingston
- School of Biological Sciences, University of Auckland, Auckland 1010, New Zealand
|
34
|
Zamora-Briseño JA, Pereira-Santana A, Reyes-Hernández SJ, Cerqueda-García D, Castaño E, Rodríguez-Zapata LC. Towards an understanding of the role of intrinsic protein disorder on plant adaptation to environmental challenges. Cell Stress Chaperones 2021; 26:141-150. [PMID: 32902806 PMCID: PMC7736417 DOI: 10.1007/s12192-020-01162-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/01/2020] [Revised: 07/31/2020] [Accepted: 08/27/2020] [Indexed: 02/05/2023]
Abstract
Intrinsic protein disorder is an interesting structural feature where fully functional proteins lack a three-dimensional structure in solution. In this work, we estimated the relative content of intrinsic protein disorder in 96 plant proteomes including monocots and eudicots. In this analysis, we found variation in the relative abundance of intrinsic protein disorder among these major clades; the relative level of disorder is higher in monocots than eudicots. In turn, there is an inverse relationship between the degree of intrinsic protein disorder and protein length, with smaller proteins being more disordered. The relative abundance of amino acids depends on intrinsic disorder and also varies among clades. Within the nucleus, intrinsically disordered proteins are more abundant than ordered proteins. Intrinsically disordered proteins are specialized in regulatory functions, nucleic acid binding, RNA processing, and in response to environmental stimuli. The implications of this on plants' responses to their environment are discussed.
Affiliation(s)
- Jesús Alejandro Zamora-Briseño
- Unidad de Biotecnología, Centro de Investigación Científica de Yucatán, Calle 43, Número 130, Chuburná de Hidalgo, C.P. 97205, Mérida, Yucatán, México
- Alejandro Pereira-Santana
- División de Biotecnología Industrial, Centro de Investigación y Asistencia en Tecnología y Diseño del estado de Jalisco, Camino Arenero 1227, El Bajio, C.P. 45019, Zapopan, Jalisco, México
- Dirección de Cátedras, Consejo Nacional de Ciencia y Tecnologia, Av. Insurgentes Sur 1582, Alcaldía Benito Juárez, C.P. 03940, Ciudad de México, México
- Sandi Julissa Reyes-Hernández
- Unidad de Biotecnología, Centro de Investigación Científica de Yucatán, Calle 43, Número 130, Chuburná de Hidalgo, C.P. 97205, Mérida, Yucatán, México
- Daniel Cerqueda-García
- Departamento de Recursos del Mar, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional- Unidad Mérida, Carr. Mérida - Progreso, colonia Loma Bonita, C.P. 97205, Mérida, Yucatán, México
- Enrique Castaño
- Unidad de Bioquímica y Biología Molecular de Plantas, Centro de Investigación Científica de Yucatán, Calle 43, Número 130, Chuburná de Hidalgo, C.P. 97205, Mérida, Yucatán, México
- Luis Carlos Rodríguez-Zapata
- Unidad de Biotecnología, Centro de Investigación Científica de Yucatán, Calle 43, Número 130, Chuburná de Hidalgo, C.P. 97205, Mérida, Yucatán, México.
|
35
|
Lulla V, Firth AE. A hidden gene in astroviruses encodes a viroporin. Nat Commun 2020; 11:4070. [PMID: 32792502 PMCID: PMC7426862 DOI: 10.1038/s41467-020-17906-x] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 04/28/2020] [Accepted: 07/23/2020] [Indexed: 12/13/2022]
Abstract
Human astroviruses are small non-enveloped viruses with positive-sense single-stranded RNA genomes. Astroviruses cause acute gastroenteritis in children worldwide and have been associated with encephalitis and meningitis in immunocompromised individuals. It is still unknown how astrovirus particles exit infected cells following replication. Through comparative genomic analysis and ribosome profiling we here identify and confirm the expression of a conserved alternative-frame ORF, encoding the protein XP. XP-knockout astroviruses are attenuated and pseudo-revert on passaging. Further investigation into the function of XP revealed plasma and trans Golgi network membrane-associated roles in virus assembly and/or release through a viroporin-like activity. XP-knockout replicons have only a minor replication defect, demonstrating the role of XP at late stages of infection. The discovery of XP advances our knowledge of these important human viruses and opens an additional direction of research into their life cycle and pathogenesis.
Affiliation(s)
- Valeria Lulla
- Division of Virology, Department of Pathology, Addenbrooke's Hospital, University of Cambridge, Cambridge, UK.
- Andrew E Firth
- Division of Virology, Department of Pathology, Addenbrooke's Hospital, University of Cambridge, Cambridge, UK.
|
36
|
Seitz S, Habjanič J, Schütz AK, Bartenschlager R. The Hepatitis B Virus Envelope Proteins: Molecular Gymnastics Throughout the Viral Life Cycle. Annu Rev Virol 2020; 7:263-288. [PMID: 32600157 DOI: 10.1146/annurev-virology-092818-015508] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 11/09/2022]
Abstract
New hepatitis B virions released from infected hepatocytes are the result of an intricate maturation process that starts with the formation of the nucleocapsid providing a confined space where the viral DNA genome is synthesized via reverse transcription. Virion assembly is finalized by the enclosure of the icosahedral nucleocapsid within a heterogeneous envelope. The latter contains integral membrane proteins of three sizes, collectively known as hepatitis B surface antigen, and adopts multiple conformations in the course of the viral life cycle. The nucleocapsid conformation depends on the reverse transcription status of the genome, which in turn controls nucleocapsid interaction with the envelope proteins for virus exit. In addition, after secretion the virions undergo a distinct maturation step during which a topological switch of the large envelope protein confers infectivity. Here we review molecular determinants for envelopment and models that postulate molecular signals encoded in the capsid scaffold conducive or adverse to the recruitment of envelope proteins.
Affiliation(s)
- Stefan Seitz
- Department of Infectious Diseases, University of Heidelberg, 69120 Heidelberg, Germany
- Jelena Habjanič
- Bavarian NMR Center, Department of Chemistry, Technical University of Munich, 85748 Garching, Germany
- Institute of Structural Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany
- Anne K Schütz
- Bavarian NMR Center, Department of Chemistry, Technical University of Munich, 85748 Garching, Germany
- Institute of Structural Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany
- Ralf Bartenschlager
- Department of Infectious Diseases, University of Heidelberg, 69120 Heidelberg, Germany
- Division of Virus-Associated Carcinogenesis, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
|
37
|
Yoshimoto FK. The Proteins of Severe Acute Respiratory Syndrome Coronavirus-2 (SARS CoV-2 or n-COV19), the Cause of COVID-19. Protein J 2020; 39:198-216. [PMID: 32447571 PMCID: PMC7245191 DOI: 10.1007/s10930-020-09901-4] [Citation(s) in RCA: 353] [Impact Index Per Article: 70.6] [Indexed: 12/11/2022]
Abstract
The devastating effects of the recent global pandemic (termed COVID-19 for "coronavirus disease 2019") caused by the severe acute respiratory syndrome coronavirus-2 (SARS CoV-2) are paramount with new cases and deaths growing at an exponential rate. In order to provide a better understanding of SARS CoV-2, this article will review the proteins found in the SARS CoV-2 that caused this global pandemic.
Affiliation(s)
- Francis K Yoshimoto
- Department of Chemistry, The University of Texas at San Antonio (UTSA), San Antonio, TX, 78249-0698, USA.
|
38
|
Pavesi A. New insights into the evolutionary features of viral overlapping genes by discriminant analysis. Virology 2020; 546:51-66. [PMID: 32452417 PMCID: PMC7157939 DOI: 10.1016/j.virol.2020.03.007] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 03/20/2020] [Accepted: 03/29/2020] [Indexed: 12/18/2022]
Abstract
Overlapping genes originate by a mechanism of overprinting, in which nucleotide substitutions in a pre-existing frame induce the expression of a de novo protein from an alternative frame. In this study, I assembled a dataset of 319 viral overlapping genes, which included 82 overlaps whose expression is experimentally known and the respective 237 homologs. Principal component analysis revealed that overlapping genes have a common pattern of nucleotide and amino acid composition. Discriminant analysis separated overlapping from non-overlapping genes with an accuracy of 97%. When applied to overlapping genes with known genealogy, it separated ancestral from de novo frames with an accuracy close to 100%. This high discriminant power was crucial to computationally design variants of de novo viral proteins known to possess selective anticancer toxicity (apoptin) or protection against neurodegeneration (X protein), as well as to detect two new potential overlapping genes in the genome of the new coronavirus SARS-CoV-2.
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area Delle Scienze 23/A, I-43124, Parma, Italy.
|
39
|
Blazejewski T, Ho HI, Wang HH. Synthetic sequence entanglement augments stability and containment of genetic information in cells. Science 2019; 365:595-598. [PMID: 31395784 DOI: 10.1126/science.aav5477] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 09/28/2018] [Revised: 06/21/2019] [Accepted: 07/15/2019] [Indexed: 12/28/2022]
Abstract
In synthetic biology, methods for stabilizing genetically engineered functions and confining recombinant DNA to intended hosts are necessary to cope with natural mutation accumulation and pervasive lateral gene flow. We present a generalizable strategy to preserve and constrain genetic information through the computational design of overlapping genes. Overlapping a sequence with an essential gene altered its fitness landscape and produced a constrained evolutionary path, even for synonymous mutations. Embedding a toxin gene in a gene of interest restricted its horizontal propagation. We further demonstrated a multiplex and scalable approach to build and test >7500 overlapping sequence designs, yielding functional yet highly divergent variants from natural homologs. This work enables deeper exploration of natural and engineered overlapping genes and facilitates enhanced genetic stability and biocontainment in emerging applications.
Affiliation(s)
- Tomasz Blazejewski
- Department of Systems Biology, Columbia University, New York, NY, USA
- Integrated Program in Cellular, Molecular, and Biomedical Studies, Columbia University, New York, NY, USA
- Hsing-I Ho
- Department of Systems Biology, Columbia University, New York, NY, USA
- Harris H Wang
- Department of Systems Biology, Columbia University, New York, NY, USA
- Department of Pathology and Cell Biology, Columbia University, New York, NY, USA
|
40
|
Abstract
Overlapping genes are commonplace in viruses and play an important role in their function and evolution. However, aside from studies on specific groups of viruses, relatively little is known about the extent and nature of gene overlap and its determinants in viruses as a whole. Here, we present an extensive characterisation of gene overlap in viruses through an analysis of reference genomes present in the NCBI virus genome database. We find that over half the instances of gene overlap are very small, covering <10 nt, and 84 per cent are <50 nt in length. Despite this, 53 per cent of all viruses still contained a gene overlap of 50 nt or larger. We also investigate several predictors of gene overlap such as genome structure (single- and double-stranded RNA and DNA), virus family, genome length, and genome segmentation. This revealed that gene overlap occurs more frequently in DNA viruses than in RNA viruses, and more frequently in single-stranded viruses than in double-stranded viruses. Genome segmentation is also associated with gene overlap, particularly in single-stranded DNA viruses. Notably, we observed a large range of overlap frequencies across families of all genome types, suggesting that it is a common evolutionary trait that provides flexible genome structures in all virus families.
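The overlap sizes reported in this abstract come down to interval arithmetic on gene coordinates. Below is a minimal Python sketch of that calculation, assuming 1-based inclusive coordinates and ignoring strand; the gene names, coordinates, and helper functions are hypothetical and are not taken from the study.

```python
from itertools import combinations

def overlap_length(gene_a, gene_b):
    """Nucleotides shared by two genes given as 1-based inclusive (start, end)."""
    start = max(gene_a[0], gene_b[0])
    end = min(gene_a[1], gene_b[1])
    return max(0, end - start + 1)

def size_class(length):
    """Bin an overlap using the 10 nt and 50 nt thresholds quoted in the abstract."""
    if length < 10:
        return "<10 nt"
    if length < 50:
        return "10-49 nt"
    return ">=50 nt"

# Hypothetical annotation of a small genome.
genes = {
    "geneA": (1, 900),
    "geneB": (897, 1500),
    "geneC": (1400, 2100),
    "geneD": (2200, 2600),
}

overlaps = {}
for (name_a, a), (name_b, b) in combinations(genes.items(), 2):
    n = overlap_length(a, b)
    if n > 0:
        overlaps[(name_a, name_b)] = (n, size_class(n))

for pair, (n, label) in overlaps.items():
    print(pair, n, label)
```

A real analysis would also need to handle strand, circular genomes, and spliced features, all of which this sketch leaves out.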
Affiliation(s)
- Timothy E Schlub
- Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, NSW, 2006, Australia
- Edward C Holmes
- School of Life and Environmental Sciences and School of Medical Sciences, Marie Bashir Institute for Infectious Diseases and Biosecurity, The University of Sydney, Sydney, NSW 2006, Australia
|
41
|
Dinan AM, Lukhovitskaya NI, Olendraite I, Firth AE. A case for a negative-strand coding sequence in a group of positive-sense RNA viruses. Virus Evol 2020; 6:veaa007. [PMID: 32064120 PMCID: PMC7010960 DOI: 10.1093/ve/veaa007] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 12/13/2022]
Abstract
Positive-sense single-stranded RNA viruses form the largest and most diverse group of eukaryote-infecting viruses. Their genomes comprise one or more segments of coding-sense RNA that function directly as messenger RNAs upon release into the cytoplasm of infected cells. Positive-sense RNA viruses are generally accepted to encode proteins solely on the positive strand. However, we previously identified a surprisingly long (∼1,000-codon) open reading frame (ORF) on the negative strand of some members of the family Narnaviridae which, together with RNA bacteriophages of the family Leviviridae, form a sister group to all other positive-sense RNA viruses. Here, we completed the genomes of three mosquito-associated narnaviruses, all of which have the long reverse-frame ORF. We systematically identified narnaviral sequences in public data sets from a wide range of sources, including arthropod, fungal, and plant transcriptomic data sets. Long reverse-frame ORFs are widespread in one clade of narnaviruses, where they frequently occupy >95 per cent of the genome. The reverse-frame ORFs correspond to a specific avoidance of CUA, UUA, and UCA codons (i.e. stop codon reverse complements) in the forward-frame RNA-dependent RNA polymerase ORF. However, absence of these codons cannot be explained by other factors such as inability to decode these codons or GC3 bias. Together with other analyses, we provide the strongest evidence yet of coding capacity on the negative strand of a positive-sense RNA virus. As these ORFs comprise some of the longest known overlapping genes, their study may be of broad relevance to understanding overlapping gene evolution and de novo origin of genes.
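The link between the three avoided codons and the stop codons can be checked directly: CUA, UUA, and UCA are the reverse complements of UAG, UAA, and UGA. The Python sketch below illustrates this, assuming the reverse frame is codon-aligned with the forward ORF; the helper names and the toy sequences are hypothetical.

```python
STOP_CODONS = ("UAA", "UAG", "UGA")
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Reverse complement of an RNA string written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]

# Forward-strand codons that read as a stop on the opposite strand.
forbidden = {reverse_complement(stop) for stop in STOP_CODONS}

def reverse_frame_is_open(forward_orf):
    """True if the codon-aligned reverse frame of forward_orf has no stop codon."""
    codons = [forward_orf[i:i + 3] for i in range(0, len(forward_orf) - 2, 3)]
    return not any(codon in forbidden for codon in codons)

print(sorted(forbidden))
print(reverse_frame_is_open("AUGCUGUCGUUG"))  # Leu and Ser encoded as CUG, UCG, UUG
print(reverse_frame_is_open("AUGCUAUCG"))     # CUA places UAG on the reverse strand
```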
Affiliation(s)
- Adam M Dinan
- Division of Virology, Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QP, UK
- Nina I Lukhovitskaya
- Division of Virology, Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QP, UK
- Ingrida Olendraite
- Division of Virology, Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QP, UK
- Andrew E Firth
- Division of Virology, Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QP, UK
|
42
|
Kumar D, Singh A, Kumar P, Uversky VN, Rao CD, Giri R. Understanding the penetrance of intrinsic protein disorder in rotavirus proteome. Int J Biol Macromol 2020; 144:892-908. [PMID: 31739058 PMCID: PMC7112477 DOI: 10.1016/j.ijbiomac.2019.09.166] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 07/02/2019] [Revised: 09/09/2019] [Accepted: 09/20/2019] [Indexed: 01/03/2023]
Abstract
Rotavirus is a major cause of severe acute gastroenteritis in infants and young children. The past decade has provided evidence for the role of intrinsically disordered proteins and regions (IDPs/IDPRs) in viral and other diseases. In general, IDPs/IDPRs are considered dynamic conformational ensembles that are devoid of a specific 3D structure and are associated with various important biological phenomena. Viruses utilize IDPs/IDPRs to survive in harsh environments, to evade the host immune system, and to hijack and manipulate host cellular proteins. The role of IDPs/IDPRs in rotavirus biology and pathogenicity has not been assessed so far; therefore, we designed this study to examine in depth the penetrance of intrinsic disorder in the rotavirus proteome, which consists of 12 proteins encoded by 11 segments of the viral genome. In addition, for all human rotaviral proteins, we have identified molecular recognition features (MoRFs), which are disorder-based binding sites in proteins. Our study shows that intrinsic disorder is widespread in several rotavirus proteins, primarily the nonstructural proteins NSP3, NSP4, and NSP5, which are involved in viral replication, translation, and viroplasm formation and/or maturation. This study may serve as a primer for understanding the role of IDPs/MoRFs in rotavirus biology, for the design of alternative therapeutic strategies, and for the development of disorder-based drugs.
Affiliation(s)
- Deepak Kumar
- Indian Institute of Technology Mandi, VPO Kamand, Himachal Pradesh 175005, India
- Ankur Singh
- Indian Institute of Technology Mandi, VPO Kamand, Himachal Pradesh 175005, India
- Prateek Kumar
- Indian Institute of Technology Mandi, VPO Kamand, Himachal Pradesh 175005, India
- Vladimir N Uversky
- Department of Molecular Medicine and Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, United States
- C Durga Rao
- SRM University, AP - Amaravati, Neerukonda, Mangalagiri Mandal Guntur District, Mangalagiri, Andhra Pradesh 522502, India.
- Rajanish Giri
- Indian Institute of Technology Mandi, VPO Kamand, Himachal Pradesh 175005, India
- BioX Center, Indian Institute of Technology Mandi, Himachal Pradesh, India.
|
43
|
DeRisi JL, Huber G, Kistler A, Retallack H, Wilkinson M, Yllanes D. An exploration of ambigrammatic sequences in narnaviruses. Sci Rep 2019; 9:17982. [PMID: 31784609 PMCID: PMC6884476 DOI: 10.1038/s41598-019-54181-3] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 09/26/2019] [Accepted: 11/11/2019] [Indexed: 11/09/2022]
Abstract
Narnaviruses have been described as positive-sense RNA viruses with a remarkably simple genome of ~3 kb, encoding only a highly conserved RNA-dependent RNA polymerase (RdRp). Many narnaviruses, however, are 'ambigrammatic' and harbour an additional uninterrupted open reading frame (ORF) covering almost the entire length of the reverse complement strand. No function has been described for this ORF, yet the absence of stops is conserved across diverse narnaviruses, and in every case the codons in the reverse ORF and the RdRp are aligned. The >3 kb ORF overlap on opposite strands, unprecedented among RNA viruses, motivates an exploration of the constraints imposed or alleviated by the codon alignment. Here, we show that only when the codon frames are aligned can all stop codons be eliminated from the reverse strand by synonymous single-nucleotide substitutions in the RdRp gene, suggesting a mechanism for de novo gene creation within a strongly conserved amino-acid sequence. It will be fascinating to explore what implications this coding strategy has for other aspects of narnavirus biology. Beyond narnaviruses, our rapidly expanding catalogue of viral diversity may yet reveal additional examples of this broadly-extensible principle for ambigrammatic-sequence development.
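The aligned-frame case described in this abstract can be illustrated with the standard genetic code. When the two frames are codon-aligned, a reverse-strand stop corresponds to a forward codon CUA, UUA, or UCA, and each of these has a synonymous neighbour one substitution away that does not create a stop. The Python sketch below covers only this aligned case and does not reproduce the authors' analysis of the misaligned frames; the table and function names are hypothetical.

```python
# Standard genetic code, restricted to the two amino acids involved.
SYNONYMS = {
    "Leu": {"UUA", "UUG", "CUU", "CUC", "CUA", "CUG"},
    "Ser": {"UCU", "UCC", "UCA", "UCG", "AGU", "AGC"},
}
# Forward codons whose reverse complement is a stop codon (UAG, UAA, UGA).
BLOCKING = {"CUA": "Leu", "UUA": "Leu", "UCA": "Ser"}

def hamming(a, b):
    """Number of positions at which two equal-length codons differ."""
    return sum(x != y for x, y in zip(a, b))

def single_step_fixes(codon):
    """Synonymous codons one substitution away that no longer block the reverse frame."""
    amino_acid = BLOCKING[codon]
    return sorted(
        c for c in SYNONYMS[amino_acid]
        if hamming(codon, c) == 1 and c not in BLOCKING
    )

fixes = {codon: single_step_fixes(codon) for codon in BLOCKING}
for codon, options in fixes.items():
    print(codon, BLOCKING[codon], options)
```

Every blocking codon has at least one such replacement, so in the aligned frame the forward protein sequence can stay unchanged while the reverse frame is cleared of stops.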
Affiliation(s)
- Joseph L DeRisi
- Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, CA, 94158, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, California, USA
- Greg Huber
- Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, CA, 94158, USA
- Amy Kistler
- Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, CA, 94158, USA
- Hanna Retallack
- Department of Biochemistry and Biophysics, University of California, San Francisco, California, USA
- Michael Wilkinson
- Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, CA, 94158, USA
- School of Mathematics and Statistics, The Open University, Walton Hall, Milton Keynes, MK7 6AA, England
- David Yllanes
- Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, CA, 94158, USA.
|
44
|
Affram Y, Zapata JC, Gholizadeh Z, Tolbert WD, Zhou W, Iglesias-Ussel MD, Pazgier M, Ray K, Latinovic OS, Romerio F. The HIV-1 Antisense Protein ASP Is a Transmembrane Protein of the Cell Surface and an Integral Protein of the Viral Envelope. J Virol 2019; 93:e00574-19. [PMID: 31434734 PMCID: PMC6803264 DOI: 10.1128/jvi.00574-19] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 04/05/2019] [Accepted: 08/14/2019] [Indexed: 12/13/2022]
Abstract
The negative strand of HIV-1 encodes a highly hydrophobic antisense protein (ASP) with no known homologs. The presence of humoral and cellular immune responses to ASP in HIV-1 patients indicates that ASP is expressed in vivo, but its role in HIV-1 replication remains unknown. We investigated ASP expression in multiple chronically infected myeloid and lymphoid cell lines using an anti-ASP monoclonal antibody (324.6) in combination with flow cytometry and microscopy approaches. At baseline and in the absence of stimuli, ASP shows polarized subnuclear distribution, preferentially in areas with low content of suppressive epigenetic marks. However, following treatment with phorbol 12-myristate 13-acetate (PMA), ASP translocates to the cytoplasm and is detectable on the cell surface, even in the absence of membrane permeabilization, indicating that 324.6 recognizes an ASP epitope that is exposed extracellularly. Further, surface staining with 324.6 and anti-gp120 antibodies showed that ASP and gp120 colocalize, suggesting that ASP might become incorporated in the membranes of budding virions. Indeed, fluorescence correlation spectroscopy studies showed binding of 324.6 to cell-free HIV-1 particles. Moreover, 324.6 was able to capture and retain HIV-1 virions with efficiency similar to that of the anti-gp120 antibody VRC01. Our studies indicate that ASP is an integral protein of the plasma membranes of chronically infected cells stimulated with PMA, and upon viral budding, ASP becomes a structural protein of the HIV-1 envelope. These results may provide leads to investigate the possible role of ASP in the virus replication cycle and suggest that ASP may represent a new therapeutic or vaccine target. IMPORTANCE The HIV-1 genome contains a gene expressed in the opposite, or antisense, direction to all other genes. The protein product of this antisense gene, called ASP, is poorly characterized, and its role in viral replication remains unknown. We provide evidence that the antisense protein, ASP, of HIV-1 is found within the cell nucleus in unstimulated cells. In addition, we show that after PMA treatment, ASP exits the nucleus and localizes on the cell membrane. Moreover, we demonstrate that ASP is present on the surfaces of viral particles. Altogether, our studies identify ASP as a new structural component of HIV-1 and show that ASP is an accessory protein that promotes viral replication. The presence of ASP on the surfaces of both infected cells and viral particles might be exploited therapeutically.
Affiliation(s)
- Yvonne Affram
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, Maryland, USA
| | - Juan C Zapata
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, Maryland, USA
| | - Zahra Gholizadeh
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, Maryland, USA
| | - William D Tolbert
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, Maryland, USA
| | - Wei Zhou
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, Maryland, USA
| | - Maria D Iglesias-Ussel
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, Maryland, USA
| | - Marzena Pazgier
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, Maryland, USA
| | - Krishanu Ray
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, Maryland, USA
| | - Olga S Latinovic
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, Maryland, USA
| | - Fabio Romerio
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, Maryland, USA
| |
|
45
|
Arendsee Z, Li J, Singh U, Bhandary P, Seetharam A, Wurtele ES. fagin: synteny-based phylostratigraphy and finer classification of young genes. BMC Bioinformatics 2019; 20:440. [PMID: 31455236 PMCID: PMC6712868 DOI: 10.1186/s12859-019-3023-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 03/14/2019] [Accepted: 08/08/2019] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND With every new genome that is sequenced, thousands of species-specific genes (orphans) are found, some originating from ultra-rapid mutations of existing genes, many others originating de novo from non-genic regions of the genome. If some of these genes survive across speciations, then extant organisms will contain a patchwork of genes whose ancestors first appeared at different times. Standard phylostratigraphy, the technique of partitioning genes by their age, is based solely on protein similarity algorithms. However, this approach relies on negative evidence: a failure to detect a homolog of a query gene. An alternative approach is to limit the search for homologs to syntenic regions. Then, genes can be positively identified as de novo orphans by tracing them to non-coding sequences in related species. RESULTS We have developed a synteny-based pipeline in the R framework. Fagin determines the genomic context of each query gene in a focal species compared to homologous sequence in target species. We tested the fagin pipeline on two focal species, Arabidopsis thaliana (plus four target species in Brassicaceae) and Saccharomyces cerevisiae (plus six target species in Saccharomyces). Using microsynteny maps, fagin classified the homology relationship of each query gene against each target genome into three main classes, and further subclasses: AAic (has a coding syntenic homolog), NTic (has a non-coding syntenic homolog), and Unknown (has no detected syntenic homolog). fagin inferred over half the "Unknown" A. thaliana query genes, and about 20% for S. cerevisiae, as lacking a syntenic homolog because of local indels or scrambled synteny. CONCLUSIONS fagin augments standard phylostratigraphy, and extends synteny-based phylostratigraphy with an automated, customizable, and detailed contextual analysis. By comparing synteny-based phylostrata to standard phylostrata, fagin systematically identifies those orphans and lineage-specific genes that are well-supported to have originated de novo. Analyzing within-species genomes should distinguish orphan genes that may have originated through rapid divergence from de novo orphans. Fagin also delineates whether a gene has no syntenic homolog because of technical or biological reasons. These analyses indicate that some orphans may be associated with regions of high genomic perturbation.
Affiliation(s)
- Zebulun Arendsee
- Department of Genetics Development and Cell Biology, Iowa State University, Ames, IA, 50010, USA
- Center for Metabolic Biology, Iowa State University, Ames, IA, 50011, USA
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, 50011, USA
| | - Jing Li
- Department of Genetics Development and Cell Biology, Iowa State University, Ames, IA, 50010, USA
- Center for Metabolic Biology, Iowa State University, Ames, IA, 50011, USA
| | - Urminder Singh
- Department of Genetics Development and Cell Biology, Iowa State University, Ames, IA, 50010, USA
- Center for Metabolic Biology, Iowa State University, Ames, IA, 50011, USA
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, 50011, USA
| | - Priyanka Bhandary
- Department of Genetics Development and Cell Biology, Iowa State University, Ames, IA, 50010, USA
- Center for Metabolic Biology, Iowa State University, Ames, IA, 50011, USA
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, 50011, USA
| | - Arun Seetharam
- Genome Informatics Facility, Office of Biotechnology, Iowa State University, Ames, IA, 50011, USA
| | - Eve Syrkin Wurtele
- Department of Genetics Development and Cell Biology, Iowa State University, Ames, IA, 50010, USA
- Center for Metabolic Biology, Iowa State University, Ames, IA, 50011, USA
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, 50011, USA
| |
|
46
|
Hamsher SE, Keepers KG, Pogoda CS, Stepanek JG, Kane NC, Kociolek JP. Extensive chloroplast genome rearrangement amongst three closely related Halamphora spp. (Bacillariophyceae), and evidence for rapid evolution as compared to land plants. PLoS One 2019; 14:e0217824. [PMID: 31269054 PMCID: PMC6608930 DOI: 10.1371/journal.pone.0217824] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 10/05/2018] [Accepted: 05/21/2019] [Indexed: 01/08/2023] Open
Abstract
Diatoms are the most diverse lineage of algae, but the diversity of their chloroplast genomes, particularly within a genus, has not been well documented. Herein, we present three chloroplast genomes from the genus Halamphora (H. americana, H. calidilacuna, and H. coffeaeformis), the first pennate diatom genus to be represented by more than one species. Halamphora chloroplast genomes ranged in size from ~120 to 150 kb, representing a 24% size difference within the genus. Differences in genome size were due to changes in the length of the inverted repeat region, length of intergenic regions, and the variable presence of ORFs that appear to encode as-yet-undescribed proteins. All three species shared a set of 161 core features but differed in the presence of two genes, serC and tyrC of foreign and unknown origin, respectively. A comparison of these data to three previously published chloroplast genomes in the non-pennate genus Cyclotella (Thalassiosirales) revealed that Halamphora has undergone extensive chloroplast genome rearrangement compared to other genera, as well as containing variation within the genus. Finally, a comparison of Halamphora chloroplast genomes to those of land plants indicates diatom chloroplast genomes within this genus may be evolving at least ~4–7 times faster than those of land plants. Studies such as these provide deeper insights into diatom chloroplast evolution and important genetic resources for future analyses.
Affiliation(s)
- Sarah E. Hamsher
- Department of Biology, Grand Valley State University, Allendale, Michigan, United States of America
- Annis Water Resources Institute, Grand Valley State University, Muskegon, Michigan, United States of America
| | - Kyle G. Keepers
- Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, Colorado, United States of America
| | - Cloe S. Pogoda
- Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, Colorado, United States of America
| | - Joshua G. Stepanek
- Department of Biology, Colorado Mountain College, Edwards, Colorado, United States of America
| | - Nolan C. Kane
- Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, Colorado, United States of America
| | - J. Patrick Kociolek
- Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, Colorado, United States of America
- Museum of Natural History, University of Colorado, Boulder, Colorado, United States of America
| |
|
47
|
Schlub TE, Buchmann JP, Holmes EC. A Simple Method to Detect Candidate Overlapping Genes in Viruses Using Single Genome Sequences. Mol Biol Evol 2018; 35:2572-2581. [PMID: 30099499 PMCID: PMC6188560 DOI: 10.1093/molbev/msy155] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 12/13/2022] Open
Abstract
Overlapping genes in viruses maximize the coding capacity of their genomes and allow the generation of new genes without major increases in genome size. Despite their importance, the evolution and function of overlapping genes are often not well understood, in part due to difficulties in their detection. In addition, most bioinformatic approaches for the detection of overlapping genes require the comparison of multiple genome sequences that may not be available in metagenomic surveys of virus biodiversity. We introduce a simple new method for identifying candidate functional overlapping genes using single virus genome sequences. Our method uses randomization tests to estimate the expected length of open reading frames and then identifies overlapping open reading frames that significantly exceed this length and are thus predicted to be functional. We applied this method to 2548 reference RNA virus genomes and find that it has both high sensitivity and low false discovery for genes that overlap by at least 50 nucleotides. Notably, this analysis provided evidence for 29 previously undiscovered functional overlapping genes, some of which are coded in the antisense direction suggesting there are limitations in our current understanding of RNA virus replication.
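The randomization idea described in this abstract can be illustrated with a minimal sketch. The abstract does not specify how sequences are randomized, so the plain nucleotide shuffle below is an assumption, as are the function names `longest_orf` and `orf_length_pvalue`; it should not be read as the authors' implementation. The sketch measures the longest stop-free run in one reading frame and asks how often a shuffled copy of the same sequence does at least as well.

```python
import random

STOPS = {"TAA", "TAG", "TGA"}


def longest_orf(seq, frame):
    """Length, in codons, of the longest stop-free run in the given frame (0, 1 or 2)."""
    best = run = 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOPS:
            run = 0          # a stop codon ends the current open run
        else:
            run += 1
            best = max(best, run)
    return best


def orf_length_pvalue(seq, frame, n_shuffles=1000, seed=0):
    """Empirical p-value for the observed longest ORF against shuffled sequences.

    Assumption: the null model is a simple shuffle of the nucleotides, which
    preserves base composition only.
    """
    rng = random.Random(seed)
    observed = longest_orf(seq, frame)
    bases = list(seq)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(bases)
        if longest_orf("".join(bases), frame) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_shuffles + 1)
```

A small p-value would flag an open reading frame as longer than expected by chance under this simplified null model, which is the sense in which a candidate overlapping gene is called "predicted to be functional" above.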
Affiliation(s)
- Timothy E Schlub
- Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia
| | - Jan P Buchmann
- Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre, School of Life and Environmental Sciences and Sydney Medical School, The University of Sydney, Sydney, NSW, Australia
| | - Edward C Holmes
- Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre, School of Life and Environmental Sciences and Sydney Medical School, The University of Sydney, Sydney, NSW, Australia
| |
|
48
|
Pavesi A. Asymmetric evolution in viral overlapping genes is a source of selective protein adaptation. Virology 2019; 532:39-47. [PMID: 31004987 PMCID: PMC7125799 DOI: 10.1016/j.virol.2019.03.017] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 01/03/2019] [Revised: 03/25/2019] [Accepted: 03/26/2019] [Indexed: 12/29/2022]
Abstract
Overlapping genes represent an intriguing puzzle, as they encode two proteins whose ability to evolve is constrained by each other. Overlapping genes can undergo “symmetric evolution” (similar selection pressures on the two proteins) or “asymmetric evolution” (significantly different selection pressures on the two proteins). By sequence analysis of 75 pairs of homologous viral overlapping genes, I evaluated their accordance with one or the other model. Analysis of nucleotide and amino acid sequences revealed that half of overlaps undergo asymmetric evolution, as the protein from one frame shows a number of substitutions significantly higher than that of the protein from the other frame. Interestingly, the most variable protein (often known to interact with the host proteins) appeared to be encoded by the de novo frame in all cases examined. These findings suggest that overlapping genes, besides increasing the coding ability of viruses, are also a source of selective protein adaptation. A dataset of 75 pairs of homologous overlapping genes from viruses is examined. Its analysis reveals that half of overlapping genes undergo asymmetric evolution. The most variable gene product is that encoded by the de novo overlapping gene. Overlapping genes evolving asymmetrically are a source of selective protein adaptation.
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area delle Scienze 11/A, I-43124, Parma, Italy
| |
|
49
|
Mancarella A, Procopio FA, Achsel T, De Crignis E, Foley BT, Corradin G, Bagni C, Pantaleo G, Graziosi C. Detection of antisense protein (ASP) RNA transcripts in individuals infected with human immunodeficiency virus type 1 (HIV-1). J Gen Virol 2019; 100:863-876. [PMID: 30896385 DOI: 10.1099/jgv.0.001244] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 12/19/2022] Open
Abstract
The detection of antisense RNA is hampered by reverse transcription (RT) non-specific priming, due to the ability of RNA secondary structures to prime RT in the absence of specific primers. The detection of antisense RNA by conventional RT-PCR does not allow assessment of the polarity of the initial RNA template, causing the amplification of non-specific cDNAs. In this study we have developed a modified protocol for the detection of human immunodeficiency virus type 1 (HIV-1) antisense protein (ASP) RNA. Using this approach, we have identified ASP transcripts in CD4+ T cells isolated from five HIV-infected individuals, either untreated or under suppressive therapy. We show that ASP RNA can be detected in stimulated CD4+ T cells from both groups of patients, but not in unstimulated cells. We also show that in untreated patients, the patterns of expression of ASP and env are very similar, with the levels of ASP RNA being markedly lower than those of env. Treatment of cells from one viraemic patient with α-amanitin greatly reduces the rate of ASP RNA synthesis, suggesting that it is associated with RNA polymerase II, the central enzyme in the transcription of protein-coding genes. Our data represent the first nucleotide sequences obtained in patients for ASP, demonstrating that its transcription indeed occurs in those HIV-1 lineages in which the ASP open reading frame is present.
Affiliation(s)
- Antonio Mancarella
- Division of Immunology and Allergy, Lausanne University Hospital, Switzerland
| | - Tilmann Achsel
- Department of Fundamental Neuroscience, University of Lausanne, Switzerland
| | - Elisa De Crignis
- Department of Biochemistry, Erasmus Medical Center, Rotterdam, The Netherlands
- Present address: Clinical Trial Office, CRO Aviano National Cancer Institute, Aviano, Italy
| | - Brian T Foley
- Theoretical Biology and Biophysics Group, Los Alamos National Laboratories, Los Alamos, New Mexico, USA
| | - Claudia Bagni
- Department of Fundamental Neuroscience, University of Lausanne, Switzerland
| | - Giuseppe Pantaleo
- Division of Immunology and Allergy, Lausanne University Hospital, Switzerland
| | - Cecilia Graziosi
- Division of Immunology and Allergy, Lausanne University Hospital, Switzerland
| |
|
50
|
Puustusmaa M, Abroi A. cRegions-a tool for detecting conserved cis-elements in multiple sequence alignment of diverged coding sequences. PeerJ 2019; 6:e6176. [PMID: 30647994 PMCID: PMC6330207 DOI: 10.7717/peerj.6176] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/18/2018] [Accepted: 11/27/2018] [Indexed: 12/31/2022] Open
Abstract
Identifying cis-acting elements and understanding regulatory mechanisms of a gene is crucial to fully understand the molecular biology of an organism. In general, it is difficult to identify previously uncharacterised cis-acting elements with an unknown consensus sequence. The task is especially problematic with viruses containing regions of limited or no similarity to other previously characterised sequences. Fortunately, the fast increase in the number of sequenced genomes allows us to detect some of these elusive cis-elements. In this work, we introduce a web-based tool called cRegions. It was developed to identify regions within a protein-coding sequence where the conservation in the amino acid sequence is caused by the conservation in the nucleotide sequence. cRegions can be the first step in discovering novel cis-acting sequences from diverged protein-coding genes. The results can be used as a basis for future experimental analysis. We applied cRegions to the non-structural and structural polyproteins of alphaviruses as an example and successfully detected all known cis-acting elements. In this publication and in previous work, we have shown that cRegions is able to detect a wide variety of functional elements in DNA and RNA viruses. These functional elements include splice sites, stem-loops, overlapping reading frames, internal promoters, ribosome frameshifting signals and other embedded elements with yet unknown function. The cRegions web tool is available at http://bioinfo.ut.ee/cRegions/.
Affiliation(s)
- Mikk Puustusmaa
- Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
| | - Aare Abroi
- Institute of Technology, University of Tartu, Tartu, Estonia
| |
|