1
Olo Ndela E, Roux S, Henke C, Sczyrba A, Sime Ngando T, Varsani A, Enault F. Reekeekee- and roodoodooviruses, two different Microviridae clades constituted by the smallest DNA phages. Virus Evol 2022; 9:veac123. [PMID: 36694818 PMCID: PMC9865509 DOI: 10.1093/ve/veac123] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/15/2022] [Revised: 10/19/2022] [Accepted: 12/22/2022] [Indexed: 12/25/2022]
Abstract
Small circular single-stranded DNA viruses of the Microviridae family are both prevalent and diverse in all ecosystems. They usually harbor a genome between 4.3 and 6.3 kb, with a microvirus recently isolated from a marine Alphaproteobacteria being the smallest known genome of a DNA phage (4.248 kb). A subfamily, Amoyvirinae, has been proposed to classify this virus and other related small Alphaproteobacteria-infecting phages. Here, we report the discovery, in meta-omics data sets from various aquatic ecosystems, of sixteen complete microvirus genomes significantly smaller (2.991-3.692 kb) than known ones. Phylogenetic analysis reveals that these sixteen genomes represent two related, yet distinct and diverse, novel groups of microviruses, with amoyviruses being their closest known relatives. We propose that these small microviruses are members of two tentatively named subfamilies Reekeekeevirinae and Roodoodoovirinae. As known microvirus genomes encode many overlapping and overprinted genes that are not identified by gene prediction software, we developed a new methodology to identify all genes based on protein conservation, amino acid composition, and selection pressure estimations. Surprisingly, only four to five genes could be identified per genome, with the number of overprinted genes lower than that in phiX174. These small genomes thus tend to have both a lower number of genes and a shorter length for each gene, leaving no place for variable gene regions that could harbor overprinted genes. Even more surprisingly, these two Microviridae groups had specific and different gene content, and major differences in their conserved protein sequences, highlighting that these two related groups of small genome microviruses use very different strategies to fulfill their lifecycle with such a small number of genes.
The discovery of these genomes and the detailed prediction and annotation of their genome content expand our understanding of ssDNA phages in nature and are further evidence that these viruses have explored a wide range of possibilities during their long evolution.
Affiliation(s)
- Christian Henke
- Computational Metagenomics, Bielefeld University, Universitätsstraße 27, Bielefeld 30501, Germany; Center for Biotechnology, Bielefeld University, Universitätsstraße 27, Bielefeld 33615, Germany
- Alexander Sczyrba
- Computational Metagenomics, Bielefeld University, Universitätsstraße 27, Bielefeld 30501, Germany; Center for Biotechnology, Bielefeld University, Universitätsstraße 27, Bielefeld 33615, Germany
- Télesphore Sime Ngando
- Université Clermont Auvergne, CNRS, Laboratoire Microorganismes: Genome et Environnement, Clermont-Ferrand F-63000, France
2
Pley C, Lourenço J, McNaughton AL, Matthews PC. Spacer Domain in Hepatitis B Virus Polymerase: Plugging a Hole or Performing a Role? J Virol 2022; 96:e0005122. [PMID: 35412348 PMCID: PMC9093120 DOI: 10.1128/jvi.00051-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2022] [Accepted: 03/14/2022] [Indexed: 11/25/2022]
Abstract
Hepatitis B virus (HBV) polymerase is divided into terminal protein, spacer, reverse transcriptase, and RNase domains. Spacer has previously been considered dispensable, merely acting as a tether between other domains or providing plasticity to accommodate deletions and mutations. We explore evidence for the role of spacer sequence, structure, and function in HBV evolution and lineage, consider its associations with escape from drugs, vaccines, and immune responses, and review its potential impacts on disease outcomes.
Affiliation(s)
- Caitlin Pley
- School of Clinical Medicine, University of Cambridge, Cambridge, United Kingdom
- Guy’s and St Thomas’ NHS Foundation Trust, London, United Kingdom
- José Lourenço
- Department of Zoology, University of Oxford, Oxford, United Kingdom
- Biosystems and Integrative Sciences Institute, University of Lisbon, Lisbon, Portugal
- Anna L. McNaughton
- Population Health Science, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Nuffield Department of Medicine, University of Oxford Medawar Building, Oxford, United Kingdom
- Philippa C. Matthews
- Nuffield Department of Medicine, University of Oxford Medawar Building, Oxford, United Kingdom
- The Francis Crick Institute, London, United Kingdom
- Division of Infection and Immunity, University College London, London, United Kingdom
3
Safari M, Jayaraman B, Yang S, Smith C, Fernandes JD, Frankel AD. Functional and structural segregation of overlapping helices in HIV-1. eLife 2022; 11:72482. [PMID: 35511220 PMCID: PMC9119678 DOI: 10.7554/elife.72482] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/26/2021] [Accepted: 04/19/2022] [Indexed: 11/13/2022]
Abstract
Overlapping coding regions balance selective forces between multiple genes. One possible division of nucleotide sequence is that the predominant selective force on a particular nucleotide can be attributed to just one gene. While this arrangement has been observed in regions in which one gene is structured and the other is disordered, we sought to explore how overlapping genes balance constraints when both protein products are structured over the same sequence. We use a combination of sequence analysis, functional assays, and selection experiments to examine an overlapped region in HIV-1 that encodes helical regions in both Env and Rev. We find that functional segregation occurs even in this overlap, with each protein spacing its functional residues in a manner that allows a mutable non-binding face of one helix to encode important functional residues on a charged face in the other helix. Additionally, our experiments reveal novel and critical functional residues in Env and have implications for the therapeutic targeting of HIV-1.
Affiliation(s)
- Maliheh Safari
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, United States
- Bhargavi Jayaraman
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, United States
- Shumin Yang
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, United States; School of Medicine, Tsinghua University, Beijing, China
- Cynthia Smith
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, United States
- Jason D Fernandes
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, United States
- Alan D Frankel
- Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, United States
4
Gene Overlapping as a Modulator of Begomovirus Evolution. Microorganisms 2022; 10:microorganisms10020366. [PMID: 35208820 PMCID: PMC8875319 DOI: 10.3390/microorganisms10020366] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/17/2021] [Revised: 02/01/2022] [Accepted: 02/01/2022] [Indexed: 02/06/2023]
Abstract
In RNA viruses, which have high mutation—and fast evolutionary— rates, gene overlapping (i.e., genomic regions that encode more than one protein) is a major factor controlling mutational load and therefore the virus evolvability. Although DNA viruses use host high-fidelity polymerases for their replication, and therefore should have lower mutation rates, it has been shown that some of them have evolutionary rates comparable to those of RNA viruses. Notably, these viruses have large proportions of their genes with at least one overlapping instance. Hence, gene overlapping could be a modulator of virus evolution beyond the RNA world. To test this hypothesis, we use the genus Begomovirus of plant viruses as a model. Through comparative genomic approaches, we show that terminal gene overlapping decreases the rate of virus evolution, which is associated with lower frequency of both synonymous and nonsynonymous mutations. In contrast, terminal overlapping has little effect on the pace of virus evolution. Overall, our analyses support a role for gene overlapping in the evolution of begomoviruses and provide novel information on the factors that shape their genetic diversity.
5
Piontkivska H, Wales-McGrath B, Miyamoto M, Wayne ML. ADAR Editing in Viruses: An Evolutionary Force to Reckon with. Genome Biol Evol 2021; 13:evab240. [PMID: 34694399 PMCID: PMC8586724 DOI: 10.1093/gbe/evab240] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Accepted: 10/21/2021] [Indexed: 02/06/2023]
Abstract
Adenosine Deaminases that Act on RNA (ADARs) are RNA editing enzymes that play a dynamic and nuanced role in regulating transcriptome and proteome diversity. This editing can be highly selective, affecting a specific site within a transcript, or nonselective, resulting in hyperediting. ADAR editing is important for regulating neural functions and autoimmunity, and has a key role in the innate immune response to viral infections, where editing can have a range of pro- or antiviral effects and can contribute to viral evolution. Here we examine the role of ADAR editing across a broad range of viral groups. We propose that the effect of ADAR editing on viral replication, whether pro- or antiviral, is better viewed as an axis rather than a binary, and that the specific position of a given virus on this axis is highly dependent on virus- and host-specific factors, and can change over the course of infection. However, more research needs to be devoted to understanding these dynamic factors and how they affect virus-ADAR interactions and viral evolution. Another area that warrants significant attention is the effect of virus-ADAR interactions on host-ADAR interactions, particularly in light of the crucial role of ADAR in regulating neural functions. Answering these questions will be essential to developing our understanding of the relationship between ADAR editing and viral infection. In turn, this will further our understanding of the effects of viruses such as SARS-CoV-2, as well as many others, and thereby influence our approach to treating these deadly diseases.
Affiliation(s)
- Helen Piontkivska
- Department of Biological Sciences, Kent State University, Ohio, USA
- School of Biomedical Sciences, Kent State University, Ohio, USA
- Brain Health Research Institute, Kent State University, Ohio, USA
- Michael Miyamoto
- Department of Biology, University of Florida, Gainesville, Florida, USA
- Marta L Wayne
- Department of Biology, University of Florida, Gainesville, Florida, USA
6
De Cahsan B, Kiemel K, Westbury MV, Lauritsen M, Autenrieth M, Gollmann G, Schweiger S, Stenberg M, Nyström P, Drews H, Tiedemann R. Southern introgression increases adaptive immune gene variability in northern range margin populations of Fire-bellied toad. Ecol Evol 2021; 11:9776-9790. [PMID: 34306661 PMCID: PMC8293767 DOI: 10.1002/ece3.7805] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2021] [Revised: 05/14/2021] [Accepted: 05/28/2021] [Indexed: 11/20/2022]
Abstract
Northern range margin populations of the European fire-bellied toad (Bombina bombina) have rapidly declined during recent decades. Extensive agricultural land use has fragmented the landscape, leading to habitat disruption and loss, as well as eutrophication of ponds. In Northern Germany (Schleswig-Holstein) and Southern Sweden (Skåne), this population decline resulted in decreased gene flow from surrounding populations, low genetic diversity, and a putative reduction in adaptive potential, leaving populations vulnerable to future environmental and climatic changes. Previous studies using mitochondrial control region and nuclear transcriptome-wide SNP data detected introgressive hybridization in multiple northern B. bombina populations after unreported release of toads from Austria. Here, we determine the impact of this introgression by comparing the body conditions (proxy for fitness) of introgressed and nonintrogressed populations and the genetic consequences in two candidate genes for putative local adaptation (the MHC II gene as part of the adaptive immune system and the stress response gene HSP70 kDa). We detected regional differences in body condition and observed significantly elevated levels of within individual MHC allele counts in introgressed Swedish populations, associated with a tendency toward higher body weight, relative to regional nonintrogressed populations. These differences were not observed among introgressed and nonintrogressed German populations. Genetic diversity in both MHC and HSP was generally lower in northern than Austrian populations. Our study sheds light on the potential benefits of translocations of more distantly related conspecifics as a means to increase adaptive genetic variability and fitness of genetically depauperate range margin populations without distortion of local adaptation.
Affiliation(s)
- Binia De Cahsan
- Unit of Evolutionary Biology/Systematic Zoology, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
- Katrin Kiemel
- Unit of Evolutionary Biology/Systematic Zoology, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Maike Lauritsen
- Unit of Evolutionary Biology/Systematic Zoology, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Marijke Autenrieth
- Unit of Evolutionary Biology/Systematic Zoology, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Günter Gollmann
- Department of Evolutionary Biology, University of Vienna, Vienna, Austria
- Silke Schweiger
- Herpetological Collection, Natural History Museum Vienna, Vienna, Austria
- Hauke Drews
- Stiftung Naturschutz Schleswig‐Holstein, Molfsee, Germany
- Ralph Tiedemann
- Unit of Evolutionary Biology/Systematic Zoology, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
7
Genetic and phylogenetic characterization of polycistronic dsRNA segment-10 of bluetongue virus isolates from India between 1985 and 2011. Virus Genes 2021; 57:369-379. [PMID: 34120252 DOI: 10.1007/s11262-021-01855-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/20/2021] [Accepted: 06/08/2021] [Indexed: 01/07/2023]
Abstract
The smallest polycistronic dsRNA segment-10 (S10) of bluetongue virus (BTV) encodes NS3/3A and putative NS5. The S10 sequence data of 46 Indian BTV field isolates obtained between 1985 and 2011 were determined and compared with the cognate sequences of global BTV strains. The largest ORF on S10 encodes NS3 (229 aa) and an amino-terminal truncated form of the protein (NS3A) and a putative NS5 (50-59 aa) due to alternate translation initiation site. The overall mean distance of the global NS3 was 0.1106 and 0.0269 at nt and deduced aa sequence, respectively. The global BTV strains formed four major clusters. The major cluster of Indian BTV strains was closely related to the viruses reported from Australia and China. A minor sub-cluster of Indian BTV strains were closely related to the USA strains and a few of the Indian strains were similar to the South African reference and vaccine strains. The global trait association of phylogenetic structure indicates the evolution of the global BTV S10 was not homogenous but rather represents a moderate level of geographical divergence. There was no evidence of an association between the virus and the host species, suggesting a random spread of the viruses. Conflicting selection pressure on the alternate coding sequences of the S10 was evident where NS3/3A might have evolved through strong purifying (negative) selection and NS5 through a positive selection. The presence of multiple positively selected codons on the putative NS5 may be advantageous for adaptation of the virus though their precise role is unknown.
8
Pavesi A. Origin, Evolution and Stability of Overlapping Genes in Viruses: A Systematic Review. Genes (Basel) 2021; 12:genes12060809. [PMID: 34073395 PMCID: PMC8227390 DOI: 10.3390/genes12060809] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 05/06/2021] [Revised: 05/22/2021] [Accepted: 05/24/2021] [Indexed: 12/11/2022]
Abstract
During their long evolutionary history viruses generated many proteins de novo by a mechanism called “overprinting”. Overprinting is a process in which critical nucleotide substitutions in a pre-existing gene can induce the expression of a novel protein by translation of an alternative open reading frame (ORF). Overlapping genes represent an intriguing example of adaptive conflict, because they simultaneously encode two proteins whose freedom to change is constrained by each other. However, overlapping genes are also a source of genetic novelties, as the constraints under which alternative ORFs evolve can give rise to proteins with unusual sequence properties, most importantly the potential for novel functions. Starting with the discovery of overlapping genes in phages infecting Escherichia coli, this review covers a range of studies dealing with detection of overlapping genes in small eukaryotic viruses (genomic length below 30 kb) and recognition of their critical role in the evolution of pathogenicity. Origin of overlapping genes, what factors favor their birth and retention, and how they manage their inherent adaptive conflict are extensively reviewed. Special attention is paid to the assembly of overlapping genes into ad hoc databases, suitable for future studies, and to the development of statistical methods for exploring viral genome sequences in search of undiscovered overlaps.
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area delle Scienze 23/A, I-43124 Parma, Italy
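The overprinting mechanism described in the abstract of entry 8 (a second protein read from an alternative open reading frame of an existing gene) can be illustrated with a minimal Python sketch. It is not a method from the review: the function names `find_orfs` and `overlapping_pairs`, the `min_codons` threshold, and the toy sequence are all hypothetical, and only the three forward frames with the standard stop codons are scanned.

```python
# Illustrative sketch only: names, threshold, and sequence are assumptions.
STOPS = {"TAA", "TAG", "TGA"}  # standard genetic code stop codons


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for each ATG...stop ORF in the forward frames.

    `end` is exclusive and includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


def overlapping_pairs(orfs):
    """Return pairs of ORFs that lie in different frames and share coordinates."""
    pairs = []
    for a in range(len(orfs)):
        for b in range(a + 1, len(orfs)):
            s1, e1, f1 = orfs[a]
            s2, e2, f2 = orfs[b]
            if f1 != f2 and s1 < e2 and s2 < e1:
                pairs.append((orfs[a], orfs[b]))
    return pairs


# Toy sequence: a frame-0 ORF with a shorter frame-1 ORF nested inside it.
toy = "ATGCATGCAGGCCGACTAAGCTGA"
orfs = find_orfs(toy)
print(orfs)
print(overlapping_pairs(orfs))
```

A real search for undiscovered overlaps would also need the reverse strand, alternative start codons, and a statistical test that the second frame is conserved, which this sketch deliberately leaves out.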
9
Pavesi A. Asymmetric evolution in viral overlapping genes is a source of selective protein adaptation. Virology 2019; 532:39-47. [PMID: 31004987 PMCID: PMC7125799 DOI: 10.1016/j.virol.2019.03.017] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 01/03/2019] [Revised: 03/25/2019] [Accepted: 03/26/2019] [Indexed: 12/29/2022]
Abstract
Overlapping genes represent an intriguing puzzle, as they encode two proteins whose ability to evolve is constrained by each other. Overlapping genes can undergo “symmetric evolution” (similar selection pressures on the two proteins) or “asymmetric evolution” (significantly different selection pressures on the two proteins). By sequence analysis of 75 pairs of homologous viral overlapping genes, I evaluated their accordance with one or the other model. Analysis of nucleotide and amino acid sequences revealed that half of overlaps undergo asymmetric evolution, as the protein from one frame shows a number of substitutions significantly higher than that of the protein from the other frame. Interestingly, the most variable protein (often known to interact with the host proteins) appeared to be encoded by the de novo frame in all cases examined. These findings suggest that overlapping genes, besides increasing the coding ability of viruses, are also a source of selective protein adaptation. A dataset of 75 pairs of homologous overlapping genes from viruses is examined. Its analysis reveals that half of overlapping genes undergo asymmetric evolution. The most variable gene product is that encoded by the de novo overlapping gene. Overlapping genes evolving asymmetrically are a source of selective protein adaptation.
Affiliation(s)
- Angelo Pavesi
- Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parco Area delle Scienze 11/A, I-43124, Parma, Italy.
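The comparison behind "asymmetric evolution" in entry 9 (one frame of an overlap accumulating more amino acid substitutions than the other) can be sketched in Python as below. This is only a counting illustration, not the statistical procedure of the paper, which the abstract does not specify. The helper names `translate` and `aa_substitutions` and the two toy sequences are hypothetical, and the sequences are assumed to be pre-aligned, of equal length, and free of indels.

```python
from itertools import product

# Standard genetic code, built in TCAG order (an assumption of this sketch).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame):
    """Translate complete codons of `seq` starting at offset `frame` (0, 1 or 2)."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]]
                   for i in range(frame, len(seq) - 2, 3))


def aa_substitutions(seq_a, seq_b, frame):
    """Count amino acid differences between two aligned sequences in one frame."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    prot_a, prot_b = translate(seq_a, frame), translate(seq_b, frame)
    return sum(1 for x, y in zip(prot_a, prot_b) if x != y)


# Hypothetical homologs differing at one nucleotide: the change is
# synonymous in frame 0 (GCA -> GCG, Ala) but not in frame 1 (CAG -> CGG).
homolog_a = "ATGCATGCAGGCCGACTAAGCTGA"
homolog_b = "ATGCATGCGGGCCGACTAAGCTGA"
for frame in (0, 1):
    print(frame, aa_substitutions(homolog_a, homolog_b, frame))
```

The single nucleotide change leaves the frame-0 protein intact while altering the frame-1 protein, which is the kind of imbalance that, accumulated over many sites and tested for significance, would be read as asymmetric evolution.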
10
Adaptive evolution of proteins in hepatitis B virus during divergence of genotypes. Sci Rep 2017; 7:1990. [PMID: 28512348 PMCID: PMC5434055 DOI: 10.1038/s41598-017-02012-8] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 01/05/2017] [Accepted: 04/03/2017] [Indexed: 12/12/2022]
Abstract
Hepatitis B virus (HBV) is classified into several genotypes, correlated with different geographic distributions, clinical outcomes and susceptible human populations. It is crucial to investigate the evolutionary significance behind the diversification of HBV genotypes, because it improves our understanding of their pathological differences and pathogen-host interactions. Here, we performed a comprehensive analysis of HBV genome sequences collected from public databases. Using stringent criteria, we generated a dataset of 2992 HBV genomes from eight major genotypes. In particular, we applied a specified classification of non-synonymous and synonymous variants in overlapping regions, to distinguish joint and independent gene evolution. We confirmed the presence of selective constraints over non-synonymous variants in consideration of overlapping regions. We then performed the McDonald-Kreitman test and revealed adaptive evolution of non-synonymous variants during genotypic differentiation. Remarkably, we identified strong positive selection that drove the differentiation of the PreS1 domain, which is an essential regulator involved in viral transmission. Our study presents novel evidence for the adaptive evolution of HBV genotypes, which suggests that these viruses evolve directionally for maintenance or improvement of successful infections.
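The McDonald-Kreitman test named in the abstract of entry 10 compares fixed differences between groups with polymorphisms within them, each split into non-synonymous and synonymous classes. The Python sketch below shows the generic test only: it does not reproduce the study's special handling of overlapping regions, the function names are hypothetical, and the counts in the example are invented for illustration rather than taken from the HBV data.

```python
from math import comb


def neutrality_index(dn, ds, pn, ps):
    """Return (NI, alpha) from fixed (dn, ds) and polymorphic (pn, ps) counts.

    NI = (Pn/Ps) / (Dn/Ds); alpha = 1 - NI estimates the fraction of
    non-synonymous fixed differences attributable to adaptive evolution.
    """
    if ps == 0 or dn == 0 or ds == 0:
        raise ValueError("ps, dn and ds must be non-zero")
    ni = (pn / ps) / (dn / ds)
    return ni, 1.0 - ni


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Invented counts: 7 non-synonymous and 17 synonymous fixed differences,
# 2 non-synonymous and 42 synonymous polymorphisms.
ni, alpha = neutrality_index(dn=7, ds=17, pn=2, ps=42)
p_value = fisher_exact_two_sided(7, 17, 2, 42)
print(ni, alpha, p_value)
```

An NI below 1 (alpha above 0) with a small p-value indicates an excess of non-synonymous fixed differences, the pattern usually interpreted as adaptive evolution.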
11
Fernandes JD, Faust TB, Strauli NB, Smith C, Crosby DC, Nakamura RL, Hernandez RD, Frankel AD. Functional Segregation of Overlapping Genes in HIV. Cell 2016; 167:1762-1773.e12. [PMID: 27984726 DOI: 10.1016/j.cell.2016.11.031] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 05/22/2016] [Revised: 09/29/2016] [Accepted: 11/15/2016] [Indexed: 11/28/2022]
Abstract
Overlapping genes pose an evolutionary dilemma as one DNA sequence evolves under the selection pressures of multiple proteins. Here, we perform systematic statistical and mutational analyses of the overlapping HIV-1 genes tat and rev and engineer exhaustive libraries of non-overlapped viruses to perform deep mutational scanning of each gene independently. We find a "segregated" organization in which overlapped sites encode functional residues of one gene or the other, but never both. Furthermore, this organization eliminates unfit genotypes, providing a fitness advantage to the population. Our comprehensive analysis reveals the extraordinary manner in which HIV minimizes the constraint of overlapping genes and repurposes that constraint to its own advantage. Thus, overlaps are not just consequences of evolutionary constraints, but rather can provide population fitness advantages.
Affiliation(s)
- Jason D Fernandes
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA; Program in Pharmaceutical Sciences and Pharmacogenomics, University of California San Francisco, San Francisco, CA 94158, USA
- Tyler B Faust
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA; Tetrad Program, Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA
- Nicolas B Strauli
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA; Biomedical Sciences Graduate Program, University of California San Francisco, San Francisco, CA 94158, USA
- Cynthia Smith
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA
- David C Crosby
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA
- Robert L Nakamura
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA
- Ryan D Hernandez
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Alan D Frankel
- Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA 94158, USA.
12
Saha D, Podder S, Ghosh TC. Overlapping Regions in HIV-1 Genome Act as Potential Sites for Host-Virus Interaction. Front Microbiol 2016; 7:1735. [PMID: 27867372 PMCID: PMC5095123 DOI: 10.3389/fmicb.2016.01735] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 07/05/2016] [Accepted: 10/17/2016] [Indexed: 01/05/2023]
Abstract
For more than a decade, overlapping genes in RNA viruses have been a subject of research exploring the various effects of gene overlapping on the evolution and function of viral genomes, such as genome size compaction. Additionally, overlapping regions (OVRs) are reported to encode an elevated degree of protein intrinsic disorder (PID) in unspliced RNA viruses. With the aim of exploring the roles of OVRs in HIV-1 pathogenesis, we carried out an in-depth analysis of the association of gene overlapping with PID in 35 HIV-1 M subtypes. Our study reveals an overrepresentation of PID in the OVRs of HIV-1 genomes. These disordered residues carry several vital structural features, such as short linear motifs (SLiMs) and protein phosphorylation (PP) sites, which were previously shown to be involved in massive host–virus interaction. Moreover, SLiMs in OVRs appear to have greater functional potential than those in non-overlapping regions. Although the density of experimentally verified SLiMs involved in host–virus interaction, which reside in 9 HIV-1 genes, does not show any bias toward clustering in OVRs, tat and rev, two important proteins, mediate host–pathogen interaction through their experimentally verified SLiMs, which are mostly localized in OVRs. Finally, our analysis suggests that the acquisition of SLiMs in OVRs is mutually exclusive of the occurrence of disordered residues, while the enrichment of PPs in OVRs is solely dependent on PID and not on overlapping coding frames. Thus, the OVRs of HIV-1 genomes could be demarcated as potential molecular recognition sites during host–virus interaction.
Affiliation(s)
- Deeya Saha
- Bioinformatics Centre, Bose Institute Kolkata, India
- Soumita Podder
- Department of Microbiology, Raiganj University Raiganj, India
13
Concomitant emergence of the antisense protein gene of HIV-1 and of the pandemic. Proc Natl Acad Sci U S A 2016; 113:11537-11542. [PMID: 27681623 DOI: 10.1073/pnas.1605739113] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Indexed: 12/22/2022]
Abstract
Recent experiments provide sound arguments in favor of the in vivo expression of the AntiSense Protein (ASP) of HIV-1. This putative protein is encoded on the antisense strand of the provirus genome and entirely overlapped by the env gene with reading frame -2. The existence of ASP was suggested in 1988, but is still controversial, and its function has yet to be determined. We used a large dataset of ∼23,000 HIV-1 and SIV sequences to study the origin, evolution, and conservation of the asp gene. We found that the ASP ORF is specific to group M of HIV-1, which is responsible for the human pandemic. Moreover, the correlation between the presence of asp and the prevalence of HIV-1 groups and M subtypes appeared to be statistically significant. We then looked for evidence of selection pressure acting on asp. Using computer simulations, we showed that the conservation of the ASP ORF in the group M could not be due to chance. Standard methods were ineffective in disentangling the two selection pressures imposed by both the Env and ASP proteins, an expected outcome with overlaps in frame -2. We thus developed a method based on careful evolutionary analysis of the presence/absence of stop codons, revealing that ASP does impose significant selection pressure. All of these results support the idea that asp is the 10th gene of HIV-1 group M and indicate a correlation with the spread of the pandemic.
14
Allison JR, Lechner M, Hoeppner MP, Poole AM. Positive Selection or Free to Vary? Assessing the Functional Significance of Sequence Change Using Molecular Dynamics. PLoS One 2016; 11:e0147619. [PMID: 26871901 PMCID: PMC4752228 DOI: 10.1371/journal.pone.0147619] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 07/17/2015] [Accepted: 01/06/2016] [Indexed: 11/18/2022]
Abstract
Evolutionary arms races between pathogens and their hosts may be manifested as selection for rapid evolutionary change of key genes, and are sometimes detectable through sequence-level analyses. In the case of protein-coding genes, such analyses frequently predict that specific codons are under positive selection. However, detecting positive selection can be non-trivial, and false positive predictions are a common concern in such analyses. It is therefore helpful to place such predictions within a structural and functional context. Here, we focus on the p19 protein from tombusviruses. P19 is a homodimer that sequesters siRNAs, thereby preventing the host RNAi machinery from shutting down viral infection. Sequence analysis of the p19 gene is complicated by the fact that it is constrained at the sequence level by overprinting of a viral movement protein gene. Using homology modeling, in silico mutation and molecular dynamics simulations, we assess how non-synonymous changes to two residues involved in forming the dimer interface (one invariant, and one predicted to be under positive selection) impact molecular function. Interestingly, we find that both observed variation and potential variation (where a non-synonymous change to p19 would be synonymous for the overprinted movement protein) do not significantly impact protein structure or RNA binding. Consequently, while several methods identify residues at the dimer interface as being under positive selection, MD results suggest they are functionally indistinguishable from a site that is free to vary. Our analyses serve as a caveat to using sequence-level analyses in isolation to detect and assess positive selection, and emphasize the importance of also accounting for how non-synonymous changes impact structure and function.
Affiliation(s)
- Jane R. Allison
  - Centre for Theoretical Chemistry and Physics & Institute of Natural and Mathematical Sciences, Massey University Albany, Auckland, New Zealand
  - Biomolecular Interaction Centre, University of Canterbury, Christchurch, New Zealand
  - Maurice Wilkins Centre for Molecular Biodiscovery, Massey University Albany, Auckland, New Zealand
  - * E-mail: (JA); (AP)
- Marcus Lechner
  - Department of Pharmaceutical Chemistry, Philipps-University Marburg, Marburg, Germany
- Marc P. Hoeppner
  - Christian-Albrechts-University of Kiel, Institute of Clinical Molecular Biology, Kiel, Germany
- Anthony M. Poole
  - Biomolecular Interaction Centre, University of Canterbury, Christchurch, New Zealand
  - School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
  - * E-mail: (JA); (AP)
|
15
|
Stewart M, Hardy A, Barry G, Pinto RM, Caporale M, Melzi E, Hughes J, Taggart A, Janowicz A, Varela M, Ratinier M, Palmarini M. Characterization of a second open reading frame in genome segment 10 of bluetongue virus. J Gen Virol 2015; 96:3280-3293. [PMID: 26290332 PMCID: PMC4806581 DOI: 10.1099/jgv.0.000267] [Citation(s) in RCA: 73] [Impact Index Per Article: 8.1] [Indexed: 12/17/2022]
Abstract
Viruses have often evolved overlapping reading frames in order to maximize their coding capacity. Until recently, the segmented dsRNA genome of viruses of the Orbivirus genus was thought to be monocistronic, but the identification of the bluetongue virus (BTV) NS4 protein changed this assumption. A small ORF in segment 10, overlapping the NS3 ORF in the +1 position, is maintained in more than 300 strains of the 27 different BTV serotypes and in more than 200 strains of the phylogenetically related African horse sickness virus (AHSV). In BTV, this ORF (named S10-ORF2 in this study) encodes a putative protein 50–59 residues in length and appears to be under strong positive selection. HA- or GFP-tagged versions of S10-ORF2 expressed from transfected plasmids localized within the nucleoli of transfected cells, unless a putative nucleolar localization signal was mutated. S10-ORF2 inhibited gene expression, but not RNA translation, in transient transfection reporter assays. In both mammalian and insect cells, BTV S10-ORF2 deletion mutants (BTV8ΔS10-ORF2) displayed similar replication kinetics to wt virus. In vivo, S10-ORF2 deletion mutants were pathogenic in mouse models of disease. Although further evidence is required for S10-ORF2 expression during infection, the data presented provide an initial characterization of this ORF.
Affiliation(s)
- Meredith Stewart
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
- Alexandra Hardy
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
- Gerald Barry
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
- Rute Maria Pinto
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
- Marco Caporale
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
  - Istituto Zooprofilattico Sperimentale dell'Abruzzo e Molise 'G. Caporale', Teramo, Italy
- Eleonora Melzi
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
- Joseph Hughes
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
- Aislynn Taggart
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
- Anna Janowicz
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
- Mariana Varela
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
- Maxime Ratinier
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
|
16
|
Nelson CW, Hughes AL. Within-host nucleotide diversity of virus populations: insights from next-generation sequencing. Infect Genet Evol 2015; 30:1-7. [PMID: 25481279 PMCID: PMC4316684 DOI: 10.1016/j.meegid.2014.11.026] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Received: 10/02/2014] [Revised: 11/26/2014] [Accepted: 11/27/2014] [Indexed: 01/03/2023]
Abstract
Next-generation sequencing (NGS) technology offers new opportunities for understanding the evolution and dynamics of viral populations within individual hosts over the course of infection. We review simple methods for estimating synonymous and nonsynonymous nucleotide diversity in viral genes from NGS data without the need for inferring linkage. We discuss the potential usefulness of these data for addressing questions of both practical and theoretical interest, including fundamental questions regarding the effective population sizes of within-host viral populations and the modes of natural selection acting on them.
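As a rough illustration of the kind of estimate this review discusses, per-site nucleotide diversity can be computed from read counts alone, without phasing reads into haplotypes. The sketch below is a simplified assumption rather than the authors' procedure: it scores each site by the fraction of read pairs that differ and omits the partition into synonymous and nonsynonymous differences. The function names and example counts are hypothetical.

```python
from itertools import combinations


def site_diversity(counts):
    """Fraction of read pairs that differ at one site.

    counts maps each observed nucleotide to its read count at the site.
    """
    n = sum(counts.values())
    if n < 2:
        return 0.0
    # Number of read pairs carrying two different nucleotides.
    differing = sum(a * b for a, b in combinations(counts.values(), 2))
    # Total number of read pairs covering the site.
    total_pairs = n * (n - 1) / 2
    return differing / total_pairs


def mean_diversity(sites):
    """Average per-site diversity over a list of per-site count dicts."""
    return sum(site_diversity(c) for c in sites) / len(sites)


# Hypothetical pileup: one polymorphic site and one fixed site.
example_sites = [{"A": 3, "G": 1}, {"A": 4}]
print(mean_diversity(example_sites))
```

Because every site is treated independently, no linkage between variants has to be inferred, which is the property the abstract highlights.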
Affiliation(s)
- Chase W Nelson
  - Department of Biological Sciences, University of South Carolina, Columbia, SC 29208, USA
- Austin L Hughes
  - Department of Biological Sciences, University of South Carolina, Columbia, SC 29208, USA
|
17
|
Abstract
Overlapping genes are two protein-coding sequences sharing a significant part of the same DNA locus in different reading frames. Although an increasing number of examples have recently been found in bacteria, the underlying mechanisms of their evolution are unknown. In this work we explore how selective pressure in a protein-coding sequence influences its overlapping genes in alternative reading frames. We model evolution using a time-continuous Markov process and derive the corresponding model for the remaining frames to quantify selection pressure and genetic noise. Our findings lead to the presumption that, once information is embedded in the reverse reading frame −2 (relative to the mother gene in +1), purifying selection in the protein-coding reading frame automatically protects the sequences in both frames. We also found that this coincides with the fact that the genetic noise measured using the conditional entropy is minimal in frame −2 under selection in the coding frame.
Affiliation(s)
- Katharina Mir
  - Institute of Communications Engineering, Ulm University, Ulm, Germany
- Steffen Schober
  - Institute of Communications Engineering, Ulm University, Ulm, Germany
|
18
|
Bailey AL, Lauck M, Weiler A, Sibley SD, Dinis JM, Bergman Z, Nelson CW, Correll M, Gleicher M, Hyeroba D, Tumukunde A, Weny G, Chapman C, Kuhn JH, Hughes AL, Friedrich TC, Goldberg TL, O'Connor DH. High genetic diversity and adaptive potential of two simian hemorrhagic fever viruses in a wild primate population. PLoS One 2014; 9:e90714. [PMID: 24651479 PMCID: PMC3961216 DOI: 10.1371/journal.pone.0090714] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 12/03/2013] [Accepted: 02/03/2014] [Indexed: 12/20/2022]
Abstract
Key biological properties such as high genetic diversity and high evolutionary rate enhance the potential of certain RNA viruses to adapt and emerge. Identifying viruses with these properties in their natural hosts could dramatically improve disease forecasting and surveillance. Recently, we discovered two novel members of the viral family Arteriviridae: simian hemorrhagic fever virus (SHFV)-krc1 and SHFV-krc2, infecting a single wild red colobus (Procolobus rufomitratus tephrosceles) in Kibale National Park, Uganda. Nearly nothing is known about the biological properties of SHFVs in nature, although the SHFV type strain, SHFV-LVR, has caused devastating outbreaks of viral hemorrhagic fever in captive macaques. Here we detected SHFV-krc1 and SHFV-krc2 in 40% and 47% of 60 wild red colobus tested, respectively. We found viral loads in excess of 10⁶–10⁷ RNA copies per milliliter of blood plasma for each of these viruses. SHFV-krc1 and SHFV-krc2 also showed high genetic diversity at both the inter- and intra-host levels. Analyses of synonymous and non-synonymous nucleotide diversity across viral genomes revealed patterns suggestive of positive selection in SHFV open reading frames (ORF) 5 (SHFV-krc2 only) and 7 (SHFV-krc1 and SHFV-krc2). Thus, these viruses share several important properties with some of the most rapidly evolving, emergent RNA viruses.
Affiliation(s)
- Adam L. Bailey
  - Department of Pathology and Laboratory Medicine, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
  - Wisconsin National Primate Research Center, Madison, Wisconsin, United States of America
- Michael Lauck
  - Department of Pathology and Laboratory Medicine, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
  - Wisconsin National Primate Research Center, Madison, Wisconsin, United States of America
- Andrea Weiler
  - Wisconsin National Primate Research Center, Madison, Wisconsin, United States of America
  - Department of Pathobiological Sciences, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Samuel D. Sibley
  - Wisconsin National Primate Research Center, Madison, Wisconsin, United States of America
  - Department of Pathobiological Sciences, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Jorge M. Dinis
  - Department of Pathobiological Sciences, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Zachary Bergman
  - Wisconsin National Primate Research Center, Madison, Wisconsin, United States of America
  - Department of Pathobiological Sciences, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Chase W. Nelson
  - Department of Biological Sciences, University of South Carolina, Columbia, South Carolina, United States of America
- Michael Correll
  - Department of Computer Sciences, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Michael Gleicher
  - Department of Computer Sciences, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Colin Chapman
  - Makerere University, Kampala, Uganda
  - Department of Anthropology and School of Environment, McGill University, Montreal, Quebec, Canada
- Jens H. Kuhn
  - Integrated Research Facility at Fort Detrick, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Fort Detrick, Frederick, Maryland, United States of America
- Austin L. Hughes
  - Department of Biological Sciences, University of South Carolina, Columbia, South Carolina, United States of America
- Thomas C. Friedrich
  - Wisconsin National Primate Research Center, Madison, Wisconsin, United States of America
  - Department of Pathobiological Sciences, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Tony L. Goldberg
  - Wisconsin National Primate Research Center, Madison, Wisconsin, United States of America
  - Department of Pathobiological Sciences, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- David H. O'Connor
  - Department of Pathology and Laboratory Medicine, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
  - Wisconsin National Primate Research Center, Madison, Wisconsin, United States of America
|
19
|
Lo MK, Søgaard TM, Karlin DG. Evolution and structural organization of the C proteins of paramyxovirinae. PLoS One 2014; 9:e90003. [PMID: 24587180 PMCID: PMC3934983 DOI: 10.1371/journal.pone.0090003] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 12/19/2013] [Accepted: 01/24/2014] [Indexed: 12/21/2022]
Abstract
The phosphoprotein (P) gene of most Paramyxovirinae encodes several proteins in overlapping frames: P and V, which share a common N-terminus (PNT), and C, which overlaps PNT. Overlapping genes are of particular interest because they encode proteins originated de novo, some of which have unknown structural folds, challenging the notion that nature utilizes only a limited, well-mapped area of fold space. The C proteins cluster in three groups, comprising measles, Nipah, and Sendai virus. We predicted that all C proteins have a similar organization: a variable, disordered N-terminus and a conserved, α-helical C-terminus. We confirmed this predicted organization by biophysically characterizing recombinant C proteins from Tupaia paramyxovirus (measles group) and human parainfluenza virus 1 (Sendai group). We also found that the C of the measles and Nipah groups have statistically significant sequence similarity, indicating a common origin. Although the C of the Sendai group lack sequence similarity with them, we speculate that they also have a common origin, given their similar genomic location and structural organization. Since C is dispensable for viral replication, unlike PNT, we hypothesize that C may have originated de novo by overprinting PNT in the ancestor of Paramyxovirinae. Intriguingly, in measles virus and Nipah virus, PNT encodes STAT1-binding sites that overlap different regions of the C-terminus of C, indicating they have probably originated independently. This arrangement, in which the same genetic region encodes simultaneously a crucial functional motif (a STAT1-binding site) and a highly constrained region (the C-terminus of C), seems paradoxical, since it should severely reduce the ability of the virus to adapt. The fact that it originated twice suggests that it must be balanced by an evolutionary advantage, perhaps from reducing the size of the genetic region vulnerable to mutations.
Affiliation(s)
- Michael K. Lo
  - Centers for Disease Control and Prevention, Viral Special Pathogens Branch, Atlanta, Georgia, United States of America
- Teit Max Søgaard
  - Division of Structural Biology, Oxford University, Oxford, United Kingdom
- David G. Karlin
  - Division of Structural Biology, Oxford University, Oxford, United Kingdom
  - Department of Zoology, University of Oxford, Oxford, United Kingdom
|
20
|
Simon-Loriere E, Holmes EC, Pagán I. The effect of gene overlapping on the rate of RNA virus evolution. Mol Biol Evol 2013; 30:1916-28. [PMID: 23686658 DOI: 10.1093/molbev/mst094] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Indexed: 01/19/2023]
Abstract
Gene overlapping is widely employed by RNA viruses to generate genetic novelty while retaining a small genome size. However, gene overlapping also increases the deleterious effect of mutations as they affect more than one gene, thereby reducing the evolutionary rate of RNA viruses and hence their adaptive capacity. Although there is general agreement on the benefits of gene overlapping as a mechanism of genomic compression for rapidly evolving organisms, its effect on the pace of RNA virus evolution remains a source of debate. To address this issue, we collected sequence data from 117 instances of gene overlapping across 19 families, 30 genera, and 55 species of RNA viruses. On these data, we analyzed how genetic distances, selective pressures, and the distribution of RNA secondary structures and conserved protein functional domains vary between overlapping (OV) and nonoverlapping (NOV) regions. We show that gene overlapping generally results in a decrease in the rate of RNA virus evolution through a reduction in the frequency of synonymous mutations. However, this effect is less pronounced in genes with a terminal rather than an internal gene overlap, which might result from a greater proportion of protein functional conserved domains in NOV than in OV regions, in turn reducing the number of nonsynonymous mutations in the former. Overall, our analyses clarify the role of gene overlapping as a modulator of the evolutionary rates exhibited by RNA viruses and shed light on the factors that shape the genetic diversity of this important group of pathogens.
Affiliation(s)
- Etienne Simon-Loriere
  - Institut Pasteur, Unité de Génétique Fonctionnelle des Maladies Infectieuses, Paris, France
|
21
|
Chen P, Gan Y, Han N, Fang W, Li J, Zhao F, Hu K, Rayner S. Computational evolutionary analysis of the overlapped surface (S) and polymerase (P) region in hepatitis B virus indicates the spacer domain in P is crucial for survival. PLoS One 2013; 8:e60098. [PMID: 23577084 PMCID: PMC3618453 DOI: 10.1371/journal.pone.0060098] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 07/30/2012] [Accepted: 02/23/2013] [Indexed: 12/21/2022]
Abstract
Introduction: The Hepatitis B Virus (HBV) genome contains four ORFs, S (surface), P (polymerase), C (core) and X. S is completely overlapped by P and as a consequence the overlapping region is subject to distinctive evolutionary constraints compared to the remainder of the genome. Specifically, a non-synonymous substitution in one coding frame may produce a synonymous substitution in the alternative frame, suggesting a possible conflict between requirements for diversifying and purifying forces. To examine how these contrasting requirements are balanced within this region, we investigated the relationship amongst positive selection sites, conserved regions, epitopes and elements of protein structure to consider how HBV balances the contrasting evolutionary pressures. Methodology/Results: 323 HBV genotype D genome sequences were collected and analyzed to identify sites under positive selection and highly conserved regions. Epitope sequences were retrieved from previously published experimental studies stored in the Immune Epitope Database. Predicted secondary structures were used to investigate the association between structure and conservation. Entropy was used as a measure of conservation and bivariate logistic regression was used to investigate the relationship between positive selection/conserved sites and epitope/secondary structure regions. Our results indicate: (i) conservation in S is primarily dictated by α-helix elements in the protein structure, (ii) variable residues are mainly located in PreS, the major hydrophilic region (MHR) and the C-terminus, (iii) epitopes in S, which are directly targeted by the host immune system, are significantly associated with sites under positive selection. Conclusions: The highly variable spacer domain in P, which corresponds to PreS in S, appears to act as a harbor for the accumulation of mutations that can provide flexibility for conformational changes and responding to immune pressure.
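The abstract states only that entropy was used as a measure of conservation. One common way to do that, assumed here for illustration and not confirmed as the authors' exact formula, is the Shannon entropy of each alignment column: 0 bits for an invariant column, rising as residues become more evenly mixed. Function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gap characters."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def alignment_entropy(sequences):
    """Per-column entropy for equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*sequences)]


# Hypothetical toy alignment: column 1 is conserved, column 2 is variable.
print(alignment_entropy(["AC", "AG", "AC", "AG"]))
```

Low-entropy columns would be read as conserved positions and high-entropy columns as variable ones.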
Affiliation(s)
- Ping Chen
  - Key Laboratory of Agricultural and Environmental Microbiology, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
  - State Key Laboratory of Virology, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
- Yun Gan
  - State Key Laboratory of Virology, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
- Na Han
  - Key Laboratory of Agricultural and Environmental Microbiology, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
- Wei Fang
  - Key Laboratory of Agricultural and Environmental Microbiology, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
- Jiafu Li
  - Department of Obstetrics and Gynecology, Zhongnan Hospital of Wuhan University, Wuhan, China
- Fei Zhao
  - State Key Laboratory of Virology, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
- Kanghong Hu
  - State Key Laboratory of Virology, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
  - Biomedical Center, Hubei University of Technology, Wuhan, China
  - * E-mail: (SR); (KH)
- Simon Rayner
  - Key Laboratory of Agricultural and Environmental Microbiology, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
  - * E-mail: (SR); (KH)
|
22
|
Torres C, Fernández MDB, Flichman DM, Campos RH, Mbayed VA. Influence of overlapping genes on the evolution of human hepatitis B virus. Virology 2013; 441:40-8. [PMID: 23541083 DOI: 10.1016/j.virol.2013.02.027] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 01/08/2013] [Revised: 02/05/2013] [Accepted: 02/28/2013] [Indexed: 12/23/2022]
Abstract
The aim of this work was to analyse the influence of overlapping genes on the evolution of hepatitis B virus (HBV). A differential evolutionary behaviour among genetic regions and clinical status was found. Dissimilar levels of conservation of the different protein regions could derive from alternative mechanisms to maintain functionality. We propose that, in overlapping regions, selective constraints on one of the genes could drive the substitution process. This would allow protein conservation in one gene by synonymous substitutions while mechanisms of tolerance to the change operate in the overlapping gene (e.g. usage of amino acids with high-degeneracy codons, differential codon usage and replacement by physicochemically similar amino acids). In addition, differential selection pressure according to the HBeAg status was found in all genes, suggesting that the immune response could be one of the factors that would constrain viral replication by interacting with different HBV proteins during the HBeAg(-) stage.
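The proposed mechanism, conservation of one protein through synonymous substitutions while the overlapping gene tolerates the change, rests on the fact that one nucleotide change can be silent in one reading frame and amino-acid-altering in another. The sketch below illustrates that with the standard genetic code; the nine-nucleotide sequence, the function names, and the choice of frames 0 and +1 are hypothetical and are not taken from HBV.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame):
    """Translate the complete codons of seq starting at offset frame."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def effect_per_frame(seq, pos, new_base, frames=(0, 1)):
    """Classify a point substitution in each frame.

    A change outside every complete codon of a frame is reported as synonymous.
    """
    mutant = seq[:pos] + new_base + seq[pos + 1:]
    return {
        f: "synonymous" if translate(seq, f) == translate(mutant, f)
        else "nonsynonymous"
        for f in frames
    }


# Hypothetical overlap: CTT -> CTC keeps Leu in frame 0,
# while TTG -> TCG changes Leu to Ser in frame +1.
print(effect_per_frame("ATGCTTGAA", 5, "C"))
```

In this toy case the frame-0 protein is unchanged while the frame +1 protein absorbs the substitution, which is the asymmetry the abstract describes.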
Affiliation(s)
- Carolina Torres
  - Cátedra de Virología, Facultad de Farmacia y Bioquímica, Universidad de Buenos Aires, Ciudad Autónoma de Buenos Aires, Argentina; CONICET, Argentina
|
23
|
Genome dynamics in three different geographical isolates of white spot syndrome virus (WSSV). Arch Virol 2012; 157:2357-62. [PMID: 22836599 DOI: 10.1007/s00705-012-1395-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 02/23/2012] [Accepted: 05/21/2012] [Indexed: 10/28/2022]
Abstract
White spot syndrome virus (WSSV), the sole member of the monotypic family Nimaviridae, is considered an extremely lethal shrimp pathogen. Despite its impact, some essential biological characteristics related to WSSV genome dynamics, such as the synonymous codon usage pattern and selection pressure in genes, remain to be elucidated. The results show that compositional limitations and mutational pressure determine the codon usage bias and base composition in WSSV. Furthermore, different forces of selective pressure are acting across various regions of the WSSV genome. Finally, this study points out the possible occurrence of two major recombination events.
|
24
|
Sabath N, Wagner A, Karlin D. Evolution of viral proteins originated de novo by overprinting. Mol Biol Evol 2012; 29:3767-80. [PMID: 22821011 PMCID: PMC3494269 DOI: 10.1093/molbev/mss179] [Citation(s) in RCA: 104] [Impact Index Per Article: 8.7] [Indexed: 01/02/2023]
Abstract
New protein-coding genes can originate either through modification of existing genes or de novo. Recently, the importance of de novo origination has been recognized in eukaryotes, although eukaryotic genes originated de novo are relatively rare and difficult to identify. In contrast, viruses contain many de novo genes, namely those in which an existing gene has been “overprinted” by a new open reading frame, a process that generates a new protein-coding gene overlapping the ancestral gene. We analyzed the evolution of 12 experimentally validated viral genes that originated de novo and estimated their relative ages. We found that young de novo genes have a different codon usage from the rest of the genome. They evolve rapidly and are under positive or weak purifying selection. Thus, young de novo genes might have strain-specific functions, or no function, and would be difficult to detect using current genome annotation methods that rely on the sequence signature of purifying selection. In contrast to young de novo genes, older de novo genes have a codon usage that is similar to the rest of the genome. They evolve slowly and are under stronger purifying selection. Some of the oldest de novo genes evolve under stronger selection pressure than the ancestral gene they overlap, suggesting an evolutionary tug of war between the ancestral and the de novo gene.
Affiliation(s)
- Niv Sabath
  - Institute of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
|
25
|
Tomás G, Hernández M, Marandino A, Panzera Y, Maya L, Hernández D, Pereda A, Banda A, Villegas P, Aguirre S, Pérez R. Development and validation of a TaqMan-MGB real-time RT-PCR assay for simultaneous detection and characterization of infectious bursal disease virus. J Virol Methods 2012; 185:101-7. [PMID: 22728272 DOI: 10.1016/j.jviromet.2012.06.012] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 02/09/2012] [Revised: 05/26/2012] [Accepted: 06/13/2012] [Indexed: 10/28/2022]
Abstract
Rapid and reliable detection and classification of infectious bursal disease viruses (IBDVs) is of crucial importance for disease surveillance and control. This study presents the development and validation of a real-time RT-PCR assay to detect and discriminate very virulent (vv) from non-vv (classic and variant) IBDV strains. The assay uses two fluorogenic, minor groove-binding (MGB) TaqMan probes targeted to a single nucleotide polymorphism (SNP) embedded in a highly conserved genomic region. The analytical sensitivity of the assay was determined using serial dilutions of in vitro-transcribed RNA. The assay demonstrated a wide dynamic range between 10² and 10⁸ standard RNA copies per reaction. Good reproducibility was also detected, with intra- and inter-assay coefficients of variation ranging from 0.13% to 2.23% and 0.26% to 1.92%, respectively. The assay detected successfully all the assessed vv, classical, and variant field and vaccine strains and correctly discriminated all vvIBDV strains from non-vvIBDV strains. Other common avian RNA viruses tested negative, indicating high specificity of the assay. The high sensitivity, rapidity, reproducibility, and specificity of the real-time RT-PCR assay make this method suitable for general and genotype-specific detection and quantitation.
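The reproducibility figures above are coefficients of variation, that is, the standard deviation of replicate measurements expressed as a percentage of their mean. A minimal sketch of that calculation follows. The abstract does not say which quantity was replicated or whether a sample or population standard deviation was used, so the sample standard deviation and the threshold-cycle values here are assumptions.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical replicate threshold-cycle values from a single run.
replicates = [24.9, 25.1, 25.0]
print(f"{coefficient_of_variation(replicates):.2f}%")
```

Intra-assay variation would use replicates within one run, and inter-assay variation would use the same sample measured across separate runs.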
Affiliation(s)
- Gonzalo Tomás
  - Sección Genética Evolutiva, Instituto de Biología, Facultad de Ciencias, Universidad de la República, Iguá 4225, 11400 Montevideo, Uruguay
|
26
|
An overlapping genetic code for frameshifted overlapping genes in Drosophila mitochondria: Antisense antitermination tRNAs UAR insert serine. J Theor Biol 2012; 298:51-76. [DOI: 10.1016/j.jtbi.2011.12.026] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Received: 11/20/2010] [Revised: 12/19/2011] [Accepted: 12/22/2011] [Indexed: 01/27/2023]
|
27
|
Allard SD, de Goede AL, De Keersmaecker B, Heirman C, Lacor P, Osterhaus ADME, Demanet C, Thielemans K, Gruters RA, Aerts JL. Sequence evolution and escape from specific immune pressure of an HIV-1 Rev epitope with extensive sequence similarity to human nucleolar protein 6. Tissue Antigens 2012; 79:174-85. [DOI: 10.1111/j.1399-0039.2012.01837.x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/27/2022]
|
28
|
Sabath N, Morris JS, Graur D. Is there a twelfth protein-coding gene in the genome of influenza A? A selection-based approach to the detection of overlapping genes in closely related sequences. J Mol Evol 2011; 73:305-15. [PMID: 22187135 DOI: 10.1007/s00239-011-9477-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 03/30/2011] [Accepted: 12/02/2011] [Indexed: 02/06/2023]
Abstract
Protein-coding genes often contain long overlapping open-reading frames (ORFs), which may or may not be functional. Current methods that utilize the signature of purifying selection to detect functional overlapping genes are limited to the analysis of sequences from divergent species, thus rendering them inapplicable to genes found only in closely related sequences. Here, we present a method for the detection of selection signatures on overlapping reading frames by using closely related sequences, and apply the method to several known overlapping genes, and to an overlapping ORF on the negative strand of segment 8 of influenza A virus (NEG8), which has been suggested to be functional. We find no evidence that NEG8 is under selection, suggesting that the intact reading frame might be non-functional, although we cannot fully exclude the possibility that the method is not sensitive enough to detect the signature of selection acting on this gene. We present the limitations of the method using known overlapping genes and suggest several approaches to improve it in future studies. Finally, we examine alternative explanations for the sequence conservation of NEG8 in the absence of selection. We show that overlap type and genomic context affect the conservation of intact overlapping ORFs and should therefore be considered in any attempt to estimate the signature of selection in overlapping genes.
Affiliation(s)
- Niv Sabath
  - Institute of Evolutionary Biology and Environmental Studies, University of Zurich, 8057 Zurich, Switzerland
|
29
|
Successful COG8 and PDF overlap is mediated by alterations in splicing and polyadenylation signals. Hum Genet 2011; 131:265-74. [PMID: 21805148 DOI: 10.1007/s00439-011-1075-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 05/23/2011] [Accepted: 07/19/2011] [Indexed: 01/21/2023]
Abstract
Although gene-free areas compose the great majority of eukaryotic genomes, a significant fraction of genes overlap, i.e., unique nucleotide sequences are part of more than one transcription unit. In this work, the evolutionary history and origin of a same-strand gene overlap is dissected through the analysis of COG8 (component of oligomeric Golgi complex 8) and PDF (peptide deformylase). Comparative genomic surveys reveal that the relative locations of these two genes have been changing over the last 445 million years from distinct chromosomal locations in fish to overlapping in rodents and primates, indicating that the overlap between these genes precedes their divergence. The overlap between the two genes was initiated by the gain of a novel splice donor site between the COG8 stop codon and PDF initiation codon. Splicing is accomplished by the use of the PDF acceptor, leading COG8 to share the 3' end with PDF. In primates, loss of the ancestral polyadenylation signal for COG8 makes the overlap between COG8 and PDF mandatory, while in mouse and rat concurrent overlapping and non-overlapping Cog8 transcripts exist. Altogether, we demonstrate that the origin, evolution and preservation of the COG8/PDF same-strand overlap follow similar mechanistic steps as those documented for antisense overlaps, where gain and/or loss of splice sites and polyadenylation signals seems to drive the process.
|
30
|
Recombinational histories of avian infectious bronchitis virus and turkey coronavirus. Arch Virol 2011; 156:1823-9. [PMID: 21744259 PMCID: PMC7086623 DOI: 10.1007/s00705-011-1061-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 04/08/2011] [Accepted: 06/25/2011] [Indexed: 11/29/2022]
Abstract
Phylogenetic analysis of complete genomes of the avian coronaviruses avian infectious bronchitis virus (AIBV) and turkey coronavirus (TCoV) supported the hypothesis that numerous recombination events have occurred between these viruses. Although the two groups of viruses differed markedly in the sequence of the spike protein, the gene (S) encoding this protein showed no evidence of positive selection or of an elevated mutation rate. Rather, the data suggested that recombination events have homogenized the portions of the genome other than the S gene between the two groups of viruses, while continuing to maintain the two distinct, anciently diverged versions of the S gene. The latter hypothesis was supported by a phylogeny of S proteins from representative coronaviruses, in which S proteins of AIBV and TCoV fell in the same clade.
|
31
|
Immune-induced evolutionary selection focused on a single reading frame in overlapping hepatitis B virus proteins. J Virol 2011; 85:4558-66. [PMID: 21307195 DOI: 10.1128/jvi.02142-10] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Indexed: 12/13/2022] Open
Abstract
Viruses employ various means to evade immune detection. Reduction of CD8(+) T cell epitopes is one of the common strategies used for this purpose. Hepatitis B virus (HBV), a member of the Hepadnaviridae family, has four open reading frames, with about 50% overlap between the genes they encode. We computed the CD8(+) T cell epitope density within HBV proteins and the mutations within the epitopes. Our results suggest that HBV accumulates escape mutations that reduce the number of epitopes. These mutations are not equally distributed among genes and reading frames. While the highly expressed core and X proteins are selected to have low epitope density, polymerase, which is expressed at low levels, does not undergo the same selection. In overlapping regions, mutations in one protein-coding sequence also affect the other protein-coding sequence. We show that mutations lead to the removal of epitopes in X and surface proteins even at the expense of the addition of epitopes in polymerase. The total escape mutation rate for overlapping regions is lower than that for nonoverlapping regions. The lower epitope replacement rate for overlapping regions slows the evolutionary escape rate of these regions but leads to the accumulation of mutations more robust in the transfer between hosts, such as mutations preventing proteasomal cleavage into epitopes.
|
32
|
Paul S, Piontkivska H. Frequent associations between CTL and T-Helper epitopes in HIV-1 genomes and implications for multi-epitope vaccine designs. BMC Microbiol 2010; 10:212. [PMID: 20696039 PMCID: PMC2924856 DOI: 10.1186/1471-2180-10-212] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 05/13/2010] [Accepted: 08/09/2010] [Indexed: 01/17/2023] Open
Abstract
BACKGROUND: Epitope vaccines have been suggested as a strategy to counteract viral escape and development of drug resistance. Multiple studies have shown that Cytotoxic T-Lymphocyte (CTL) and T-Helper (Th) epitopes can generate strong immune responses in Human Immunodeficiency Virus (HIV-1). However, not much is known about the relationship among different types of HIV epitopes, particularly those epitopes that can be considered potential candidates for inclusion in multi-epitope vaccines. RESULTS: In this study we used association rule mining to examine the relationship between different types of epitopes (CTL, Th and antibody epitopes) from nine protein-coding HIV-1 genes to identify strong associations as potent multi-epitope vaccine candidates. Our results revealed 137 association rules that were consistently present in the majority of reference and non-reference HIV-1 genomes and included epitopes of two different types (CTL and Th) from three different genes (Gag, Pol and Nef). These rules involved 14 non-overlapping epitope regions that frequently co-occurred despite high mutation and recombination rates, including in genomes of circulating recombinant forms. These epitope regions were also highly conserved at both the amino acid and nucleotide levels, indicating strong purifying selection driven by functional and/or structural constraints and hence the diminished likelihood of successful escape mutations. CONCLUSIONS: Our results provide a comprehensive systematic survey of CTL, Th and Ab epitopes that are both highly conserved and co-occur among all subtypes of HIV-1, including circulating recombinant forms. Several co-occurring epitope combinations were identified as potent candidates for inclusion in multi-epitope vaccines, including epitopes that are immuno-responsive to different arms of the host immune machinery and can enable stronger and more efficient immune responses, similar to responses achieved with adjuvant therapies. The signature of strong purifying selection acting at the nucleotide level of the associated epitopes indicates that these regions are functionally critical, although the exact reasons behind such sequence conservation remain to be elucidated.
Affiliation(s)
- Sinu Paul
- Department of Biological Sciences, Kent State University, Kent, Ohio 44242, USA
|
33
|
Pagán I, Holmes EC. Long-term evolution of the Luteoviridae: time scale and mode of virus speciation. J Virol 2010; 84:6177-87. [PMID: 20375155 PMCID: PMC2876656 DOI: 10.1128/jvi.02160-09] [Citation(s) in RCA: 87] [Impact Index Per Article: 6.2] [Received: 10/12/2009] [Accepted: 03/31/2010] [Indexed: 12/20/2022] Open
Abstract
Despite their importance as agents of emerging disease, the time scale and evolutionary processes that shape the appearance of new viral species are largely unknown. To address these issues, we analyzed intra- and interspecific evolutionary processes in the Luteoviridae family of plant RNA viruses. Using the coat protein gene of 12 members of the family, we determined their phylogenetic relationships, rates of nucleotide substitution, times to common ancestry, and patterns of speciation. An associated multigene analysis enabled us to infer the nature of selection pressures and the genomic distribution of recombination events. Although rates of evolutionary change and selection pressures varied among genes and species and were lower in some overlapping gene regions, all fell within the range of those seen in animal RNA viruses. Recombination breakpoints were commonly observed at gene boundaries but less so within genes. Our molecular clock analysis suggested that the origin of the currently circulating Luteoviridae species occurred within the last 4 millennia, with intraspecific genetic diversity arising within the last few hundred years. Speciation within the Luteoviridae may therefore be associated with the expansion of agricultural systems. Finally, our phylogenetic analysis suggested that viral speciation events tended to occur within the same plant host species and country of origin, as expected if speciation is largely sympatric, rather than allopatric, in nature.
Affiliation(s)
- Israel Pagán
- Center for Infectious Disease Dynamics, Department of Biology, The Pennsylvania State University, University Park, PA 16802, USA.
|
34
|
Sequence variability and evolution of the terminal overlapping VP5 gene of the infectious bursal disease virus. Virus Genes 2010; 41:59-66. [DOI: 10.1007/s11262-010-0485-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 11/01/2009] [Accepted: 04/15/2010] [Indexed: 10/19/2022]
|
35
|
Hughes AL, O'Connor S, Dudley DM, Burwitz BJ, Bimber BN, O'Connor D. Dynamics of haplotype frequency change in a CD8+TL epitope of simian immunodeficiency virus. Infect Genet Evol 2010; 10:555-60. [PMID: 20149896 DOI: 10.1016/j.meegid.2010.02.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 11/25/2009] [Revised: 02/01/2010] [Accepted: 02/03/2010] [Indexed: 10/19/2022]
Abstract
Deep pyrosequencing of a CD8+TL epitope from the Tat protein of simian immunodeficiency virus (SIV) from four infected rhesus macaques carrying the restricting MHC allele (Mamu-A*01) for that epitope revealed that natural selection favoring escape mutations led to an increase in the frequency of haplotypes in the epitope region that differed from the inoculum. After 20 weeks of infection, a new sequence haplotype in the epitope region had increased to a frequency greater than 50% in each of the four monkeys (range 57.9-98.9%), but the predominant haplotype was not the same in all four monkeys. Thus, even under strong selection favoring escape from CD8+TL recognition, the random nature of mutation itself is the primary factor affecting which escape mutation is likely to become predominant within an individual host. The relationship between the frequency of the inoculum haplotype in the epitope region and time post-infection approximated a simple hyperbola. On this assumption, the expected ratio of the frequencies of the inoculum haplotype at two times t_1 and t_2, f_i(t_2)/f_i(t_1), will be given by t_1/t_2. Because standard phylogenetic methods for reconstructing ancestral sequences failed to predict the inoculum sequence correctly, we used this relationship to predict the inoculum sequence with 100% accuracy, given data on haplotype frequencies at different time periods.
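The hyperbolic relationship described in this abstract can be illustrated numerically. The following is a minimal sketch, assuming only that the inoculum haplotype frequency follows f(t) = k/t for some constant k, so that f(t2)/f(t1) = t1/t2. The function name and the example numbers are hypothetical and are not taken from the study.

```python
def predict_frequency(f_t1: float, t1: float, t2: float) -> float:
    """Predict the inoculum haplotype frequency at time t2 from its value at t1.

    Assumes a simple hyperbola f(t) = k / t, which gives f(t2) / f(t1) = t1 / t2.
    """
    if t1 <= 0 or t2 <= 0:
        raise ValueError("times must be positive")
    return f_t1 * t1 / t2


# Hypothetical illustration (not study data): a frequency of 0.40 at week 5,
# extrapolated to week 20 under the hyperbolic assumption.
f20 = predict_frequency(0.40, t1=5, t2=20)
print(round(f20, 3))
```

Under this assumption, quadrupling the time since infection reduces the expected inoculum haplotype frequency to one quarter of its earlier value.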
Affiliation(s)
- Austin L Hughes
- Department of Biological Sciences, University of South Carolina, Columbia, SC 29208, USA.
|
36
|
Liang JW, Tian FL, Lan ZR, Huang B, Zhuang WZ. Selection characterization on overlapping reading frame of multiple-protein-encoding P gene in Newcastle disease virus. Vet Microbiol 2009; 144:257-63. [PMID: 20079581 DOI: 10.1016/j.vetmic.2009.12.029] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 09/15/2009] [Accepted: 12/21/2009] [Indexed: 01/08/2023]
Abstract
The aim of this study was to characterize the molecular evolution of P and V protein genes of the Newcastle disease virus (NDV). The P gene sequences of 55 NDV isolates, representing different chronological and geographic origins, were obtained from GenBank. In this paper, the evolution of the specific regions of the NDV P gene, encoding the P and V proteins, was analyzed. The nucleotides from the shared P/V region encoded the co-amino terminus of the two proteins, while the P-V/V-P region was respectively encoded by the nucleotides within the P ORF or the V ORF in the common sequence (after the mRNA editing site). In addition, the P-cut region exclusively encoded the P protein. Finally, the P-V and V-P regions were further broken down into P1 and P2 fragments with the corresponding V1 and V2 fragments. In the P gene, the P-cut portion corresponding to the C-terminus of the P protein was the most highly conserved, while the P-V region was the most variable. This was interpreted as a lower constraint for function in the common sequence than in the unique P sequence that is known to contain an important function. Interestingly, in the common P-V/V-P function, variability of V1 was compensated by a higher conservation of the corresponding P1, and conversely for the P2/V2, which suggested that the flexibility of one ORF with less function served the purpose of allowing positive selection in the other overlapping ORF that exhibited more function.
Affiliation(s)
- Jun-Wen Liang
- College of Life Science, Shandong Normal University, Wenhua East Road, Shandong Province, Jinan 250014, China
|
37
|
Overlapping genes produce proteins with unusual sequence properties and offer insight into de novo protein creation. J Virol 2009; 83:10719-36. [PMID: 19640978 DOI: 10.1128/jvi.00595-09] [Citation(s) in RCA: 141] [Impact Index Per Article: 9.4] [Indexed: 11/20/2022] Open
Abstract
It is widely assumed that new proteins are created by duplication, fusion, or fission of existing coding sequences. Another mechanism of protein birth is provided by overlapping genes. They are created de novo by mutations within a coding sequence that lead to the expression of a novel protein in another reading frame, a process called "overprinting." To investigate this mechanism, we have analyzed the sequences of the protein products of manually curated overlapping genes from 43 genera of unspliced RNA viruses infecting eukaryotes. Overlapping proteins have a sequence composition globally biased toward disorder-promoting amino acids and are predicted to contain significantly more structural disorder than nonoverlapping proteins. By analyzing the phylogenetic distribution of overlapping proteins, we were able to confirm that 17 of these had been created de novo and to study them individually. Most proteins created de novo are orphans (i.e., restricted to one species or genus). Almost all are accessory proteins that play a role in viral pathogenicity or spread, rather than proteins central to viral replication or structure. Most proteins created de novo are predicted to be fully disordered and have a highly unusual sequence composition. This suggests that some viral overlapping reading frames encoding hypothetical proteins with highly biased composition, often discarded as noncoding, might in fact encode proteins. Some proteins created de novo are predicted to be ordered, however, and whenever a three-dimensional structure of such a protein has been solved, it corresponds to a fold previously unobserved, suggesting that the study of these proteins could enhance our knowledge of protein space.
|
38
|
Sabath N, Landan G, Graur D. A method for the simultaneous estimation of selection intensities in overlapping genes. PLoS One 2008; 3:e3996. [PMID: 19098983 PMCID: PMC2601044 DOI: 10.1371/journal.pone.0003996] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Received: 09/22/2008] [Accepted: 11/21/2008] [Indexed: 11/18/2022] Open
Abstract
Inferring the intensity of positive selection in protein-coding genes is important since it is used to shed light on the process of adaptation. Recently, it has been reported that overlapping genes, which are ubiquitous in all domains of life, seem to exhibit inordinate degrees of positive selection. Here, we present a new method for the simultaneous estimation of selection intensities in overlapping genes. We show that the appearance of positive selection is caused by assuming that selection operates independently on each gene in an overlapping pair, thereby ignoring the unique evolutionary constraints on overlapping coding regions. Our method uses an exact evolutionary model, thereby obviating the need for approximation or intensive computation. We test the method by simulating the evolution of overlapping genes of different types as well as under diverse evolutionary scenarios. Our results indicate that the independent estimation approach leads to the false appearance of positive selection even though the gene is in reality subject to negative selection. Finally, we use our method to estimate selection in two influenza A genes for which positive selection was previously inferred. We find no evidence for positive selection in either case.
Affiliation(s)
- Niv Sabath
- Department of Biology and Biochemistry, University of Houston, Houston, Texas, United States of America.
|
39
|
Cooke JN, Westover KM. Serotype-specific differences in antigenic regions of foot-and-mouth disease virus (FMDV): A comprehensive statistical analysis. Infect Genet Evol 2008; 8:855-63. [DOI: 10.1016/j.meegid.2008.08.004] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 05/23/2008] [Revised: 08/11/2008] [Accepted: 08/15/2008] [Indexed: 10/21/2022]
|
40
|
Rokas A, Carroll SB. Frequent and widespread parallel evolution of protein sequences. Mol Biol Evol 2008; 25:1943-53. [PMID: 18583353 DOI: 10.1093/molbev/msn143] [Citation(s) in RCA: 85] [Impact Index Per Article: 5.3] [Indexed: 12/25/2022] Open
Abstract
Understanding the patterns and causes of protein sequence evolution is a major challenge in evolutionary biology. One of the critical unresolved issues is the relative contribution of selection and genetic drift to the fixation of amino acid sequence differences between species. Molecular homoplasy, the independent evolution of the same amino acids at orthologous sites in different taxa, is one potential signature of selection; however, relatively little is known about its prevalence in eukaryotic proteomes. To quantify the extent and type of homoplasy among evolving proteins, we used phylogenetic methodology to analyze 8 genome-scale data matrices from clades of different evolutionary depths that span the eukaryotic tree of life. We found that the frequency of homoplastic amino acid substitutions in eukaryotic proteins was more than 2-fold higher than expected under neutral models of protein evolution. The overwhelming majority of homoplastic substitutions were parallelisms that involved the most frequently exchanged amino acids with similar physicochemical properties and that could be reached by a single-mutational step. We conclude that the role of homoplasy in shaping the protein record is much larger than generally assumed, and we suggest that its high frequency can be explained by both weak positive selection for certain substitutions and purifying selection that constrains substitutions to a small number of functionally equivalent amino acids.
Affiliation(s)
- Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, USA
|
41
|
Soares AER, Soares MA, Schrago CG. Positive selection on HIV accessory proteins and the analysis of molecular adaptation after interspecies transmission. J Mol Evol 2008; 66:598-604. [PMID: 18465165 DOI: 10.1007/s00239-008-9112-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 11/30/2007] [Revised: 04/22/2008] [Accepted: 04/23/2008] [Indexed: 10/22/2022]
Abstract
Studies examining positive selection on accessory proteins of HIV are rare, although these proteins play an important role in pathogenesis in vivo. Moreover, despite the biological relevance of analyses of molecular adaptation after viral transmission between species, the issue is still poorly studied. Here we present evidence that accessory proteins are subjected to positive selective forces exclusively in HIV. This scenario suggests that accessory protein genes are under adaptive evolution in HIV clades, while in SIVcpz such a phenomenon could not be detected. As a result, we show that comparative studies are critical to carry out functional investigation of positively selected protein sites, as they might help to achieve a better comprehension of the biology of HIV pathogenesis.
Affiliation(s)
- André E R Soares
- Departamento de Genética, Universidade Federal do Rio de Janeiro, Ilha do Fundao, Rio de Janeiro, RJ CEP 21.941-590, Brazil
|
42
|
van Hemert FJ, Zaaijer HL, Berkhout B, Lukashov VV. Mosaic amino acid conservation in 3D-structures of surface protein and polymerase of hepatitis B virus. Virology 2007; 370:362-72. [PMID: 17935747 DOI: 10.1016/j.virol.2007.08.036] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 07/10/2007] [Revised: 07/31/2007] [Accepted: 08/25/2007] [Indexed: 12/17/2022]
Abstract
Surface protein and polymerase of hepatitis B virus provide a striking example of gene overlap. Inclusion of more coding constraints in the phylogenetic analysis forces the tree toward accepted topology. Three-dimensional protein modeling demonstrates that participation in local protein function underlies the observed mosaic patterns of amino acid conservation and variability. Conserved amino acid residues of polymerase were typically clustered at the catalytic core marked by the YMDD motif. The proposed tertiary structure of surface protein displayed the expected transmembrane helices in a 2-domain constellation. Conserved amino acids like, for instance, cysteine residues are involved in the spatial orientation of the two domains, the exposed location of the a-determinant and the dimer formation of surface protein. By means of computational alanine replacement scanning, we demonstrated that the interfaces between domains in monomeric surface protein, between the monomers in dimeric surface protein and in a capsid-surface protein complex mainly consist of relatively well-conserved amino acid residues.
Affiliation(s)
- Formijn J van Hemert
- Laboratory of Experimental Virology, Department of Medical Microbiology, Center for Infection and Immunity Amsterdam (CINIMA), Academic Medical Center, University of Amsterdam, Meibergdreef 15, 1105 AZ Amsterdam, The Netherlands.
|
43
|
Novel cytotoxic T-lymphocyte escape mutation by a three-amino-acid insertion in the human immunodeficiency virus type 1 p6Pol and p6Gag late domain associated with drug resistance. J Virol 2007; 82:495-502. [PMID: 17942528 DOI: 10.1128/jvi.01096-07] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/20/2022] Open
Abstract
Cytolytic T lymphocytes (CTL) play a major role in controlling human immunodeficiency virus type 1 (HIV-1) infection. To evade immune pressure, HIV-1 is selected at targeted CTL epitopes, which may consequentially alter viral replication fitness. In our longitudinal investigations of the interplay between T-cell immunity and viral evolution following acute HIV-1 infection, we observed in a treatment-naïve patient the emergence of highly avid, gamma interferon-secreting, CD8(+) CTL recognizing an HLA-Cw*0102-restricted epitope, NSPTRREL (NL8). This epitope lies in the p6(Pol) protein, located in the transframe region of the Gag-Pol polyprotein. Over the course of infection, an unusual viral escape mutation arose within the p6(Pol) epitope through insertion of a 3-amino-acid repeat, NSPT(SPT)RREL, with a concomitant insertion in the p6(Gag) late domain, PTAPP(APP). Interestingly, this p6(Pol) insertion mutation is often selected in viruses with the emergence of antiretroviral drug resistance, while the p6(Gag) late-domain PTAPP motif binds Tsg101 to permit viral budding. These results are the first to demonstrate viral evasion of immune pressure by amino acid insertions. Moreover, this escape mutation represents a novel mechanism whereby HIV-1 can alter its sequence within both the Gag and Pol proteins with potential functional consequences for viral replication and budding.
|
44
|
Pavesi A. Pattern of nucleotide substitution in the overlapping nonstructural genes of influenza A virus and implication for the genetic diversity of the H5N1 subtype. Gene 2007; 402:28-34. [PMID: 17825505 DOI: 10.1016/j.gene.2007.07.013] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 04/13/2007] [Revised: 07/12/2007] [Accepted: 07/12/2007] [Indexed: 11/24/2022]
Abstract
In viruses under strong pressure to minimize genome size, overlapping genes represent a fine strategy to condense a maximum amount of information into short nucleotide sequences. Here, we investigated the evolution of the genes encoding the nonstructural proteins NS1 and NS2 of influenza A virus (IAV), which are one of the best characterized cases of gene overlap. By a detailed analysis of about four hundred sequences grouped into 11 IAV subtypes, we found that the overlapping coding region of the NS1 gene shows a significant increase of the rate of nonsynonymous change, with respect to its nonoverlapping counterpart. The same feature was observed in the overlapping coding region of the NS2 gene. Such a variation pattern, which implies the occurrence of several amino acid substitutions in the protein regions encoded by overlapping frames, is different from the pattern of constrained evolution typical of other viral overlapping-gene systems. Amino acid sequence analysis of the NS1 and NS2 proteins revealed that some nonsynonymous substitutions, located in the region of gene overlap, play a critical role in shaping the genetic diversity of the highly pathogenic subtype H5N1. Since both proteins contribute to disease pathogenesis by affecting many virus and host-cell processes, information provided by this study should be useful to highlight the impact of nonstructural gene variation on the pathogenicity of H5N1 viruses.
Affiliation(s)
- Angelo Pavesi
- Department of Genetics, Biology of Microorganisms, Anthropology, Evolution, University of Parma, V. le G. P. Usberti 11/A, I-43100 Parma, Italy.
|
45
|
Hughes AL. Looking for Darwin in all the wrong places: the misguided quest for positive selection at the nucleotide sequence level. Heredity (Edinb) 2007; 99:364-73. [PMID: 17622265 DOI: 10.1038/sj.hdy.6801031] [Citation(s) in RCA: 189] [Impact Index Per Article: 11.1] [Indexed: 11/09/2022] Open
Abstract
Recent years have seen an explosion of interest in evidence for positive Darwinian selection at the molecular level. This quest has been hampered by the use of statistical methods that fail to adequately rule out alternative hypotheses, particularly the relaxation of purifying selection and the effects of population bottlenecks, during which the effectiveness of purifying selection is reduced. A further problem has been the assumption that positive selection will generally involve repeated amino-acid changes to a single protein. This model was derived from the case of the vertebrate major histocompatibility complex (MHC), but the MHC proteins are unusual in being involved in protein-protein recognition and in a co-evolutionary process with pathogens. There is no reason to suppose that repeated amino-acid changes to a single protein are involved in selectively advantageous phenotypes in general. Rather, adaptive phenotypes are much more likely to result from other causes, including single amino-acid changes, deletion or silencing of genes, or changes in the pattern of gene expression.
Affiliation(s)
- A L Hughes
- Department of Biological Sciences, University of South Carolina, Columbia, SC 29208, USA.
|
46
|
Zhao X, McGirr KM, Buehring GC. Potential evolutionary influences on overlapping reading frames in the bovine leukemia virus pXBL region. Genomics 2007; 89:502-11. [PMID: 17239558 DOI: 10.1016/j.ygeno.2006.12.007] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Received: 09/25/2006] [Revised: 11/27/2006] [Accepted: 12/14/2006] [Indexed: 01/25/2023]
Abstract
Bovine leukemia virus contains a pXBL region encoding the 3' parts of four regulatory proteins (Tax, Rex, G4, R3) in overlapping reading frames. Here we report the pXBL polymorphisms of 30 isolates from four countries. Rates of overall and synonymous substitutions were consistently lower, and nucleotide/amino acid composition bias and codon bias higher, in more-overlapped than in less-overlapped regions. Ratios of nonsynonymous/synonymous substitutions were lowest in the tax gene and its subregions. The 5' parts of the four genes showed selection patterns corresponding to their genomic context outside of the pXBL region. Longer G4 variants due to a natural stop codon mutation had additional triple overlap with reduced sequence variability. These data support the concept that a higher level of overlapping in coding regions correlates with greater evolutionary constraint. Tax, the most conserved among the four regulatory proteins, showed purifying selection consistent with its importance in the viral life cycle.
Affiliation(s)
- Xiangrong Zhao
- Graduate Program in Endocrinology, University of California at Berkeley, 3060 Valley Life Science Building, Berkeley, CA 94720-3140, USA.
|
47
|
McGirr KM, Buehring GC. Tax & rex: overlapping genes of the Deltaretrovirus group. Virus Genes 2006; 32:229-39. [PMID: 16732475 DOI: 10.1007/s11262-005-6907-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 05/12/2005] [Accepted: 08/22/2005] [Indexed: 10/24/2022]
Abstract
Bovine leukemia virus and human T-cell leukemia viruses I and II, members of the Deltaretrovirus group, have two regulatory genes, tax and rex, that are coded in overlapping reading frames. We found that sequence variations in the rex gene of each virus result in amino acid differences significantly more often than variations in the tax gene. For all three viruses the highest ratio of non-synonymous to synonymous changes was found in the rex gene. In the overlapping regions of tax and rex, the second codon position of Rex corresponds to the third codon position of Tax. Nucleotide C was present in all genes of the three viruses at the highest frequency and this bias was most pronounced in the rex gene. More specifically, we found that the C bias and nucleotide variation are greatest at the second codon position of Rex and the third codon position of Tax in the area of tax/rex overlap. Changes in the second codon position of Rex always resulted in amino acid change, whereas changes in the third codon position of Tax resulted in amino acid changes less than a third of the time. Analysis of the amino acid frequencies in both proteins shows that there is a disproportionately large percentage of the amino acids alanine, proline, serine and threonine (the four amino acids whose second codon position is C) in Rex. These findings led us to hypothesize that the Rex protein can withstand more amino acid changes than can the Tax protein, suggesting that the Tax protein experiences higher evolutionary constraints and is the more conserved of the two proteins.
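The codon-position reasoning in this abstract can be checked against the standard genetic code. The sketch below is an illustration only, not the authors' analysis: it assumes the standard code (NCBI translation table 1), weights every sense codon equally rather than using viral sequences, and all names in it are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in TCAG order:
# TTT, TTC, TTA, TTG, TCT, ... ; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Amino acids encoded by codons with C at the second position.
# Serine is included through its TCN codons (it also has AGT/AGC).
second_c = {aa for codon, aa in CODON_TABLE.items() if codon[1] == "C"}
print(sorted(second_c))  # ['A', 'P', 'S', 'T']


def changes_amino_acid(codon: str, pos: int, new_base: str) -> bool:
    """Return True if substituting new_base at pos alters the encoded residue."""
    mutated = codon[:pos] + new_base + codon[pos + 1:]
    return CODON_TABLE[mutated] != CODON_TABLE[codon]


sense = [codon for codon, aa in CODON_TABLE.items() if aa != "*"]

# Does every second-position substitution in a sense codon change the residue
# (or create a stop codon)?
second_always = all(
    changes_amino_acid(c, 1, b) for c in sense for b in BASES if b != c[1]
)

# Fraction of third-position substitutions that change the residue.
third_changes = [
    changes_amino_acid(c, 2, b) for c in sense for b in BASES if b != c[2]
]
third_fraction = sum(third_changes) / len(third_changes)

print(second_always, round(third_fraction, 3))
```

Because the second codon position of Rex coincides with the third codon position of Tax in the overlap, the same nucleotide change is expected to be far more disruptive to Rex than to Tax, which is the asymmetry the abstract reports.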
Affiliation(s)
- Kathleen Margaret McGirr
- School of Public Health, Division of Infectious Diseases, University of California, Berkeley, CA 94720, USA.
|
48
|
Abstract
The possibility of creating novel genes from pre-existing sequences, known as overprinting, is a widespread phenomenon in small viruses. Here, the origin and evolution of gene overlap in the bacteriophages belonging to the family Microviridae have been investigated. The distinction between ancestral and derived frames was carried out by comparing the patterns of codon usage in overlapping and non-overlapping genes. By this approach, a gradual increase in complexity of the phage genome, from an ancestral state lacking gene overlap to a derived state with a high density of genetic information, was inferred. Genes encoding less-essential proteins, yet playing a role in phage growth and diffusion, were predicted to be novel genes that originated by overprinting. Evaluation of the rates of synonymous and non-synonymous substitution yielded evidence for overlapping genes under positive selection in one frame and purifying selection in the alternative frame.
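To illustrate how a single substitution in an overlapping region can be silent in one frame and amino-acid-altering in the other, here is a minimal sketch. It only counts codon-level differences between two aligned toy sequences; it is not a substitution-rate estimator and not the method used in the study. The sequences and the function name are hypothetical.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codon_changes(seq_a, seq_b, frame):
    """Count (synonymous, nonsynonymous) codon differences between two
    aligned, gap-free sequences read in the given frame (0, 1 or 2)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    synonymous = nonsynonymous = 0
    for start in range(frame, len(seq_a) - 2, 3):
        codon_a = seq_a[start:start + 3]
        codon_b = seq_b[start:start + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Toy sequences differing by one T->C change at index 5.
SEQ_A = "ATGGCTCCAAGT"
SEQ_B = "ATGGCCCCAAGT"

print(count_codon_changes(SEQ_A, SEQ_B, 0))  # (1, 0): GCT -> GCC, Ala -> Ala
print(count_codon_changes(SEQ_A, SEQ_B, 1))  # (0, 1): CTC -> CCC, Leu -> Pro
```

Real analyses of selection in overlapping genes normalize such counts by the number of synonymous and nonsynonymous sites and must account for the constraint each frame imposes on the other, which this sketch deliberately omits.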
Affiliation(s)
- Angelo Pavesi
- Department of Genetics, Anthropology and Evolution, University of Parma, Parco Area delle Scienze 11/A, I-43100 Parma, Italy
|
49
|
McCauley S, Hein J. Using hidden Markov models and observed evolution to annotate viral genomes. Bioinformatics 2006; 22:1308-16. [PMID: 16613911 DOI: 10.1093/bioinformatics/btl092] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Single-stranded RNA (ssRNA) viral genomes are generally constrained in length and utilize overlapping reading frames to maximally exploit the coding potential within the genome length restrictions. This overlapping coding phenomenon leads to complex evolutionary constraints operating on the genome. In regions which code for more than one protein, silent mutations in one reading frame generally have a protein coding effect in another. To maximize coding flexibility in all reading frames, overlapping regions are often compositionally biased towards amino acids which are 6-fold degenerate with respect to the 64 codon alphabet. Previous methodologies have used this fact in an ad hoc manner to look for overlapping genes by motif matching. In this paper, differentiated nucleotide compositional patterns in overlapping regions are incorporated into a probabilistic hidden Markov model (HMM) framework which is used to annotate ssRNA viral genomes. This work focuses on single sequence annotation and applies an HMM framework to ssRNA viral annotation. A description of how the HMM is parameterized while annotating within a missing-data framework is given. A Phylogenetic HMM (Phylo-HMM) extension, as applied to 14 aligned HIV2 sequences, is also presented. This evolutionary extension serves as an illustration of the potential of the Phylo-HMM framework for ssRNA viral genomic annotation.
RESULTS: The single sequence annotation procedure (SSA) is applied to 14 different strains of the HIV2 virus. Further results on alternative ssRNA viral genomes are presented to illustrate more generally the performance of the method. The results of the SSA method are encouraging; however, there is still room for improvement, and since there is overwhelming evidence to indicate that comparative methods can improve coding sequence (CDS) annotation, the SSA method is extended to a Phylo-HMM to incorporate evolutionary information. The Phylo-HMM extension is applied to the same set of 14 HIV2 sequences, which are pre-aligned. The performance improvement that results from including the evolutionary information in the analysis is illustrated.
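The "6-fold degenerate" amino acids mentioned in the abstract above can be listed directly from the standard genetic code. This short sketch only counts codons per amino acid; it does not implement the HMM described by the authors, and the variable names are illustrative.

```python
from collections import Counter

# Amino acids of the standard genetic code in TCAG codon order
# ('*' marks a stop codon), one letter per codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Number of codons encoding each amino acid, stop codons excluded.
codon_counts = Counter(AMINO.replace("*", ""))

# Amino acids encoded by exactly six codons.
sixfold = sorted(aa for aa, n in codon_counts.items() if n == 6)
print(sixfold)  # ['L', 'R', 'S']
```

Leucine, arginine and serine are the only amino acids with six codons, so a compositional bias towards them leaves the most room for nucleotide changes that remain silent in at least one reading frame.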
|
50
|
Piontkivska H, Hughes AL. Patterns of sequence evolution at epitopes for host antibodies and cytotoxic T-lymphocytes in human immunodeficiency virus type 1. Virus Res 2006; 116:98-105. [PMID: 16214253 DOI: 10.1016/j.virusres.2005.09.001] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 05/06/2005] [Revised: 08/31/2005] [Accepted: 09/01/2005] [Indexed: 12/01/2022]
Abstract
Analysis of published sequence data from the nine protein-coding genes of human immunodeficiency virus type 1 (HIV-1) showed striking differences in evolutionary pattern between epitopes for host neutralizing antibodies (Ab) and epitopes for cytotoxic T cells (CTL). In all sequences analyzed, the greatest median amino acid residue diversity was seen at sites that formed part of Ab epitopes, but not of CTL epitopes. By contrast, sites belonging to CTL epitopes but not to Ab epitopes showed reduced median amino acid sequence diversity not only in comparison to sites in Ab epitopes but also in comparison to non-epitope sites. Ab epitopes that did not overlap CTL epitopes showed the highest frequency of comparisons in which the rate of nonsynonymous (amino acid-altering) nucleotide substitution exceeded that of synonymous nucleotide substitution, supporting the hypothesis that much of the diversity at Ab epitopes results from positive selection exerted by the host immune system. Though less frequent than that at Ab epitopes, there was evidence of such selection at certain CTL epitopes as well; and amino acid differences between sister pairs of sequences in CTL epitopes were more likely to be convergent than those in Ab epitopes. The pattern seen at CTL epitopes may represent the result of conflicting pressures favoring conservation of the amino acid sequence for functional reasons and amino acid replacements for reasons of CTL escape.
Affiliation(s)
- Helen Piontkivska
- Department of Biological Sciences, University of South Carolina, Coker Life Sciences Bldg., 700 Sumter St., Columbia SC 29208, USA
|