1
Genetic Analysis Based on Mitochondrial nad2 Gene Reveals a Recent Population Expansion of the Invasive Mussel, Mytella strigata, in China. Genes (Basel) 2023; 14:2038. [PMID: 38002981 PMCID: PMC10671778 DOI: 10.3390/genes14112038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 10/31/2023] [Accepted: 11/01/2023] [Indexed: 11/26/2023]
Abstract
Mytella strigata is a highly adaptable invasive alien species that has been established in coastal China since 2014. Mitochondrial DNA (mtDNA) is an important tool for studying the evolution and population genetics of invasive species. In this study, the mitochondrial genome of M. strigata from China was sequenced by Illumina high-throughput sequencing and characterized with 13 protein-coding genes (PCGs). An assessment of the selective pressure on the 13 PCGs showed that the nad2 gene had the fastest evolutionary rate, so it was selected for population genetic analysis. A total of 285 nad2 sequences from seven M. strigata populations in China were analyzed and were clearly T-rich and C-rich. According to population genetic diversity analysis, all seven populations had haplotype (gene) diversity (Hd) ≥ 0.5 and nucleotide diversity (Pi) < 0.005. Haplotype networks showed a "star" distribution. Analyses of historical population dynamics showed that Fu's Fs and Tajima's D values of all populations were negative except the Qukou (QK) and Beihai (BH) populations. The Zhangzhou (ZJ) and Xiamen (XM) populations were unimodal while the other populations were multimodal. These results suggested that the population of M. strigata in China may have passed the bottleneck period and is currently in a state of population expansion.
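The two diversity statistics named in this abstract, Hd and Pi, can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions (Nei's haplotype diversity and mean pairwise differences per site) and a made-up toy alignment; the function names are hypothetical and this is not the authors' pipeline or data.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's Hd = n/(n-1) * (1 - sum(p_i^2)), with p_i the haplotype frequencies."""
    n = len(seqs)
    counts = Counter(seqs)  # identical sequences are treated as one haplotype
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def nucleotide_diversity(seqs):
    """Pi = mean number of pairwise differences per site (aligned, equal-length input)."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Toy alignment (illustrative only): three haplotypes among four sequences.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGT"]
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
```

A high Hd combined with a low Pi, as reported above, is the pattern this pair of statistics is typically used to detect: many haplotypes that differ from each other at only a few sites.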
2
Natural selection pressure exerted on "Silent" mutations during the evolution of SARS-CoV-2: Evidence from codon usage and RNA structure. Virus Res 2023; 323:198966. [PMID: 36244617 PMCID: PMC9561399 DOI: 10.1016/j.virusres.2022.198966] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/30/2022] [Revised: 10/08/2022] [Accepted: 10/10/2022] [Indexed: 01/25/2023]
Abstract
From the first emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) till now, multiple mutations that caused synonymous and nonsynonymous substitutions have accumulated. Among them, synonymous substitutions were regarded as "silent" mutations that received less attention than nonsynonymous substitutions that cause amino acid variations. However, the importance of synonymous substitutions cannot be neglected. This research focuses on synonymous substitutions in SARS-CoV-2 and proves that synonymous substitutions were under purifying selection in its evolution. The evidence of purifying selection is provided by comparing the mutation number per site in coding and non-coding regions. We then study the two forces of purifying selection: synonymous codon usage and RNA secondary structure. Results show that the codon usage optimization leads to an adapted codon usage towards humans. Furthermore, our results show that the maintenance of RNA secondary structure drives purifying selection on synonymous substitutions in the structural region. These results explain the selection pressure on synonymous substitutions during the evolution of SARS-CoV-2.
3
HIV-1 sequences in lentiviral vector genomes can be substantially reduced without compromising transduction efficiency. Sci Rep 2021; 11:12067. [PMID: 34103612 PMCID: PMC8187449 DOI: 10.1038/s41598-021-91309-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/04/2021] [Accepted: 05/17/2021] [Indexed: 11/12/2022]
Abstract
Many lentiviral vectors used for gene therapy are derived from HIV-1. An optimal vector genome would include only the viral sequences required for transduction efficiency and gene expression to minimize the amount of foreign sequence inserted into a patient’s genome. However, it remains unclear whether all of the HIV-1 sequence in vector genomes is essential. To determine which viral sequences are required, we performed a systematic deletion analysis, which showed that most of the gag region and over 50% of the env region could be deleted. Because the splicing profile for lentiviral vectors is poorly characterized, we used long-read sequencing to determine canonical and cryptic splice site usage. Deleting specific regions of env sequence reduced the number of splicing events per transcript and increased the proportion of unspliced genomes. Finally, combining a large deletion in gag with repositioning the Rev-response element downstream of the 3’ R to prevent its reverse transcription showed that 1201 nucleotides of HIV-1 sequence can be removed from the integrated vector genome without substantially compromising transduction efficiency. Overall, this allows the creation of lentiviral vector genomes that contain minimal HIV-1 sequence, which could improve safety and transfer less viral sequence into a patient’s DNA.
4
BlueFeather, the singleton that wasn't: Shared gene content analysis supports expansion of Arthrobacter phage Cluster FE. PLoS One 2021; 16:e0248418. [PMID: 33711060 PMCID: PMC7954295 DOI: 10.1371/journal.pone.0248418] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/25/2020] [Accepted: 02/26/2021] [Indexed: 12/02/2022]
Abstract
Bacteriophages (phages) exhibit high genetic diversity, and the mosaic nature of the shared genetic pool makes quantifying phage relatedness a shifting target. Early parameters for clustering of related Mycobacteria and Arthrobacter phage genomes relied on nucleotide identity thresholds but, more recently, clustering of Gordonia and Microbacterium phages has been performed according to shared gene content. Singleton phages lack the nucleotide identity and/or shared gene content required for clustering newly sequenced genomes with known phages. Whole genome metrics of novel Arthrobacter phage BlueFeather, originally designated a putative singleton, showed low nucleotide identity but high amino acid and gene content similarity with Arthrobacter phages originally assigned to Clusters FE and FI. Gene content similarity revealed that BlueFeather shared genes with these phages in excess of the parameter for clustering Gordonia and Microbacterium phages. Single gene analyses revealed evidence of horizontal gene transfer between BlueFeather and phages in unique clusters that infect a variety of bacterial hosts. Our findings highlight the advantage of using shared gene content to study seemingly genetically isolated phages and have resulted in the reclustering of BlueFeather, a putative singleton, as well as former Cluster FI phages, into a newly expanded Cluster FE.
5
Selection Shapes Synonymous Stop Codon Use in Mammals. J Mol Evol 2020; 88:549-561. [PMID: 32617614 DOI: 10.1007/s00239-020-09957-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/03/2020] [Accepted: 06/19/2020] [Indexed: 12/15/2022]
Abstract
Phylogenetic models of the evolution of protein-coding sequences can provide insights into the selection pressures that have shaped them. In the application of these models, synonymous nucleotide substitutions, which do not alter the encoded amino acid, are often assumed to have limited functional consequences and used as a proxy for the neutral rate of evolution. The ratio of nonsynonymous to synonymous substitution rates is then used to categorize the selective regime that applies to the protein (e.g., purifying selection, neutral evolution, diversifying selection). Here, we extend the Muse and Gaut model of codon evolution to explore the extent of purifying selection acting on substitutions between synonymous stop codons. Using a large collection of coding sequence alignments, we estimate that a high proportion (approximately 57%) of mammalian genes are affected by selection acting on stop codon preference. This proportion varies substantially by codon, with UGA stop codons far more likely to be conserved. Genes with evidence of selection acting on synonymous stop codons have distinctive characteristics, compared to unconserved genes with the same stop codon, including longer 3′ untranslated regions (UTRs) and shorter mRNA half-life. The coding regions of these genes are also much more likely to be under strong purifying selection pressure. Our results suggest that the preference for UGA stop codons found in many multicellular eukaryotes is selective rather than mutational in origin.
6
Abstract
Populations evolve as mutations arise in individual organisms and, through hereditary transmission, may become "fixed" (shared by all individuals) in the population. Most mutations are lethal or have negative fitness consequences for the organism. Others have essentially no effect on organismal fitness and can become fixed through the neutral stochastic process known as random drift. However, mutations may also produce a selective advantage that boosts their chances of reaching fixation. Regions of genomes where new mutations are beneficial, rather than neutral or deleterious, tend to evolve more rapidly due to positive selection. Genes involved in immunity and defense are a well-known example; rapid evolution in these genes presumably occurs because new mutations help organisms to prevail in evolutionary "arms races" with pathogens. In recent years genome-wide scans for selection have enlarged our understanding of the genome evolution of various species. In this chapter, we will focus on methods to detect selection on the genome. In particular, we will discuss probabilistic models and how they have changed with the advent of new genome-wide data now available.
7
Abstract
Subtype A is one of the rare HIV-1 group M (HIV-1M) lineages that is both widely distributed throughout the world and persists at high frequencies in the Congo Basin (CB), the site where HIV-1M likely originated. This, together with its high degree of diversity suggests that subtype A is amongst the fittest HIV-1M lineages. Here we use a comprehensive set of published near full-length subtype A sequences and A-derived genome fragments from both circulating and unique recombinant forms (CRFs/URFs) to obtain some insights into how frequently these lineages have independently seeded HIV-1M sub-epidemics in different parts of the world. We do this by inferring when and where the major subtype A lineages and subtype A-derived CRFs originated. Following its origin in the CB during the 1940s, we track the diversification and recombination history of subtype A sequences before and during its dissemination throughout much of the world between the 1950s and 1970s. Collectively, the timings and numbers of detectable subtype A recombination and dissemination events, the present broad global distribution of the sub-epidemics that were seeded by these events, and the high prevalence of subtype A sequences within the regions where these sub-epidemics occurred, suggest that ancestral subtype A viruses (and particularly sub-subtype A1 ancestral viruses) may have been genetically predisposed to become major components of the present epidemic.
8
Increasing the CpG dinucleotide abundance in the HIV-1 genomic RNA inhibits viral replication. Retrovirology 2017; 14:49. [PMID: 29121951 PMCID: PMC5679385 DOI: 10.1186/s12977-017-0374-1] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 09/12/2017] [Accepted: 11/01/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND: The human immunodeficiency virus type 1 (HIV-1) structural protein Gag is necessary and sufficient to form viral particles. In addition to encoding the amino acid sequence for Gag, the underlying RNA sequence could encode cis-acting elements or nucleotide biases that are necessary for viral replication. Furthermore, RNA sequences that inhibit viral replication could be suppressed in gag. However, the functional relevance of RNA elements and nucleotide biases that promote or repress HIV-1 replication remains poorly understood. RESULTS: To characterize if the RNA sequence in gag controls HIV-1 replication, the matrix (MA) region was codon modified, allowing the RNA sequence to be altered without affecting the protein sequence. Codon modification of nucleotides (nt) 22-261 or 22-378 in gag inhibited viral replication by decreasing genomic RNA (gRNA) abundance, gRNA stability, Gag expression, virion production and infectivity. Comparing the effect of these point mutations to deletions of the same region revealed that the mutations inhibited infectious virus production while the deletions did not. This demonstrated that codon modification introduced inhibitory sequences. There is a much lower than expected frequency of CpG dinucleotides in HIV-1 and codon modification introduced a substantial increase in CpG abundance. To determine if they are necessary for inhibition of HIV-1 replication, codons introducing CpG dinucleotides were mutated back to the wild type codon, which restored efficient Gag expression and infectious virion production. To determine if they are sufficient to inhibit viral replication, CpG dinucleotides were inserted into gag in the absence of other changes. The increased CpG dinucleotide content decreased HIV-1 infectivity and viral replication. CONCLUSIONS: The HIV-1 RNA sequence contains low abundance of CpG dinucleotides. Increasing the abundance of CpG dinucleotides inhibits multiple steps of the viral life cycle, providing a functional explanation for why CpG dinucleotides are suppressed in HIV-1.
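The phrase "lower than expected frequency of CpG dinucleotides" can be made concrete with a short Python sketch. It assumes the commonly used observed/expected ratio (CpG count × sequence length) / (C count × G count); the abstract does not say which measure the authors applied, and the function name and toy sequences below are hypothetical.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G).

    Values well below 1 indicate CpG suppression relative to base composition.
    RNA input is accepted by mapping U to T.
    """
    seq = seq.upper().replace("U", "T")
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c_count == 0 or g_count == 0:
        return 0.0  # ratio is undefined without both bases; report 0 by convention here
    return cpg_count * n / (c_count * g_count)


# Toy sequences (illustrative only).
enriched = cpg_obs_exp("ACGTCGAA")   # two CpGs from two Cs and two Gs
depleted = cpg_obs_exp("CAGTCTGA")   # same base counts, no CpG
```

Comparing this ratio before and after codon modification is one simple way to quantify how much CpG content a recoded region has gained.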
9
Functional Segregation of Overlapping Genes in HIV. Cell 2017; 167:1762-1773.e12. [PMID: 27984726 DOI: 10.1016/j.cell.2016.11.031] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 05/22/2016] [Revised: 09/29/2016] [Accepted: 11/15/2016] [Indexed: 11/28/2022]
Abstract
Overlapping genes pose an evolutionary dilemma as one DNA sequence evolves under the selection pressures of multiple proteins. Here, we perform systematic statistical and mutational analyses of the overlapping HIV-1 genes tat and rev and engineer exhaustive libraries of non-overlapped viruses to perform deep mutational scanning of each gene independently. We find a "segregated" organization in which overlapped sites encode functional residues of one gene or the other, but never both. Furthermore, this organization eliminates unfit genotypes, providing a fitness advantage to the population. Our comprehensive analysis reveals the extraordinary manner in which HIV minimizes the constraint of overlapping genes and repurposes that constraint to its own advantage. Thus, overlaps are not just consequences of evolutionary constraints, but rather can provide population fitness advantages.
10
Abstract
Mutation rates and fitness costs of deleterious mutations are difficult to measure in vivo but essential for a quantitative understanding of evolution. Using whole genome deep sequencing data from longitudinal samples during untreated HIV-1 infection, we estimated mutation rates and fitness costs in HIV-1 from the dynamics of genetic variation. At approximately neutral sites, mutations accumulate with a rate of 1.2 × 10⁻⁵ per site per day, in agreement with the rate measured in cell cultures. We estimated the rate from G to A to be the largest, followed by the other transitions C to T, T to C, and A to G, while transversions are less frequent. At other sites, mutations tend to reduce virus replication. We estimated the fitness cost of mutations at every site in the HIV-1 genome using a model of mutation selection balance. About half of all non-synonymous mutations have large fitness costs (>10 percent), while most synonymous mutations have costs <1 percent. The cost of synonymous mutations is especially low in most of pol where we could not detect measurable costs for the majority of synonymous mutations. In contrast, we find high costs for synonymous mutations in important RNA structures and regulatory regions. The intra-patient fitness cost estimates are consistent across multiple patients, indicating that the deleterious part of the fitness landscape is universal and explains a large fraction of global HIV-1 group M diversity.
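The idea behind estimating costs from mutation-selection balance can be sketched in a few lines of Python. This assumes the simplest textbook form, in which a deleterious variant settles at an average frequency of roughly μ/s, so that s ≈ μ/x; the abstract does not give the authors' exact model, the function name is hypothetical, and the rate and frequency used below are illustrative magnitudes only.

```python
def fitness_cost_from_balance(mutation_rate, mean_frequency):
    """Estimate a selection coefficient s from mutation-selection balance.

    Assumes the simple equilibrium x ~ mu / s, hence s ~ mu / x.
    mutation_rate and the returned cost share the same time unit (e.g. per day).
    """
    if mutation_rate <= 0 or mean_frequency <= 0:
        raise ValueError("mutation_rate and mean_frequency must be positive")
    return mutation_rate / mean_frequency


# Illustrative numbers: a rate of the same order as the 1.2e-5 per site per day
# quoted above, and an assumed time-averaged minor-variant frequency of 0.1%.
cost = fitness_cost_from_balance(1.2e-5, 1e-3)
```

Under this approximation, variants that stay rarer at equilibrium imply larger costs, which is why deep sequencing of low-frequency variation is informative about selection.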
11
Overlapping Regions in HIV-1 Genome Act as Potential Sites for Host-Virus Interaction. Front Microbiol 2016; 7:1735. [PMID: 27867372 PMCID: PMC5095123 DOI: 10.3389/fmicb.2016.01735] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 07/05/2016] [Accepted: 10/17/2016] [Indexed: 01/05/2023]
Abstract
For more than a decade, overlapping genes in RNA viruses have been a subject of research that has explored various effects of gene overlap on the evolution and function of viral genomes, such as genome size compaction. Additionally, overlapping regions (OVRs) are reported to encode an elevated degree of protein intrinsic disorder (PID) in unspliced RNA viruses. With the aim of exploring the roles of OVRs in HIV-1 pathogenesis, we have carried out an in-depth analysis of the association of gene overlap with PID in 35 HIV-1 M subtypes. Our study reveals an overrepresentation of PID in the OVRs of HIV-1 genomes. These disordered residues harbor several vital structural features, such as short linear motifs (SLiMs) and protein phosphorylation (PP) sites, which were previously shown to be involved in extensive host–virus interaction. Moreover, SLiMs in OVRs appear to have greater functional potential than those in non-overlapping regions. Although the density of experimentally verified SLiMs involved in host–virus interaction, which reside in 9 HIV-1 genes, does not show any bias toward clustering in OVRs, tat and rev, two important proteins, mediate host–pathogen interaction through experimentally verified SLiMs that are mostly localized in OVRs. Finally, our analysis suggests that the acquisition of SLiMs in OVRs is mutually exclusive of the occurrence of disordered residues, while the enrichment of PPs in OVRs is solely dependent on PID and not on overlapping coding frames. Thus, OVRs of HIV-1 genomes could be demarcated as potential molecular recognition sites during host–virus interaction.
12
Analysis of full-length genomes of porcine teschovirus (PTV) and the effect of purifying selection on phylogenetic trees. Arch Virol 2016; 161:1199-208. [DOI: 10.1007/s00705-015-2744-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 10/07/2015] [Accepted: 12/21/2015] [Indexed: 10/22/2022]
13
Mapping overlapping functional elements embedded within the protein-coding regions of RNA viruses. Nucleic Acids Res 2014; 42:12425-39. [PMID: 25326325 PMCID: PMC4227794 DOI: 10.1093/nar/gku981] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Received: 08/17/2014] [Revised: 09/20/2014] [Accepted: 10/04/2014] [Indexed: 12/29/2022]
Abstract
Identification of the full complement of genes and other functional elements in any virus is crucial to fully understand its molecular biology and guide the development of effective control strategies. RNA viruses have compact multifunctional genomes that frequently contain overlapping genes and non-coding functional elements embedded within protein-coding sequences. Overlapping features often escape detection because it can be difficult to disentangle the multiple roles of the constituent nucleotides via mutational analyses, while high-throughput experimental techniques are often unable to distinguish functional elements from incidental features. However, RNA viruses evolve very rapidly so that, even within a single species, substitutions rapidly accumulate at neutral or near-neutral sites providing great potential for comparative genomics to distinguish the signature of purifying selection. Computationally identified features can then be efficiently targeted for experimental analysis. Here we analyze alignments of protein-coding virus sequences to identify regions where there is a statistically significant reduction in the degree of variability at synonymous sites, a characteristic signature of overlapping functional elements. Having previously tested this technique by experimental verification of discoveries in selected viruses, we now analyze sequence alignments for ∼700 RNA virus species to identify hundreds of such regions, many of which have not been previously described.
14
Evidence of pervasive biologically functional secondary structures within the genomes of eukaryotic single-stranded DNA viruses. J Virol 2013; 88:1972-89. [PMID: 24284329 DOI: 10.1128/jvi.03031-13] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 12/17/2022]
Abstract
Single-stranded DNA (ssDNA) viruses have genomes that are potentially capable of forming complex secondary structures through Watson-Crick base pairing between their constituent nucleotides. A few of the structural elements formed by such base pairings are, in fact, known to have important functions during the replication of many ssDNA viruses. Unknown, however, are (i) whether numerous additional ssDNA virus genomic structural elements predicted to exist by computational DNA folding methods actually exist and (ii) whether those structures that do exist have any biological relevance. We therefore computationally inferred lists of the most evolutionarily conserved structures within a diverse selection of animal- and plant-infecting ssDNA viruses drawn from the families Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae and analyzed these for evidence of natural selection favoring the maintenance of these structures. While we find evidence that is consistent with purifying selection being stronger at nucleotide sites that are predicted to be base paired than at sites predicted to be unpaired, we also find strong associations between sites that are predicted to pair with one another and site pairs that are apparently coevolving in a complementary fashion. Collectively, these results indicate that natural selection actively preserves much of the pervasive secondary structure that is evident within eukaryote-infecting ssDNA virus genomes and, therefore, that much of this structure is biologically functional. Lastly, we provide examples of various highly conserved but completely uncharacterized structural elements that likely have important functions within some of the ssDNA virus genomes analyzed here.
15
Abstract
Intrapatient evolution of human immunodeficiency virus type 1 (HIV-1) is driven by the adaptive immune system resulting in rapid change of HIV-1 proteins. When cytotoxic CD8(+) T cells or neutralizing antibodies target a new epitope, the virus often escapes via nonsynonymous mutations that impair recognition. Synonymous mutations do not affect this interplay and are often assumed to be neutral. We test this assumption by tracking synonymous mutations in longitudinal intrapatient data from the C2-V5 part of the env gene. We find that most synonymous variants are lost even though they often reach high frequencies in the viral population, suggesting a cost to the virus. Using published data from SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension) assays, we find that synonymous mutations that disrupt base pairs in RNA stems flanking the variable loops of gp120 are more likely to be lost than other synonymous changes: these RNA hairpins might be important for HIV-1. Computational modeling indicates that, to be consistent with the data, a large fraction of synonymous mutations in this genomic region need to be deleterious with a cost on the order of 0.002 per day. This weak selection against synonymous substitutions does not result in a strong pattern of conservation in cross-sectional data but slows down the rate of evolution considerably. Our findings are consistent with the notion that large-scale patterns of RNA structure are functionally relevant, whereas the precise base pairing pattern is not.
16
Abstract
BACKGROUND: Synonymous or silent mutations are usually thought to evolve neutrally. However, accumulating recent evidence has demonstrated that silent mutations may destabilize RNA structures or disrupt cis regulatory motifs superimposed on coding sequences. Such observations suggest the existence of stretches of codon sites that are evolutionary conserved at both DNA-RNA and protein levels. Such stretches may point to functionally important regions within protein coding sequences not necessarily reflecting functional constraints on the amino-acid sequence. The HIV-1 genome is highly compact, and often harbors overlapping functional elements at the protein, RNA, and DNA levels. This superimposition of functions leads to complex selective forces acting on all levels of the genome and proteome. Considering the constraints on HIV-1 to maintain such a highly compact genome, we hypothesized that stretches of synonymous conservation would be common within its genome. RESULTS: We used a combined computational-experimental approach to detect and characterize regions exhibiting strong purifying selection against synonymous substitutions along the HIV-1 genome. Our methodology is based on advanced probabilistic evolutionary models that explicitly account for synonymous rate variation among sites and rate dependencies among adjacent sites. These models are combined with a randomization procedure to automatically identify the most statistically significant regions of conserved synonymous sites along the genome. Using this procedure we identified 21 conserved regions. Twelve of these are mapped to regions within overlapping genes, seven correlate with known functional elements, while the functions of the remaining four are yet unknown. Among these four regions, we chose the one that deviates most from synonymous rate homogeneity for in-depth computational and experimental characterization. In our assays aiming to quantify viral fitness in both early and late stages of the replication cycle, no differences were observed between the mutated and the wild type virus following the introduction of synonymous mutations. CONCLUSIONS: The contradiction between the inferred purifying selective forces and the lack of effect of these mutations on viral replication may be explained by the fact that the phenotype was measured in single-cycle infection assays in cell culture. Such a system does not account for the complexity of HIV-1 infections in vivo, which involves multiple infection cycles and interaction with the host immune system.
17
The evolution of HIV: inferences using phylogenetics. Mol Phylogenet Evol 2012; 62:777-92. [PMID: 22138161 PMCID: PMC3258026 DOI: 10.1016/j.ympev.2011.11.019] [Citation(s) in RCA: 63] [Impact Index Per Article: 5.3] [Received: 06/28/2011] [Revised: 11/17/2011] [Accepted: 11/21/2011] [Indexed: 12/02/2022]
Abstract
Molecular phylogenetics has revolutionized the study of not only evolution but also disparate fields such as genomics, bioinformatics, epidemiology, ecology, microbiology, molecular biology and biochemistry. Particularly significant are its achievements in population genetics as a result of the development of coalescent theory, which have contributed to more accurate model-based parameter estimation and explicit hypothesis testing. The study of the evolution of many microorganisms, and HIV in particular, has benefited from these new methodologies. HIV is well suited for such sophisticated population analyses because of its large population sizes, short generation times, high substitution rates and relatively small genomes. All these factors make HIV an ideal and fascinating model to study molecular evolution in real time. Here we review the significant advances made in HIV evolution through the application of phylogenetic approaches. We first examine the relative roles of mutation and recombination on the molecular evolution of HIV and its adaptive response to drug therapy and tissue allocation. We then review some of the fundamental questions in HIV evolution in relation to its origin and diversification and describe some of the insights gained using phylogenies. Finally, we show how phylogenetic analysis has advanced our knowledge of HIV dynamics (i.e., phylodynamics).
18
Abstract
Populations evolve as mutations arise in individual organisms and, through hereditary transmission, may become "fixed" (shared by all individuals) in the population. Most mutations are lethal or have negative fitness consequences for the organism. Others have essentially no effect on organismal fitness and can become fixed through the neutral stochastic process known as random drift. However, mutations may also produce a selective advantage that boosts their chances of reaching fixation. Regions of genes where new mutations are beneficial, rather than neutral or deleterious, tend to evolve more rapidly due to positive selection. Genes involved in immunity and defense are a well-known example; rapid evolution in these genes presumably occurs because new mutations help organisms to prevail in evolutionary "arms races" with pathogens. In recent years, genome-wide scans for selection have enlarged our understanding of the evolution of the protein-coding regions of the various species. In this chapter, we focus on the methods to detect selection in protein-coding genes. In particular, we discuss probabilistic models and how they have changed with the advent of new genome-wide data now available.
19
A phylogenetic analysis using full-length viral genomes of South American dengue serotype 3 in consecutive Venezuelan outbreaks reveals a novel NS5 mutation. Infect Genet Evol 2011; 11:2011-9. [PMID: 21964598 PMCID: PMC3565618 DOI: 10.1016/j.meegid.2011.09.010] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 06/17/2011] [Revised: 09/05/2011] [Accepted: 09/08/2011] [Indexed: 11/24/2022]
Abstract
Dengue virus currently causes 50-100 million infections annually. Comprehensive knowledge about the evolution of dengue in response to selection pressure is currently unavailable, but would greatly enhance vaccine design efforts. In the current study, we sequenced 187 new dengue virus serotype 3 (DENV-3) genotype III whole genomes isolated from Asia and the Americas. We analyzed them together with previously sequenced isolates to gain a more detailed understanding of the evolutionary adaptations existing in this prevalent American serotype. In order to analyze the phylogenetic dynamics of DENV-3 during outbreak periods, we incorporated datasets of 48 and 11 sequences spanning two major outbreaks in Venezuela during 2001 and 2007-2008, respectively. Our phylogenetic analysis of newly sequenced viruses shows that subsets of genomes cluster primarily by geographic location, and secondarily by time of virus isolation. DENV-3 genotype III sequences from Asia are significantly divergent from those from the Americas due to their geographical separation and subsequent speciation. We measured amino acid variation for the E protein by calculating the Shannon entropy at each position between Asian and American genomes. We found a cluster of seven amino acid substitutions having high variability within E protein domain III, which has previously been implicated in serotype-specific neutralization escape mutants. No novel mutations were found in the E protein of sequences isolated during either Venezuelan outbreak. Shannon entropy analysis of the NS5 polymerase mature protein revealed that a G374E mutation, in a region that contributes to interferon resistance in other flaviviruses by interfering with JAK-STAT signaling, was present in both the Asian and American sequences from the 2007-2008 Venezuelan outbreak, but was absent in the sequences from the 2001 Venezuelan outbreak. In addition to E, several NS5 amino acid changes were unique to the 2007-2008 epidemic in Venezuela and may give additional insight into the adaptive response of DENV-3 at the population level.
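The per-position Shannon entropy mentioned in this abstract can be illustrated with a minimal sketch. It assumes an already aligned set of protein sequences of equal length and ignores gap characters. The function names and the toy alignment are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps."""
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    # H = -sum(p * log2(p)) over the residues observed at this position
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(sequences):
    """Per-position entropy for a list of equal-length aligned sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    # zip(*sequences) walks the alignment column by column
    return [column_entropy(col) for col in zip(*sequences)]


# Toy alignment (made-up residues, for illustration only)
toy_alignment = ["MGE", "MGE", "MGG", "MAG"]
entropies = alignment_entropy(toy_alignment)
```

A fully conserved position scores 0 bits, and the score rises as residues become more evenly mixed, so high-entropy positions flag candidate variable sites.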
|
20
|
Rev variation during persistent lentivirus infection. Viruses 2011; 3:1-11. [PMID: 21994723 PMCID: PMC3187595 DOI: 10.3390/v3010001] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 11/20/2010] [Revised: 12/29/2010] [Accepted: 01/06/2011] [Indexed: 11/29/2022] Open
Abstract
The ability of lentiviruses to continually evolve and escape immune control is the central impediment in developing an effective vaccine for HIV-1 and other lentiviruses. Equine infectious anemia virus (EIAV) is considered a useful model for immune control of lentivirus infection. Virus-specific cytotoxic T lymphocytes (CTL) and broadly neutralizing antibody effectively control EIAV replication during inapparent stages of disease, but after years of low-level replication, the virus is still able to produce evasion genotypes that lead to late re-emergence of disease. There is a high rate of genetic variation in the EIAV surface envelope glycoprotein (SU) and in the region of the transmembrane protein (TM) overlapped by the major exon of Rev. This review examines genetic and phenotypic variation in Rev during EIAV disease and a possible role for Rev in immune evasion and virus persistence.
|
21
|
Phylogenetic analysis of population-based and deep sequencing data to identify coevolving sites in the nef gene of HIV-1. Mol Biol Evol 2009; 27:819-32. [PMID: 19955476 DOI: 10.1093/molbev/msp289] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.8] [Indexed: 12/24/2022] Open
Abstract
Rapidly evolving viruses such as HIV-1 display extensive sequence variation in response to host-specific selection, while simultaneously maintaining functions that are critical to replication and infectivity. This apparent conflict between diversifying and purifying selection may be resolved by an abundance of epistatic interactions such that the same functional requirements can be met by highly divergent sequences. We investigate this hypothesis by conducting an extensive characterization of sequence variation in the HIV-1 nef gene that encodes a highly variable multifunctional protein. Population-based sequences were obtained from 686 patients enrolled in the HOMER cohort in British Columbia, Canada, from which the distribution of nonsynonymous substitutions in the phylogeny was reconstructed by maximum likelihood. We used a phylogenetic comparative method on these data to identify putative epistatic interactions between residues. Two interactions (Y120/Q125 and N157/S169) were chosen to further investigate within-host evolution using HIV-1 RNA extractions from plasma samples from eight patients. Clonal sequencing confirmed strong linkage between polymorphisms at these sites in every case. We used massively parallel pyrosequencing (MPP) to reconstruct within-host evolution in these patients. Experimental error associated with MPP was quantified by performing replicates at two different stages of the protocol, which were pooled prior to analysis to reduce this source of variation. Phylogenetic reconstruction from these data revealed correlated substitutions at Y120/Q125 or N157/S169 repeated across multiple lineages in every host, indicating convergent within-host evolution shaped by epistatic interactions.
|
22
|
Evidence of HIV-1 adaptation to host HLA alleles following chimp-to-human transmission. Virol J 2009; 6:164. [PMID: 19818146 PMCID: PMC2765438 DOI: 10.1186/1743-422x-6-164] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 09/01/2009] [Accepted: 10/10/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The cytotoxic T-lymphocyte immune response is important in controlling HIV-1 replication in infected humans. In this immune pathway, viral peptides within infected cells are presented to T-lymphocytes by the polymorphic human leukocyte antigens (HLA). HLA alleles exert selective pressure on the peptide regions, and immune escape mutations that occur at some of the targeted sites can enable the virus to adapt to the infected host. The pattern of ongoing immune escape and reversion associated with several human HLA alleles has been studied extensively. Such mutations revert upon transmission to a host without the HLA allele because the escape mutation incurs a fitness cost. However, to date there has been little attempt to study permanent loss of CTL epitopes due to escape mutations without an effect on fitness.
RESULTS Here, we set out to determine the extent of adaptation of HIV-1 to three well-characterized HLA alleles during the initial exposure of the virus to the human cytotoxic immune responses following transmission from chimpanzee. We generated a chimpanzee consensus sequence to approximate the virus sequence that was initially transmitted to the human host and used a method based on peptide binding affinity to HLA crystal structures to predict peptides that were potentially targeted by the HLA alleles on this sequence. Next, we used codon-based phylogenetic models to quantify the average selective pressure that acted on these regions during the period immediately following the zoonosis event, corresponding to the branch of the phylogenetic tree leading to the common ancestor of all of the HIV-1 sequences. Evidence for adaptive evolution during this period was observed at regions recognised by HLA-A*6801 and HLA-A*0201, both of which are common in African populations. No evidence of adaptive evolution was observed at sites targeted by HLA-B*2705, which is a rare allele in African populations.
CONCLUSION Our results suggest that the ancestral HIV-1 virus experienced a period of positive selective pressure due to immune responses associated with HLA alleles that were common in the infected human population. We propose that this resulted in permanent escape from immune responses targeting unconstrained regions of the virus.
|
23
|
Quantifying differences in the tempo of human immunodeficiency virus type 1 subtype evolution. J Virol 2009; 83:12917-24. [PMID: 19793809 DOI: 10.1128/jvi.01022-09] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.4] [Indexed: 01/29/2023] Open
Abstract
Human immunodeficiency virus type 1 (HIV-1) genetic diversity, due to its high evolutionary rate, has long been identified as a main cause of problems in the development of an efficient HIV-1 vaccine. However, little is known about differences in evolutionary rate between different subtypes. In this study, we collected representative samples of the main epidemic subtypes and circulating recombinant forms (CRFs), namely, sub-subtype A1, subtypes B, C, D, and G, and CRFs 01_AE and 02_AG. We analyzed separate data sets for pol and env. We performed a Bayesian Markov chain Monte Carlo relaxed-clock phylogenetic analysis and applied a codon model to the resulting phylogenetic trees to estimate nonsynonymous (dN) and synonymous (dS) rates along every branch. We found important differences in the evolutionary rates of the different subtypes. These reflect differences not only in the dN rate but also in the dS rate, which vary in roughly similar ways, indicating that the differences are caused both by different selective pressures (for the dN rate) and by the replication dynamics of the strains, i.e., mutation rate or generation time (for the dS rate). CRF02_AG and subtype G had higher rates, while subtype D had lower dN and dS rates than the other subtypes. The dN/dS ratio estimates were also different, especially for the env gene, with subtype G showing the lowest dN/dS ratio of all subtypes.
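As a rough illustration of how per-branch dN and dS estimates combine into the dN/dS ratio discussed in this abstract, the sketch below computes the ratio for a few branches. It assumes the dN and dS values have already been estimated by a codon model. The function name, branch labels, and numbers are hypothetical toy values, not results from the study.

```python
def dn_ds_ratio(dn, ds):
    """Return dN/dS, or None when dS is not positive and the ratio is undefined."""
    if ds <= 0:
        return None
    return dn / ds


# Hypothetical per-branch (dN, dS) estimates, for illustration only
toy_branches = {
    "branch_a": (0.002, 0.008),
    "branch_b": (0.006, 0.004),
    "branch_c": (0.001, 0.0),  # no synonymous change, so the ratio is undefined
}

ratios = {name: dn_ds_ratio(dn, ds) for name, (dn, ds) in toy_branches.items()}
```

A ratio below 1 is conventionally read as a sign of purifying selection and a ratio above 1 as a sign of positive selection. Branches with no synonymous change are left undefined here rather than being forced to a value.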
|