1
Hexaploid sweetpotato (Ipomoea batatas (L.) Lam.) may not be a true type to either auto- or allopolyploid. PLoS One 2020; 15:e0229624. [PMID: 32126067 PMCID: PMC7053752 DOI: 10.1371/journal.pone.0229624] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 11/07/2019] [Accepted: 02/10/2020] [Indexed: 01/10/2023]
Abstract
To better define the sweetpotato polyploidy, we sought to reconstruct phylogenies of its subgenomes based on hybridization networks that could trace reticulate lineages of differentiated homoeolog triplets of multiple single-copy genes. In search of such homoeolog triplets, we distinguished cDNA variants of 811 single-copy Conserved Ortholog Set II (COSII) genes from two sweetpotato clones into variation partitions specified by corresponding homologs from two I. trifida lines, I. tenuissima and I. littoralis using a phylogenetic partition method, and amplicon variants of the COSII-marker regions from 729 of these genes from two sweetpotato clones into putative homoeoallele groups using haplotype tree and the partition methods referenced by corresponding homologs from I. tenuissima. These analyses revealed partly or completely differentiated expressed-homoeologs and homoeologs from a majority of these genes with three important features. 1. Two variation types: the predominant interspecific variations (homoeoalleles), which are non-randomly clustered, differentially interspecifically conserved or sweetpotato-specific, and the minor intraspecific ones (alleles), which are randomly distributed mostly at non-interspecifically variable sites, and usually sweetpotato-specific. 2. A clear differentiation of cDNA variants of many COSII genes into the variation partition specified by I. tenuissima or I. littoralis from that by I. trifida. 3. Three species-homolog-specified and one sweetpotato-specific variation partitions among 293 different COSII cDNAs, and two or three out of the four partitions among cDNA variants of 306 COSII genes. We then constructed hybridization networks from two concatenations of 16 and 4 alignments of 8 homologous COSII cDNA regions each, which included three taxa of expressed homoeologs in a triple-partition combination from the 16 or 4 sweetpotato COSII genes and 5 taxa each of respective cDNA homologs from the three sweetpotato relatives and I. nil, and inferred a species tree embodying both networks. The species tree predicted close-relative origins of three partly differentiated sweetpotato subgenomes.
2
Dumont BL. Interlocus gene conversion explains at least 2.7% of single nucleotide variants in human segmental duplications. BMC Genomics 2015; 16:456. [PMID: 26077037 PMCID: PMC4467073 DOI: 10.1186/s12864-015-1681-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 02/24/2015] [Accepted: 06/01/2015] [Indexed: 01/24/2023]
Abstract
Background Interlocus gene conversion (IGC) is a recombination-based mechanism that results in the unidirectional transfer of short stretches of sequence between paralogous loci. Although IGC is a well-established mechanism of human disease, the extent to which this mutagenic process has shaped overall patterns of segregating variation in multi-copy regions of the human genome remains unknown. One expected manifestation of IGC in population genomic data is the presence of one-to-one paralogous SNPs that segregate identical alleles. Results Here, I use SNP genotype calls from the low-coverage phase 3 release of the 1000 Genomes Project to identify 15,790 parallel, shared SNPs in duplicated regions of the human genome. My approach for identifying these sites accounts for the potential redundancy of short read mapping in multi-copy genomic regions, thereby effectively eliminating false positive SNP calls arising from paralogous sequence variation. I demonstrate that independent mutation events to identical nucleotides at paralogous sites are not a significant source of shared polymorphisms in the human genome, consistent with the interpretation that these sites are the outcome of historical IGC events. These putative signals of IGC are enriched in genomic contexts previously associated with non-allelic homologous recombination, including clear signals in gene families that form tandem intra-chromosomal clusters. Conclusions Taken together, my analyses implicate IGC, not point mutation, as the mechanism generating at least 2.7% of single nucleotide variants in duplicated regions of the human genome.
Affiliation(s)
- Beth L Dumont
- Initiative in Biological Complexity, North Carolina State University, 112 Derieux Place, 3510 Thomas Hall, Campus Box 7614, Raleigh, NC, 27695-7614, USA.
3
Pérez-Losada M, Arenas M, Galán JC, Palero F, González-Candelas F. Recombination in viruses: mechanisms, methods of study, and evolutionary consequences. Infection, Genetics and Evolution 2015; 30:296-307. [PMID: 25541518 PMCID: PMC7106159 DOI: 10.1016/j.meegid.2014.12.022] [Citation(s) in RCA: 198] [Impact Index Per Article: 22.0] [Received: 11/23/2014] [Revised: 12/15/2014] [Accepted: 12/17/2014] [Indexed: 02/08/2023]
Abstract
Recombination is a pervasive process generating diversity in most viruses. It joins variants that arise independently within the same molecule, creating new opportunities for viruses to overcome selective pressures and to adapt to new environments and hosts. Consequently, the analysis of viral recombination attracts the interest of clinicians, epidemiologists, molecular biologists and evolutionary biologists. In this review we present an overview of three major areas related to viral recombination: (i) the molecular mechanisms that underlie recombination in model viruses, including DNA-viruses (Herpesvirus) and RNA-viruses (Human Influenza Virus and Human Immunodeficiency Virus), (ii) the analytical procedures to detect recombination in viral sequences and to determine the recombination breakpoints, along with the conceptual and methodological tools currently used and a brief overview of the impact of new sequencing technologies on the detection of recombination, and (iii) the major areas in the evolutionary analysis of viral populations on which recombination has an impact. These include the evaluation of selective pressures acting on viral populations, the application of evolutionary reconstructions in the characterization of centralized genes for vaccine design, and the evaluation of linkage disequilibrium and population structure.
Affiliation(s)
- Marcos Pérez-Losada
- CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Campus Agrário de Vairão, Portugal; Computational Biology Institute, George Washington University, Ashburn, VA 20147, USA
- Miguel Arenas
- Centre for Molecular Biology "Severo Ochoa", Consejo Superior de Investigaciones Científicas (CSIC), Madrid, Spain
- Juan Carlos Galán
- Servicio de Microbiología, Hospital Ramón y Cajal and Instituto Ramón y Cajal de Investigación Sanitaria (IRYCIS), Madrid, Spain; CIBER en Epidemiología y Salud Pública, Spain
- Ferran Palero
- CIBER en Epidemiología y Salud Pública, Spain; Unidad Mixta Infección y Salud Pública, FISABIO-Universitat de València, Valencia, Spain
- Fernando González-Candelas
- CIBER en Epidemiología y Salud Pública, Spain; Unidad Mixta Infección y Salud Pública, FISABIO-Universitat de València, Valencia, Spain
4
Croucher NJ, Page AJ, Connor TR, Delaney AJ, Keane JA, Bentley SD, Parkhill J, Harris SR. Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins. Nucleic Acids Res 2014; 43:e15. [PMID: 25414349 PMCID: PMC4330336 DOI: 10.1093/nar/gku1196] [Citation(s) in RCA: 1390] [Impact Index Per Article: 139.0] [Indexed: 12/29/2022]
Abstract
The emergence of new sequencing technologies has facilitated the use of bacterial whole genome alignments for evolutionary studies and outbreak analyses. These datasets, of increasing size, often include examples of multiple different mechanisms of horizontal sequence transfer resulting in substantial alterations to prokaryotic chromosomes. The impact of these processes demands rapid and flexible approaches able to account for recombination when reconstructing isolates' recent diversification. Gubbins is an iterative algorithm that uses spatial scanning statistics to identify loci containing elevated densities of base substitutions suggestive of horizontal sequence transfer while concurrently constructing a maximum likelihood phylogeny based on the putative point mutations outside these regions of high sequence diversity. Simulations demonstrate the algorithm generates highly accurate reconstructions under realistically parameterized models of bacterial evolution, and achieves convergence in only a few hours on alignments of hundreds of bacterial genome sequences. Gubbins is appropriate for reconstructing the recent evolutionary history of a variety of haploid genotype alignments, as it makes no assumptions about the underlying mechanism of recombination. The software is freely available for download at github.com/sanger-pathogens/Gubbins, implemented in Python and C and supported on Linux and Mac OS X.
Affiliation(s)
- Nicholas J Croucher
- Pathogen Genomics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK; Center for Communicable Disease Dynamics, Harvard School of Public Health, 677 Longwood Avenue, Boston, MA 02115, USA; Department of Infectious Disease Epidemiology, Imperial College London, St. Mary's Campus, Norfolk Place, London W2 1PG, UK
- Andrew J Page
- Pathogen Genomics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Thomas R Connor
- Pathogen Genomics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK; Cardiff School of Biosciences, Sir Martin Evans Building, Museum Avenue, Cardiff CF10 3AX, UK
- Aidan J Delaney
- School of Computing, Engineering and Mathematics, University of Brighton, Brighton BN2 4GJ, UK
- Jacqueline A Keane
- Pathogen Genomics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Stephen D Bentley
- Pathogen Genomics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK; Department of Medicine, University of Cambridge, Addenbrooke's Hospital, Cambridge CB2 0SP, UK
- Julian Parkhill
- Pathogen Genomics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Simon R Harris
- Pathogen Genomics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
5
The Rate and Tract Length of Gene Conversion between Duplicated Genes. Genes (Basel) 2011; 2:313-31. [PMID: 24710193 PMCID: PMC3924818 DOI: 10.3390/genes2020313] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 02/17/2011] [Revised: 03/11/2011] [Accepted: 03/17/2011] [Indexed: 11/26/2022]
Abstract
Interlocus gene conversion occurs such that a certain length of DNA fragment is non-reciprocally transferred (copied and pasted) between paralogous regions. To understand the rate and tract length of gene conversion, there are two major approaches. One is based on mutation-accumulation experiments, and the other uses natural DNA sequence variation. In this review, we overview the two major approaches and discuss their advantages and disadvantages. In addition, to demonstrate the importance of statistical analysis of empirical and evolutionary data for estimating tract length, we apply a maximum likelihood method to several data sets.
6
Genetic diversity of O-antigen biosynthesis regions in Vibrio cholerae. Appl Environ Microbiol 2011; 77:2247-53. [PMID: 21317260 DOI: 10.1128/aem.01663-10] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Indexed: 12/11/2022]
Abstract
O-antigen biosynthetic (wbf) regions for Vibrio cholerae serogroups O5, O8, and O108 were isolated and sequenced. Sequences were compared to those of other published V. cholerae O-antigen regions. These wbf regions showed a high degree of heterogeneity both in gene content and in gene order. Genes identified frequently showed greater similarities to polysaccharide biosynthesis genes from species other than V. cholerae. Our results demonstrate the plasticity of O-antigen genes in V. cholerae, the diversity of the genetic pool from which they are drawn, and the likelihood that new pandemic serogroups will emerge.
7
Abstract
Interlocus gene conversion can homogenize DNA sequences of duplicated regions with high homology. Such nonvertical events sometimes cause a misleading evolutionary interpretation of data when the effect of gene conversion is ignored. To avoid this problem, it is crucial to test the data for the presence of gene conversion. Here, we performed extensive simulations to compare four major methods to detect gene conversion. One might expect that the power increases with increase of the gene conversion rate. However, we found this is true for only two methods. For the other two, limited power is expected when gene conversion is too frequent. We suggest using multiple methods to minimize the chance of missing the footprint of gene conversion.
8
Bapteste E, O'Malley MA, Beiko RG, Ereshefsky M, Gogarten JP, Franklin-Hall L, Lapointe FJ, Dupré J, Dagan T, Boucher Y, Martin W. Prokaryotic evolution and the tree of life are two different things. Biol Direct 2009; 4:34. [PMID: 19788731 PMCID: PMC2761302 DOI: 10.1186/1745-6150-4-34] [Citation(s) in RCA: 128] [Impact Index Per Article: 8.5] [Received: 07/16/2009] [Accepted: 09/29/2009] [Indexed: 01/04/2023]
Abstract
BACKGROUND The concept of a tree of life is prevalent in the evolutionary literature. It stems from attempting to obtain a grand unified natural system that reflects a recurrent process of species and lineage splittings for all forms of life. Traditionally, the discipline of systematics operates in a similar hierarchy of bifurcating (sometimes multifurcating) categories. The assumption of a universal tree of life hinges upon the process of evolution being tree-like throughout all forms of life and all of biological time. In multicellular eukaryotes, the molecular mechanisms and species-level population genetics of variation do indeed mainly cause a tree-like structure over time. In prokaryotes, they do not. Prokaryotic evolution and the tree of life are two different things, and we need to treat them as such, rather than extrapolating from macroscopic life to prokaryotes. In the following we will consider this circumstance from philosophical, scientific, and epistemological perspectives, surmising that phylogeny opted for a single model as a holdover from the Modern Synthesis of evolution. RESULTS It was far easier to envision and defend the concept of a universal tree of life before we had data from genomes. But the belief that prokaryotes are related by such a tree has now become stronger than the data to support it. The monistic concept of a single universal tree of life appears, in the face of genome data, increasingly obsolete. This traditional model to describe evolution is no longer the most scientifically productive position to hold, because of the plurality of evolutionary patterns and mechanisms involved. Forcing a single bifurcating scheme onto prokaryotic evolution disregards the non-tree-like nature of natural variation among prokaryotes and accounts for only a minority of observations from genomes. CONCLUSION Prokaryotic evolution and the tree of life are two different things. Hence we will briefly set out alternative models to the tree of life to study their evolution. Ultimately, the plurality of evolutionary patterns and mechanisms involved, such as the discontinuity of the process of evolution across the prokaryote-eukaryote divide, summons forth a pluralistic approach to studying evolution. REVIEWERS This article was reviewed by Ford Doolittle, John Logsdon and Nicolas Galtier.
9
Zhang Z, Townsend JP. Maximum-likelihood model averaging to profile clustering of site types across discrete linear sequences. PLoS Comput Biol 2009; 5:e1000421. [PMID: 19557160 PMCID: PMC2695770 DOI: 10.1371/journal.pcbi.1000421] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 01/12/2009] [Accepted: 05/21/2009] [Indexed: 11/19/2022]
Abstract
A major analytical challenge in computational biology is the detection and description of clusters of specified site types, such as polymorphic or substituted sites within DNA or protein sequences. Progress has been stymied by a lack of suitable methods to detect clusters and to estimate the extent of clustering in discrete linear sequences, particularly when there is no a priori specification of cluster size or cluster count. Here we derive and demonstrate a maximum likelihood method of hierarchical clustering. Our method incorporates a tripartite divide-and-conquer strategy that models sequence heterogeneity, delineates clusters, and yields a profile of the level of clustering associated with each site. The clustering model may be evaluated via model selection using the Akaike Information Criterion, the corrected Akaike Information Criterion, and the Bayesian Information Criterion. Furthermore, model averaging using weighted model likelihoods may be applied to incorporate model uncertainty into the profile of heterogeneity across sites. We evaluated our method by examining its performance on a number of simulated datasets as well as on empirical polymorphism data from diverse natural alleles of the Drosophila alcohol dehydrogenase gene. Our method yielded greater power for the detection of clustered sites across a breadth of parameter ranges, and achieved better accuracy and precision of estimation of clusters, than did the existing empirical cumulative distribution function statistics.
Affiliation(s)
- Zhang Zhang
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America
- Jeffrey P. Townsend
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America
10
Abstract
Rotifers of class Bdelloidea have evolved for millions of years apparently without sexual reproduction. We have sequenced 45- to 70-kb regions surrounding the four copies of the hsp82 gene of the bdelloid rotifer Philodina roseola, each of which is on a separate chromosome. The four regions comprise two colinear gene-rich pairs with gene content, order, and orientation conserved within each pair. Only a minority of genes are common to both pairs, also in the same orientation and order, but separated by gene-rich segments present in only one or the other pair. The pattern is consistent with degenerate tetraploidy with numerous segmental deletions, some in one pair of colinear chromosomes and some in the other. Divergence in 1,000-bp windows varies along an alignment of a colinear pair, from zero to as much as 20% in a pattern consistent with gene conversion associated with recombinational repair of DNA double-strand breaks. Although pairs of colinear chromosomes are a characteristic of sexually reproducing diploids and polyploids, a quite different explanation for their presence in bdelloids is suggested by the recent finding that bdelloid rotifers can recover and resume reproduction after suffering hundreds of radiation-induced DNA double-strand breaks per oocyte nucleus. Because bdelloid primary oocytes are in G(1) and therefore lack sister chromatids, we propose that bdelloid colinear chromosome pairs are maintained as templates for the repair of DNA double-strand breaks caused by the frequent desiccation and rehydration characteristic of bdelloid habitats.
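The windowed divergence profile mentioned in this abstract can be illustrated with a minimal sketch. It assumes two already-aligned sequences of equal length, skips columns that contain a gap, and uses a hypothetical function name (`windowed_divergence`); it is not the authors' pipeline, only an example of computing per-window divergence along a pairwise alignment.

```python
def windowed_divergence(seq_a, seq_b, window=1000):
    """Fraction of differing sites in consecutive, non-overlapping windows.

    Assumes seq_a and seq_b are rows of one pairwise alignment. Columns with
    a gap ('-') in either sequence are skipped; a window with no comparable
    columns yields None.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    profile = []
    for start in range(0, len(seq_a), window):
        pairs = [
            (a, b)
            for a, b in zip(seq_a[start:start + window], seq_b[start:start + window])
            if a != "-" and b != "-"
        ]
        if not pairs:
            profile.append(None)
            continue
        diffs = sum(1 for a, b in pairs if a != b)
        profile.append(diffs / len(pairs))
    return profile
```

Runs of windows with divergence at or near zero, flanked by windows with much higher values, are the kind of pattern the abstract describes as consistent with gene conversion.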
11
Cohen ALV, Oliver JD, DePaola A, Feil EJ, Boyd EF. Emergence of a virulent clade of Vibrio vulnificus and correlation with the presence of a 33-kilobase genomic island. Appl Environ Microbiol 2007; 73:5553-65. [PMID: 17616611 PMCID: PMC2042058 DOI: 10.1128/aem.00635-07] [Citation(s) in RCA: 70] [Impact Index Per Article: 4.1] [Indexed: 11/20/2022]
Abstract
Vibrio vulnificus is a ubiquitous inhabitant of the marine coastal environment, and an important pathogen of humans. We characterized a globally distributed sample of environmental isolates from a range of habitats and hosts and compared these with isolates recovered from cases of human infection. Multilocus sequence typing data using six housekeeping genes divided 63 of the 67 isolates into the two main lineages previously noted for this species, and this division was also confirmed using the 16S rRNA and open reading frame VV0401 markers. Lineage I was comprised exclusively of biotype 1 isolates, whereas lineage II contained biotype 1 and all biotype 2 isolates. Four isolates did not cluster within either lineage: two biotype 3 and two biotype 1 isolates. The proportion of isolates recovered from a clinical setting was noted to be higher in lineage I than in lineage II. Lineage I isolates were also associated with a 33-kb genomic island (region XII), one of three regions identified by genome comparisons as unique to the species. Region XII contained an arylsulfatase gene cluster, a sulfate reduction system, two chondroitinase genes, and an oligopeptide ABC transport system, all of which are absent from the majority of lineage II isolates. Arylsulfatases and the sulfate reduction system, along with performing a scavenging role, have been hypothesized to play a role in pathogenic processes in other bacteria. Our data suggest that lineage I may have a higher pathogenic potential and that region XII, along with other regions, may give isolates a selective advantage either in the human host or in the aquatic environment or both.
Affiliation(s)
- Ana Luisa V Cohen
- Department of Biological Sciences, University of Delaware, Newark, DE 19716, USA
12
Boni MF, Posada D, Feldman MW. An exact nonparametric method for inferring mosaic structure in sequence triplets. Genetics 2007; 176:1035-47. [PMID: 17409078 PMCID: PMC1894573 DOI: 10.1534/genetics.106.068874] [Citation(s) in RCA: 570] [Impact Index Per Article: 33.5] [Received: 11/27/2006] [Accepted: 03/18/2007] [Indexed: 11/18/2022]
Abstract
Statistical tests for detecting mosaic structure or recombination among nucleotide sequences usually rely on identifying a pattern or a signal that would be unlikely to appear under clonal reproduction. Dozens of such tests have been described, but many are hampered by long running times, confounding of selection and recombination, and/or inability to isolate the mosaic-producing event. We introduce a test that is exact, nonparametric, rapidly computable, free of the infinite-sites assumption, able to distinguish between recombination and variation in mutation/fixation rates, and able to identify the breakpoints and sequences involved in the mosaic-producing event. Our test considers three sequences at a time: two parent sequences that may have recombined, with one or two breakpoints, to form the third sequence (the child sequence). Excess similarity of the child sequence to a candidate recombinant of the parents is a sign of recombination; we take the maximum value of this excess similarity as our test statistic Delta(m,n,b). We present a method for rapidly calculating the distribution of Delta(m,n,b) and demonstrate that it has comparable power to and a much improved running time over previous methods, especially in detecting recombination in large data sets.
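One way to picture the triplet idea is a walk over informative sites, sketched below. The encoding is an assumption for illustration, not taken from the abstract: a site counts as informative when the two parents differ and the child matches exactly one of them, scoring +1 for a match to the first parent and -1 for a match to the second. A large drop from a peak of this walk suggests the child resembles one parent up to a point and the other afterwards. The function name `max_descent` is hypothetical, and the sketch does not compute the exact null distribution that the paper derives for its statistic.

```python
def max_descent(parent_p, parent_q, child):
    """Largest peak-to-later-point drop of a +1/-1 walk over informative sites.

    +1: child matches parent_p only; -1: child matches parent_q only.
    Sites where the parents agree, or the child matches neither, are skipped.
    """
    if not (len(parent_p) == len(parent_q) == len(child)):
        raise ValueError("aligned sequences must have equal length")
    height = 0  # current position of the walk
    peak = 0    # highest position reached so far
    best = 0    # largest descent seen so far
    for p, q, c in zip(parent_p, parent_q, child):
        if p == q:
            continue
        if c == p:
            height += 1
        elif c == q:
            height -= 1
        else:
            continue
        peak = max(peak, height)
        best = max(best, peak - height)
    return best
```

A child that copies the first parent for the left half of the alignment and the second parent for the right half gives a descent equal to the number of informative sites in the right half, whereas a child that alternates between parents site by site gives a descent of 1.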
Affiliation(s)
- Maciej F Boni
- Stanford Genome Technology Center, Palo Alto, California 94304, USA.
13
Hasselmann M, Beye M. Pronounced differences of recombination activity at the sex determination locus of the honeybee, a locus under strong balancing selection. Genetics 2006; 174:1469-80. [PMID: 16951061 PMCID: PMC1667079 DOI: 10.1534/genetics.106.062018] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.5] [Indexed: 11/18/2022]
Abstract
Recombination decreases the association of linked nucleotide sites and can influence levels of polymorphism in natural populations. When coupled with selection, recombination may relax potential conflict among linked genes, a concept that has played a central role in research on the evolution of recombination. The sex determination locus (SDL) of the honeybee is an informative example for exploring the combined forces of recombination, selection, and linkage on sequence evolution. Balancing selection at SDL is very strong and homozygous individuals at SDL are eliminated by worker bees. The recombination rate is increased up to four times that of the genomewide average in the region surrounding SDL. Analysis of nucleotide diversity (pi) reveals a sevenfold increase of polymorphism within the sex determination gene complementary sex determiner (csd) that rapidly declines within 45 kb to levels of genomewide estimates. Although no recombination was observed within SDL, which contains csd, analyses of heterogeneity, shared polymorphic sites, and linkage disequilibrium (LD) show that recombination has contributed to the evolution of the 5' part of some csd sequences. Gene conversion, however, has not obviously contributed to the evolution of csd sequences. The local control of recombination appears to be related to SDL function and mode of selection. The homogenizing force of recombination is reduced within SDL, which preserves allelic differences and specificity, while the increase of recombination activity around SDL relaxes conflict between SDL and linked genes.
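Nucleotide diversity (pi), the statistic reported in this abstract, is conventionally the average number of pairwise differences per site among sampled sequences. The sketch below is a minimal illustration under simplifying assumptions (equal-length aligned sequences, every column compared, no special handling of gaps or missing data); the function name `nucleotide_diversity` is hypothetical and this is not the authors' analysis code.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site across aligned sequences."""
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")
    total_diffs = 0
    for a, b in combinations(sequences, 2):
        total_diffs += sum(1 for x, y in zip(a, b) if x != y)
    n_pairs = n * (n - 1) // 2
    return total_diffs / n_pairs / length
```

Applied to windows along a region, a function like this would give the kind of profile in which diversity peaks within a gene and declines with distance from it.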
Affiliation(s)
- Martin Hasselmann
- Heinrich Heine Universität Düsseldorf, Institut für Genetik, 40225 Düsseldorf, Germany.
14
Mau B, Glasner JD, Darling AE, Perna NT. Genome-wide detection and analysis of homologous recombination among sequenced strains of Escherichia coli. Genome Biol 2006; 7:R44. [PMID: 16737554 PMCID: PMC1779527 DOI: 10.1186/gb-2006-7-5-r44] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Received: 11/01/2005] [Revised: 02/08/2006] [Accepted: 05/08/2006] [Indexed: 11/10/2022]
Abstract
BACKGROUND Comparisons of complete bacterial genomes reveal evidence of lateral transfer of DNA across otherwise clonally diverging lineages. Some lateral transfer events result in acquisition of novel genomic segments and are easily detected through genome comparison. Other more subtle lateral transfers involve homologous recombination events that result in substitution of alleles within conserved genomic regions. This type of event is observed infrequently among distantly related organisms. It is reported to be more common within species, but the frequency has been difficult to quantify since the sequences under comparison tend to have relatively few polymorphic sites. RESULTS Here we report a genome-wide assessment of homologous recombination among a collection of six complete Escherichia coli and Shigella flexneri genome sequences. We construct a whole-genome multiple alignment and identify clusters of polymorphic sites that exhibit atypical patterns of nucleotide substitution using a random walk-based method. The analysis reveals one large segment (approximately 100 kb) and 186 smaller clusters of single base pair differences that suggest lateral exchange between lineages. These clusters include portions of 10% of the 3,100 genes conserved in six genomes. Statistical analysis of the functional roles of these genes reveals that several classes of genes are over-represented, including those involved in recombination, transport and motility. CONCLUSION We demonstrate that intraspecific recombination in E. coli is much more common than previously appreciated and may show a bias for certain types of genes. The described method provides high-specificity, conservative inference of past recombination events.
Affiliation(s)
- Bob Mau
- Department of Mathematics, Lincoln Drive, University of Wisconsin, Madison WI 53706, USA
- Department of Oncology, University Ave, University of Wisconsin, Madison WI 53706, USA
- Genome Center of Wisconsin, Henry Mall, University of Wisconsin, Madison WI 53706, USA
- Jeremy D Glasner
- Genome Center of Wisconsin, Henry Mall, University of Wisconsin, Madison WI 53706, USA
- Aaron E Darling
- Department of Computer Science, W. Dayton St, University of Wisconsin, Madison WI 53706, USA
- Nicole T Perna
- Genome Center of Wisconsin, Henry Mall, University of Wisconsin, Madison WI 53706, USA
- Department of Animal Health and Biomedical Sciences, Linden Drive, University of Wisconsin, Madison WI 53706, USA
15
Hughes JF, Coffin JM. Human endogenous retroviral elements as indicators of ectopic recombination events in the primate genome. Genetics 2005; 171:1183-94. [PMID: 16157677 PMCID: PMC1456821 DOI: 10.1534/genetics.105.043976] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.6] [Indexed: 11/18/2022]
Abstract
HERV elements make up a significant fraction of the human genome and, as interspersed repetitive elements, have the capacity to provide substrates for ectopic recombination and gene conversion events. To understand the extent to which these events occur and gain further insight into the complex evolutionary history of these elements in our genome, we undertook a phylogenetic study of the long terminal repeat sequences of 15 HERV-K(HML-2) elements in various primate species. This family of human endogenous retroviruses first entered the primate genome between 35 and 45 million years ago. Throughout primate evolution, these elements have undergone bursts of amplification. From this analysis, which is the largest-scale study of HERV sequence dynamics during primate evolution to date, we were able to detect intraelement gene conversion and recombination at five HERV-K loci. We also found evidence for replacement of an ancient element by another HERV-K provirus, apparently reflecting an occurrence of retroviral integration by homologous recombination. The high frequency of these events casts doubt on the accuracy of integration time estimates based only on divergence between retroelement LTRs.
Affiliation(s)
- Jennifer F Hughes
- Department of Molecular Microbiology and Program in Genetics, Tufts University School of Medicine, Boston, Massachusetts 02111, USA
|
16
|
Eardly BD, Nour SM, van Berkum P, Selander RK. Rhizobial 16S rRNA and dnaK genes: mosaicism and the uncertain phylogenetic placement of Rhizobium galegae. Appl Environ Microbiol 2005; 71:1328-35. [PMID: 15746335 PMCID: PMC1065159 DOI: 10.1128/aem.71.3.1328-1335.2005] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.6] [Indexed: 11/20/2022]
Abstract
The phylogenetic relatedness among 12 agriculturally important species in the order Rhizobiales was estimated by comparative 16S rRNA and dnaK sequence analyses. Two groups of related species were identified by neighbor-joining and maximum-parsimony analysis. One group consisted of Mesorhizobium loti and Mesorhizobium ciceri, and the other group consisted of Agrobacterium rhizogenes, Rhizobium tropici, Rhizobium etli, and Rhizobium leguminosarum. Although bootstrap support for the placement of the remaining six species varied, A. tumefaciens, Agrobacterium rubi, and Agrobacterium vitis were consistently associated in the same subcluster. The three other species included Rhizobium galegae, Sinorhizobium meliloti, and Brucella ovis. Among these, the placement of R. galegae was the least consistent, in that it was placed flanking the A. rhizogenes-Rhizobium cluster in the dnaK nucleotide sequence trees, while it was placed with the other three Agrobacterium species in the 16S rRNA and the DnaK amino acid trees. In an effort to explain the inconsistent placement of R. galegae, we examined polymorphic site distribution patterns among the various species. Localized runs of nucleotide sequence similarity were evident between R. galegae and certain other species, suggesting that the R. galegae genes are chimeric. These results provide a tenable explanation for the weak statistical support often associated with the phylogenetic placement of R. galegae, and they also illustrate a potential pitfall in the use of partial sequences for species identification.
Affiliation(s)
- B D Eardly
- Pennsylvania State University, Berks Campus, PO Box 7009, Reading, PA 19610, USA.
|
17
|
Gilcrease EB, Winn-Stapley DA, Hewitt FC, Joss L, Casjens SR. Nucleotide sequence of the head assembly gene cluster of bacteriophage L and decoration protein characterization. J Bacteriol 2005; 187:2050-7. [PMID: 15743953 PMCID: PMC1064062 DOI: 10.1128/jb.187.6.2050-2057.2005] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.2] [Indexed: 12/21/2022]
Abstract
The temperate Salmonella enterica bacteriophage L is a close relative of the very well studied bacteriophage P22. In this study we show that the L procapsid assembly and DNA packaging genes, which encode terminase, portal, scaffold, and coat proteins, are extremely close relatives of the homologous P22 genes (96.3 to 99.1% identity in encoded amino acid sequence). However, we also identify an L gene, dec, which is not present in the P22 genome and which encodes a protein (Dec) that is present on the surface of L virions in about 150 to 180 molecules/virion. We also show that the Dec protein is a trimer in solution and that it binds to P22 virions in numbers similar to those for L virions. Its binding dramatically stabilizes P22 virions against disruption by a magnesium ion chelating agent. Dec protein binds to P22 coat protein shells that have expanded naturally in vivo or by sodium dodecyl sulfate treatment in vitro but does not bind to unexpanded procapsid shells. Finally, analysis of phage L restriction site locations and a number of patches of nucleotide sequence suggest that phages ST64T and L are extremely close relatives, perhaps the two closest relatives that have been independently isolated to date among the lambdoid phages.
Affiliation(s)
- Eddie B Gilcrease
- Division of Cell Biology and Immunology, Department of Pathology, University of Utah Medical School, Salt Lake City, UT 84132, USA
|
18
|
Qiu WG, Schutzer SE, Bruno JF, Attie O, Xu Y, Dunn JJ, Fraser CM, Casjens SR, Luft BJ. Genetic exchange and plasmid transfers in Borrelia burgdorferi sensu stricto revealed by three-way genome comparisons and multilocus sequence typing. Proc Natl Acad Sci U S A 2004; 101:14150-5. [PMID: 15375210 PMCID: PMC521097 DOI: 10.1073/pnas.0402745101] [Citation(s) in RCA: 103] [Impact Index Per Article: 5.2] [Indexed: 11/18/2022]
Abstract
Comparative genomics of closely related bacterial isolates is a powerful method for uncovering virulence and other important genome elements. We determined draft sequences (8-fold coverage) of the genomes of strains JD1 and N40 of Borrelia burgdorferi sensu stricto, the causative agent of Lyme disease, and we compared the predicted genes from the two genomes with those from the previously sequenced B31 genome. The three genomes are closely related and are evolutionarily approximately equidistant (approximately 0.5% pairwise nucleotide differences on the main chromosome). We used a Poisson model of nucleotide substitution to screen for genes with elevated levels of nucleotide polymorphisms. The three-way genome comparison allowed distinction between polymorphisms introduced by mutations and those introduced by recombination using the method of phylogenetic partitioning. Tests for recombination suggested that patches of high-density nucleotide polymorphisms on the chromosome and plasmids arise by DNA exchange. The role of recombination as the main mechanism driving B. burgdorferi diversification was confirmed by multilocus sequence typing of 18 clinical isolates at 18 polymorphic loci. A strong linkage between the multilocus sequence genotypes and the major alleles of outer-surface protein C (ospC) suggested that balancing selection at ospC is a dominant force maintaining B. burgdorferi diversity in local populations. We conclude that B. burgdorferi undergoes genome-wide genetic exchange, including plasmid transfers, and previous reports of its clonality are artifacts from the use of geographically and ecologically isolated samples. Frequent recombination implies a potential for rapid adaptive evolution and a possible polygenic basis of B. burgdorferi pathogenicity.
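As an illustration of the kind of Poisson screen mentioned in the abstract above, the following minimal Python sketch flags a gene whose count of pairwise nucleotide differences is unexpectedly high. It is not the authors' procedure, which the abstract does not specify. The function names `poisson_upper_tail` and `screen_gene`, the gene length, and the difference counts are hypothetical; the default rate of 0.005 differences per site is taken from the roughly 0.5% chromosomal divergence quoted in the abstract.

```python
import math


def poisson_upper_tail(k, mean):
    """Return P(X >= k) for X ~ Poisson(mean)."""
    if k <= 0:
        return 1.0
    # Sum the lower tail P(X <= k - 1) and take the complement.
    lower = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - lower)


def screen_gene(length_bp, observed_diffs, rate=0.005):
    """Upper-tail probability of seeing at least `observed_diffs` differences
    in a gene of `length_bp` sites, assuming differences are Poisson
    distributed with an expected count of length_bp * rate (an assumption
    of this sketch)."""
    expected = length_bp * rate
    return poisson_upper_tail(observed_diffs, expected)


# Hypothetical 1,000-bp genes: one near the expected 5 differences, one with 20.
p_typical = screen_gene(1000, 5)
p_outlier = screen_gene(1000, 20)
print(f"typical gene: P = {p_typical:.3f}")
print(f"outlier gene: P = {p_outlier:.2e}")
```

A small upper-tail probability only marks a gene as a candidate for closer inspection. It does not by itself separate mutation from recombination, which the abstract attributes to a separate phylogenetic-partitioning step.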
Affiliation(s)
- Wei-Gang Qiu
- Department of Biological Sciences, Hunter College of the City University of New York, 695 Park Avenue, New York, NY 10021, USA.
|
19
|
Wang L, Rothemund D, Curd H, Reeves PR. Species-wide variation in the Escherichia coli flagellin (H-antigen) gene. J Bacteriol 2003; 185:2936-43. [PMID: 12700273 PMCID: PMC154406 DOI: 10.1128/jb.185.9.2936-2943.2003] [Citation(s) in RCA: 106] [Impact Index Per Article: 5.0] [Indexed: 11/20/2022]
Abstract
Escherichia coli is a clonal species. The best-understood components of its clonal variation are the flagellar (H) and polysaccharide (O) antigens, both well documented since the mid-1930s because of their use in serotyping. Flagellin is the protein subunit of the flagellum that carries H-antigen specificity. We show that 43 of the 54 H-antigen specificities of E. coli map to the flagellin gene at fliC and sequenced all 43 forms and confirmed specificity of each by cloning and expression. This is, to our knowledge, the first time that all known forms of such a highly polymorphic gene have been fully sequenced and characterized for any species. The established distinction between a highly variable central region and more conserved flanking regions is upheld. The sequences fall into two groups, one of which may be derived from the fliC gene of the E. coli/Salmonella enterica common ancestor, the other perhaps obtained by lateral transfer since species divergence. Comparison of sequences revealed that both horizontal DNA transfer and fixation of mutations under diversifying selection pressure contributed to polymorphism in this locus.
Affiliation(s)
- Lei Wang
- School of Molecular and Microbial Biosciences (GO8), The University of Sydney, Sydney, NSW 2006, Australia
|
20
|
Posada D, Crandall KA. Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proc Natl Acad Sci U S A 2001; 98:13757-62. [PMID: 11717435 PMCID: PMC61114 DOI: 10.1073/pnas.241370698] [Citation(s) in RCA: 1074] [Impact Index Per Article: 46.7] [Received: 07/18/2001] [Indexed: 11/18/2022]
Abstract
Recombination is a key evolutionary process that shapes the architecture of genomes and the genetic structure of populations. Although many statistical methods are available for the detection of recombination from DNA sequences, their absolute and relative performance is still unknown. Here we evaluated the performance of 14 different recombination detection algorithms. We used the coalescent with recombination to simulate DNA sequences with different levels of recombination, genetic diversity, and rate variation among sites. Recombination detection methods were applied to these data sets, and whether they detected or not recombination was recorded. Different recombination methods showed distinct performance depending on the amount of recombination, genetic diversity, and rate variation among sites. The model of nucleotide substitution under which the data were generated did not seem to have a significant effect. Most methods increase power with more sequence divergence. In general, recombination detection methods seem to capture the presence of recombination, but they are not very powerful. Methods that use substitution patterns or incompatibility among sites were more powerful than methods based on phylogenetic incongruence. Most methods do not seem to infer more false positives than expected by chance. Especially depending on the amount of diversity in the data, different methods could be used to attain maximum power while minimizing false positives. Results shown here will provide some guidance in the selection of the most appropriate method/s for the analysis of the particular data at hand.
Affiliation(s)
- D Posada
- Department of Zoology, Brigham Young University, Provo, UT 84602, USA.
|
21
|
Jiang SM, Wang L, Reeves PR. Molecular characterization of Streptococcus pneumoniae type 4, 6B, 8, and 18C capsular polysaccharide gene clusters. Infect Immun 2001; 69:1244-55. [PMID: 11179285 PMCID: PMC98014 DOI: 10.1128/iai.69.3.1244-1255.2001] [Citation(s) in RCA: 95] [Impact Index Per Article: 4.1] [Indexed: 11/20/2022]
Abstract
Capsular polysaccharide (CPS) is a major virulence factor in Streptococcus pneumoniae. CPS gene clusters of S. pneumoniae types 4, 6B, 8, and 18C were sequenced and compared with those of CPS types 1, 2, 14, 19F, 19A, 23F, and 33F. All have the same four genes at the 5' end, encoding proteins thought to be involved in regulation and export. Sequences of these genes can be divided into two classes, and evidence of recombination between them was observed. Next is the gene encoding the transferase for the first step in the synthesis of CPS. The predicted amino acid sequences of these first sugar transferases have multiple transmembrane segments, a feature lacking in other transferases. Sugar pathway genes are located at the 3' end of the gene cluster. Comparison of the four dTDP-L-rhamnose pathway genes (rml genes) of CPS types 1, 2, 6B, 18C, 19F, 19A, and 23F shows that they have the same gene order and are highly conserved. There is a gradient in the nature of the variation of rml genes, the average pairwise difference for those close to the central region being higher than that for those close to the end of the gene cluster and, again, recombination sites can be observed in these genes. This is similar to the situation we observed for rml genes of O-antigen gene clusters of Salmonella enterica. Our data indicate that the conserved first four genes at the 5' ends and the relatively conserved rml genes at the 3' ends of the CPS gene clusters were sites for recombination events involved in forming new forms of CPS. We have also identified wzx and wzy genes for all sequenced CPS gene clusters by use of motifs.
Affiliation(s)
- S M Jiang
- Department of Microbiology, The University of Sydney, Sydney, New South Wales 2006, Australia
|
22
|
Heterogeneous geographic patterns of nucleotide sequence diversity between two alcohol dehydrogenase genes in wild barley (Hordeum vulgare subspecies spontaneum). Proc Natl Acad Sci U S A 2001. [PMID: 11149938 PMCID: PMC14621 DOI: 10.1073/pnas.011537898] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022]
Abstract
Patterns of nucleotide sequence diversity in the predominantly self-fertilizing species Hordeum vulgare subspecies spontaneum (wild barley) are compared between the putative alcohol dehydrogenase 3 locus (denoted "adh3") and alcohol dehydrogenase 1 (adh1), two related but unlinked loci. The data consist of a sequence sample of 1,873 bp of "adh3" drawn from 25 accessions that span the species range. There were 104 polymorphic sites in the sequenced region of "adh3." The data reveal a strong geographic pattern of diversity at "adh3" despite geographic uniformity at adh1. Moreover, levels of nucleotide sequence diversity differ by nearly an order of magnitude between the two loci. Genealogical analysis resolved two distinct clusters of "adh3" alleles (dimorphic sequence types) that coalesce roughly 3 million years ago. One type consists of accessions from the Middle East, and the other consists of accessions predominantly from the Near East. The two "adh3" sequence types are characterized by a high level of differentiation between clusters (approximately 2.2%), which induces an overall excess of intermediate frequency variants in the pooled sample. Finally, there is evidence of intralocus recombination in the "adh3" data, despite the high level of self-fertilization characteristic of wild barley.
|
23
|
Anderson JP, Rodrigo AG, Learn GH, Madan A, Delahunty C, Coon M, Girard M, Osmanov S, Hood L, Mullins JI. Testing the hypothesis of a recombinant origin of human immunodeficiency virus type 1 subtype E. J Virol 2000; 74:10752-65. [PMID: 11044120 PMCID: PMC110950 DOI: 10.1128/jvi.74.22.10752-10765.2000] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.2] [Indexed: 11/20/2022]
Abstract
The human immunodeficiency virus type 1 (HIV-1) epidemic in Southeast Asia has been largely due to the emergence of clade E (HIV-1E). It has been suggested that HIV-1E is derived from a recombinant lineage of subtype A (HIV-1A) and subtype E, with multiple breakpoints along the E genome. We obtained complete genome sequences of clade E viruses from Thailand (93TH057 and 93TH065) and from the Central African Republic (90CF11697 and 90CF4071), increasing the total number of HIV-1E complete genome sequences available to seven. Phylogenetic analysis of complete genomes showed that subtypes A and E are themselves monophyletic, although together they also form a larger monophyletic group. The apparent phylogenetic incongruence at different regions of the genome that was previously taken as evidence of recombination is shown to be not statistically significant. Furthermore, simulations indicate that bootscanning and pairwise distance results, previously used as evidence for recombination, can be misleading, particularly when there are differences in substitution or evolutionary rates across the genomes of different subtypes. Taken jointly, our analyses suggest that there is inadequate support for the hypothesis that subtype E variants are derived from a recombinant lineage. In contrast, many other HIV strains claimed to have a recombinant origin, including viruses for which only a single parental strain was employed for analysis, do indeed satisfy the statistical criteria we propose. Thus, while intersubtype recombinant HIV strains are indeed circulating, the criteria for assigning a recombinant origin to viral structures should include statistical testing of alternative hypotheses to avoid inappropriate assignments that would obscure the true evolutionary properties of these viruses.
Affiliation(s)
- J P Anderson
- Departments of Molecular Biotechnology, Health Sciences Center, University of Washington, Seattle, Washington 98195, USA.
|
24
|
Porcella SF, Fitzpatrick CA, Bono JL. Expression and immunological analysis of the plasmid-borne mlp genes of Borrelia burgdorferi strain B31. Infect Immun 2000; 68:4992-5001. [PMID: 10948116 PMCID: PMC101720 DOI: 10.1128/iai.68.9.4992-5001.2000] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.5] [Indexed: 11/20/2022]
Abstract
A lipoprotein gene family first identified in Borrelia burgdorferi strain 297, designated 2.9 LP and recently renamed mlp, was found on circular and linear plasmids in the genome sequence of B. burgdorferi strain B31-M1. Sequence analyses of the B31 mlp genes and physically linked variant gene families indicated that mlp gene heterogeneity is unique and unrelated to location or linkage to divergent sequences. Evidence of recombination between B31 mlp alleles was also detected. Northern blot analysis of cultured strain B31 indicated that the mlp genes were not expressed at a temperature (23°C) characteristic of that of ticks in the environment. In striking contrast, expression of many mlp genes increased substantially when strain B31 was shifted to 35°C, a temperature change mimicking that occurring in the natural transmission cycle of the spirochete from tick to mammal. Primer extension analysis of the mlp mRNA transcripts suggested that sigma 70-like promoters are involved in mlp expression during temperature shift conditions. Antibodies were made against strain B31 Mlp proteins within the first 4 weeks after experimental mouse infection. Importantly, Lyme disease patients also had serum antibodies reactive with purified recombinant Mlp proteins from strain B31, a result indicating that humans are exposed to Mlp proteins during infection. Taken together, the data indicate that strain B31 mlp genes encode a diverse array of lipoproteins which may participate in early infection processes in the mammalian host.
Affiliation(s)
- S F Porcella
- Laboratory of Human Bacterial Pathogenesis, National Institute of Allergy and Infectious Disease, National Institutes of Health, Rocky Mountain Laboratories, Hamilton, Montana 59840, USA.
|
25
|
Kiewitz C, Tümmler B. Sequence diversity of Pseudomonas aeruginosa: impact on population structure and genome evolution. J Bacteriol 2000; 182:3125-35. [PMID: 10809691 PMCID: PMC94498 DOI: 10.1128/jb.182.11.3125-3135.2000] [Citation(s) in RCA: 126] [Impact Index Per Article: 5.3] [Indexed: 11/20/2022]
Abstract
Comparative sequencing of Pseudomonas aeruginosa genes oriC, citS, ampC, oprI, fliC, and pilA in 19 environmental and clinical isolates revealed the sequence diversity to be about 1 order of magnitude lower than in comparable housekeeping genes of Salmonella. In contrast to the low nucleotide substitution rate, the frequency of recombination among different P. aeruginosa genotypes was high, leading to the random association of alleles. The P. aeruginosa population consists of equivalent genotypes that form a net-like population structure. However, each genotype represents a cluster of closely related strains which retain their sequence signature in the conserved gene pool and carry a set of genotype-specific DNA blocks. The codon adaptation index, a quantitative measure of synonymous codon bias of genes, was found to be consistently high in the P. aeruginosa genome irrespective of the metabolic category and the abundance of the encoded gene product. Such uniformly high codon adaptation indices of 0.55 to 0.85 fit the ubiquitous lifestyle of P. aeruginosa.
Affiliation(s)
- C Kiewitz
- Klinische Forschergruppe, Medizinische Hochschule Hannover, D-30623 Hannover, Germany.
|
26
|
Abstract
Diverse African and non-African samples of the X-linked PDHA1 (pyruvate dehydrogenase E1 alpha subunit) locus revealed a fixed DNA sequence difference between the two sample groups. The age of onset of population subdivision appears to be about 200 thousand years ago. This predates the earliest modern human fossils, suggesting the transformation to modern humans occurred in a subdivided population. The base of the PDHA1 gene tree is relatively ancient, with an estimated age of 1.86 million years, a late Pliocene time associated with early species of Homo. PDHA1 revealed very low variation among non-Africans, but in other respects the data are consistent with reports from other X-linked and autosomal haplotype data sets. Like these other genes, but in conflict with microsatellite and mitochondrial data, PDHA1 does not show evidence of human population expansion.
Affiliation(s)
- E E Harris
- Department of Genetics, Rutgers University, Nelson Biological Labs, 604 Allison Road, Piscataway, NJ 08854-8082, USA
|
27
|
Clark AG, Weiss KM, Nickerson DA, Taylor SL, Buchanan A, Stengård J, Salomaa V, Vartiainen E, Perola M, Boerwinkle E, Sing CF. Haplotype structure and population genetic inferences from nucleotide-sequence variation in human lipoprotein lipase. Am J Hum Genet 1998; 63:595-612. [PMID: 9683608 PMCID: PMC1377318 DOI: 10.1086/301977] [Citation(s) in RCA: 326] [Impact Index Per Article: 12.5] [Indexed: 11/04/2022]
Abstract
Allelic variation in 9.7 kb of genomic DNA sequence from the human lipoprotein lipase gene (LPL) was scored in 71 healthy individuals (142 chromosomes) from three populations: African Americans (24) from Jackson, MS; Finns (24) from North Karelia, Finland; and non-Hispanic Whites (23) from Rochester, MN. The sequences had a total of 88 variable sites, with a nucleotide diversity (site-specific heterozygosity) of 0.002 ± 0.001 across this 9.7-kb region. The frequency spectrum of nucleotide variation exhibited a slight excess of heterozygosity, but, in general, the data fit expectations of the infinite-sites model of mutation and genetic drift. Allele-specific PCR helped resolve linkage phases, and a total of 88 distinct haplotypes were identified. For 1,410 (64%) of the 2,211 site pairs, all four possible gametes were present in these haplotypes, reflecting a rich history of past recombination. Despite the strong evidence for recombination, extensive linkage disequilibrium was observed. The number of haplotypes generally is much greater than the number expected under the infinite-sites model, but there was sufficient multisite linkage disequilibrium to reveal two major clades, which appear to be very old. Variation in this region of LPL may depart from the variation expected under a simple, neutral model, owing to complex historical patterns of population founding, drift, selection, and recombination. These data suggest that the design and interpretation of disease-association studies may not be as straightforward as often is assumed.
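The four-gamete count reported in the abstract above can be illustrated with a minimal Python sketch. It assumes phased haplotypes with biallelic sites coded 0/1; the function names `has_four_gametes` and `four_gamete_counts` and the toy data are hypothetical and are not taken from the study.

```python
from itertools import combinations


def has_four_gametes(haplotypes, i, j):
    """True if all four allele combinations occur at biallelic sites i and j."""
    gametes = {(h[i], h[j]) for h in haplotypes}
    return len(gametes) == 4


def four_gamete_counts(haplotypes):
    """Return (pairs showing all four gametes, total number of site pairs)."""
    n_sites = len(haplotypes[0])
    pairs = list(combinations(range(n_sites), 2))
    hits = sum(has_four_gametes(haplotypes, i, j) for i, j in pairs)
    return hits, len(pairs)


# Toy phased haplotypes (hypothetical): rows are chromosomes, columns are sites.
toy = [
    (0, 0, 0),
    (0, 1, 0),
    (1, 0, 1),
    (1, 1, 1),
]
hits, total = four_gamete_counts(toy)
print(f"{hits} of {total} site pairs show all four gametes")
```

Under the infinite-sites model without recombination, a pair of biallelic sites can show at most three of the four gametes, so each pair that shows all four points to at least one past recombination event or a recurrent mutation.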
Affiliation(s)
- A G Clark
- Institute of Molecular Evolutionary Genetics, Department of Biology, Pennsylvania State University, University Park, PA 16802, USA.
|
28
|
Wang FS, Whittam TS, Selander RK. Evolutionary genetics of the isocitrate dehydrogenase gene (icd) in Escherichia coli and Salmonella enterica. J Bacteriol 1997; 179:6551-9. [PMID: 9352899 PMCID: PMC179578 DOI: 10.1128/jb.179.21.6551-6559.1997] [Citation(s) in RCA: 45] [Impact Index Per Article: 1.7] [Indexed: 02/05/2023]
Abstract
Sequences of the icd gene, encoding isocitrate dehydrogenase (IDH), were obtained for 33 strains representing the major phylogenetic lineages of Escherichia coli and Salmonella enterica. Evolutionary relationships of the strains based on variation in icd are generally similar to those previously obtained for several other housekeeping and for invasion genes, but the sequences of S. enterica subspecies V strains are unusual in being almost intermediate between those of the other S. enterica subspecies and E. coli. For S. enterica, the ratio of synonymous (silent) to nonsynonymous (replacement) nucleotide substitutions between pairs of strains was larger than comparable values for 12 other housekeeping and invasion genes, reflecting unusually strong purifying selection against amino acid replacement in the IDH enzyme. All amino acids involved in the catalytic activity and conformational changes of IDH are strictly conserved within and between species. In E. coli, the level of variation at the 3' end of the gene is elevated by the presence in some strains of a 165-bp replacement sequence supplied by the integration of either lambdoid phage 21 or defective prophage element e14. The 72 members of the E. coli Reference Collection (ECOR) and five additional E. coli strains were surveyed for the presence of phage 21 (as prophage) by PCR amplification of a phage 21-specific fragment in and adjacent to the host icd, and the sequence of the phage 21 segment extending from the 3' end of icd through the integrase gene (int) was determined in nine strains of E. coli. Phage 21 was found in 39% of E. coli strains, and its distribution among the ECOR strains is nonrandom. In two ECOR strains, the phage 21 int gene is interrupted by a 1,313-bp insertion element that has 99.3% nucleotide sequence identity with IS3411 of E. coli. 
The phylogenetic relationships of phage 21 strains derived from sequences of two different genomic regions were strongly incongruent, providing evidence of frequent recombination.
Affiliation(s)
- F S Wang
- Institute of Molecular Evolutionary Genetics, Mueller Laboratory, Pennsylvania State University, University Park 16802, USA.
|
29
|
Boyd EF, Li J, Ochman H, Selander RK. Comparative genetics of the inv-spa invasion gene complex of Salmonella enterica. J Bacteriol 1997; 179:1985-91. [PMID: 9068645 PMCID: PMC178923 DOI: 10.1128/jb.179.6.1985-1991.1997] [Citation(s) in RCA: 60] [Impact Index Per Article: 2.2] [Indexed: 02/03/2023]
Abstract
The chromosomal region containing the Salmonella enterica pathogenic island inv-spa was present in the last common ancestor of all the contemporary lineages of salmonellae. For multiple strains of S. enterica, representing all eight subspecies, nucleotide sequences were obtained for five genes of the inv-spa invasion complex, invH, invE, invA, spaM, and spaN, all of which encode proteins that are required for entry of the bacteria into cultured epithelial cells. The invE, invA, spaM, and spaN genes were present in all eight subspecies of S. enterica, and for invE and invA and their products, levels of sequence variation among strains were within the ranges reported for housekeeping genes. In contrast, the InvH, SpaM, and SpaN proteins were unusually variable in amino acid sequence. Furthermore, invH was absent from the subspecies V isolates examined. The SpaM and SpaN proteins provide further evidence of a relationship (first detected by Li et al. [J. Li, H. Ochman, E. A. Groisman, E. F. Boyd, F. Solomon, K. Nelson, and R. K. Selander, Proc. Natl. Acad. Sci. USA 92:7252-7256, 1995]) between the cellular location of the products of the inv-spa genes and evolutionary rate, as reflected in the level of polymorphism within S. enterica. Invasion proteins that are membrane bound or membrane associated are relatively conserved in amino acid sequence, whereas those that are exported to the extracellular environment are hypervariable, possibly reflecting the action of diversifying selection.
Affiliation(s)
- E F Boyd
- Institute of Molecular Evolutionary Genetics, Pennsylvania State University, University Park 16802, USA
|
30
|
Abstract
A phylogenetic analysis of mammalian type I interferon (IFN) genes showed: (1) that the three main subfamilies of these genes in mammals (IFN-beta, IFN-alpha, and IFN-omega) diverged after the divergence of birds and mammals but before radiation of the eutherian orders and (2) that IFN-beta diverged first. Although apparent cases of interlocus recombination among mouse IFN-alpha genes were identified, the hypothesis that coding regions of IFN-alpha genes have been homogenized within species by interlocus recombination was not supported. Flanking regions as well as coding regions of IFN-alpha were more similar within human and mouse than between these species; and reconstruction of the pattern of nucleotide substitution in IFN-alpha coding regions of four mammalian species by the maximum parsimony method suggested that parallel substitutions have occurred far more frequently between species than within species. Therefore, it seems likely that IFN-alpha genes have duplicated independently within different eutherian orders. In general, type I IFN genes are subject to purifying selection, which in the case of IFN-alpha and IFN-beta is strongest in the putative receptor-binding domains. However, analysis of the pattern of nucleotide substitution among IFN-omega genes suggested that positive Darwinian selection may have acted in some cases to diversify members of this subfamily at the amino acid level.
Affiliation(s)
- A L Hughes
- Department of Biology, Pennsylvania State University, University Park 16802, USA
|
31
|
Karaolis DK, Lan R, Reeves PR. The sixth and seventh cholera pandemics are due to independent clones separately derived from environmental, nontoxigenic, non-O1 Vibrio cholerae. J Bacteriol 1995; 177:3191-8. [PMID: 7768818 PMCID: PMC177010 DOI: 10.1128/jb.177.11.3191-3198.1995] [Citation(s) in RCA: 134] [Impact Index Per Article: 4.6] [Indexed: 01/27/2023]
Abstract
The DNA sequences of the asd genes from 45 isolates of Vibrio cholerae (19 clinical O1 isolates, 2 environmental nontoxigenic O1 isolates, and 24 isolates with different non-O1 antigens) were determined. No differences were found within either sixth- or seventh-pandemic isolates; however, variation was found between the two forms and among the non-O1 isolates. O139 isolates had sequences identical to those of seventh-pandemic isolates. Phylogenetic trees with Vibrio mimicus as the outgroup suggest that the sixth-pandemic, seventh-pandemic, and U.S. Gulf isolates are three clones that have evolved independently from different lineages of environmental, nontoxigenic, non-O1 V. cholerae isolates. There is evidence for horizontal transfer of O antigen, since isolates with nearly identical asd sequences had different O antigens, and isolates with the O1 antigen did not cluster together but were found in different lineages. We also found evidence for recombination events within the asd gene of V. cholerae. V. cholerae may have a higher level of genetic exchange and a lower level of clonality than species such as Salmonella enterica and Escherichia coli.
Affiliation(s)
- D K Karaolis
- Department of Microbiology (GO8), University of Sydney, New South Wales, Australia
32
Nelson K, Selander RK. Intergeneric transfer and recombination of the 6-phosphogluconate dehydrogenase gene (gnd) in enteric bacteria. Proc Natl Acad Sci U S A 1994; 91:10227-31. [PMID: 7937867 PMCID: PMC44991 DOI: 10.1073/pnas.91.21.10227] [Citation(s) in RCA: 72] [Impact Index Per Article: 2.4] [Indexed: 01/28/2023] Open
Abstract
The gnd gene, encoding 6-phosphogluconate dehydrogenase (EC 1.1.1.44), was sequenced in 87 strains of 15 species assigned to five nominal genera of the Enterobacteriaceae, including 36 isolates of Salmonella enterica and 32 strains of Escherichia coli. In S. enterica, the effective (realized) rate of recombination of horizontally transferred gnd sequences is only moderately higher than the rates for other chromosomal housekeeping genes. In contrast, recombination at gnd has occurred with such high frequency in Escherichia coli that the indicated evolutionary relationships among strains are not congruent with those estimated by sequence analysis of other genes and by multilocus enzyme electrophoresis. E. coli and S. enterica apparently have not exchanged gnd sequences, but those of several strains of E. coli have been imported from species of Citrobacter and Klebsiella. The relatively frequent exchange of gnd within and among taxonomic groups of the Enterobacteriaceae, compared with other housekeeping genes, apparently results from its close linkage with genes that are subject to diversifying selection, including those of the rfb region determining the structure of the O antigen polysaccharide.
Affiliation(s)
- K Nelson
- Institute of Molecular Evolutionary Genetics, Pennsylvania State University, University Park 16802
33
Li J, Nelson K, McWhorter AC, Whittam TS, Selander RK. Recombinational basis of serovar diversity in Salmonella enterica. Proc Natl Acad Sci U S A 1994; 91:2552-6. [PMID: 8146152 PMCID: PMC43407 DOI: 10.1073/pnas.91.7.2552] [Citation(s) in RCA: 75] [Impact Index Per Article: 2.5] [Indexed: 01/29/2023] Open
Abstract
The fliC gene, which encodes phase 1 flagellin, was sequenced in strains of 15 Salmonella enterica serovars expressing flagellar antigenic factors of the g series. The occurrence of each of the flagellin serotypes g,m, m,t, and g,z51 in distantly related strains is the result of horizontal exchange of DNA, as indicated by identity or close similarity in nucleotide sequence of all or parts of the antigenic factor-determining central region of fliC. The flagellin genes of some serovars are complex mosaic structures composed of diverse segments derived through multiple recombination events. Thus, recombination of horizontally transferred segments (intragenic) or entire genes (assortative) within and among subspecies is identified as a major evolutionary mechanism generating both allelic variation at the fliC locus and serovar diversity in natural populations. Evidence that flagellar serological diversity is promoted by diversifying selection in adaptation to host immune defense system or flagellotropic phage is discussed.
Affiliation(s)
- J Li
- Institute of Molecular Evolutionary Genetics, Pennsylvania State University, University Park 16802
34
Boyd EF, Nelson K, Wang FS, Whittam TS, Selander RK. Molecular genetic basis of allelic polymorphism in malate dehydrogenase (mdh) in natural populations of Escherichia coli and Salmonella enterica. Proc Natl Acad Sci U S A 1994; 91:1280-4. [PMID: 8108402 PMCID: PMC43141 DOI: 10.1073/pnas.91.4.1280] [Citation(s) in RCA: 110] [Impact Index Per Article: 3.7] [Indexed: 01/28/2023] Open
Abstract
Nucleotide sequences of the mdh gene encoding the metabolic enzyme malate dehydrogenase (MDH) were determined for 44 strains representing the major lineages of Escherichia coli and the eight subspecies of Salmonella enterica. Sequence diversity was four times greater in S. enterica than in E. coli, and in both species the rate of amino acid substitution was lower in the NAD(+)-binding domain than in the catalytic domain. Divergence of the mdh genes of the two species apparently has not involved excess nonsynonymous substitutions resulting from the fixation of adaptive amino acid mutations. Allozyme analysis detected 57% of the distinctive amino acid sequences. Statistical tests of the distribution of polymorphic synonymous nucleotide sites identified four possible intragenic recombination events, one involving a single allele of E. coli and three involving alleles of the three subspecies of S. enterica. But recombination at mdh has not occurred with sufficient frequency to obscure the phylogenetic relationships among strains indicated by multilocus enzyme electrophoresis, total DNA hybridization, and sequence analysis of the gapA and putP genes. These findings provide further evidence that the effective (realized) rates of horizontal transfer and recombination for metabolic enzyme and other housekeeping genes are generally low in these species, in contrast to those for loci encoding or mediating the structure of cell-surface and other macromolecules for which recombinants may be subject to strong balancing, directional, or diversifying selection.
Affiliation(s)
- E F Boyd
- Institute of Molecular Evolutionary Genetics, Pennsylvania State University, University Park 16802
35
Takahata N. Comments on the detection of reciprocal recombination or gene conversion. Immunogenetics 1994; 39:146-9. [PMID: 8276458 DOI: 10.1007/bf00188618] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.7] [Indexed: 01/29/2023]
Affiliation(s)
- N Takahata
- Department of Genetics, Graduate University for Advanced Studies, Mishima, Japan
36
Dykhuizen DE, Polin DS, Dunn JJ, Wilske B, Preac-Mursic V, Dattwyler RJ, Luft BJ. Borrelia burgdorferi is clonal: implications for taxonomy and vaccine development. Proc Natl Acad Sci U S A 1993; 90:10163-7. [PMID: 8234271 PMCID: PMC47734 DOI: 10.1073/pnas.90.21.10163] [Citation(s) in RCA: 117] [Impact Index Per Article: 3.8] [Indexed: 01/29/2023] Open
Abstract
The chromosomal genes fla and p93 and the ospA gene from a linear plasmid were sequenced from up to 15 isolates of Borrelia burgdorferi, which causes Lyme borreliosis in man. Comparison of the gene trees provides no evidence for genetic exchange between chromosomal genes, suggesting B. burgdorferi is strictly clonal. Comparison of the chromosomal gene trees with that of the plasmid-encoded ospA reveals that plasmid transfer between clones is rare. Evidence for intragenic recombination was found in only a single ospA allele. The analysis reveals three common clones and a number of rare clones that are so highly divergent that vaccines developed against one are unlikely to provide immunity to organisms from others. Consequently, an understanding of the geographic and genetic variability of B. burgdorferi will prove essential for the development of effective vaccines and programs for control. While the major clones might be regarded as different species, the clonal population structure, the geographic localization, and the widespread incidence of Lyme disease suggest that B. burgdorferi should remain the name for the entire array of organisms.
Affiliation(s)
- D E Dykhuizen
- Department of Ecology and Evolution, State University of New York, Stony Brook 11794
37
Nelson K, Selander RK. Evolutionary genetics of the proline permease gene (putP) and the control region of the proline utilization operon in populations of Salmonella and Escherichia coli. J Bacteriol 1992; 174:6886-95. [PMID: 1400239 PMCID: PMC207367 DOI: 10.1128/jb.174.21.6886-6895.1992] [Citation(s) in RCA: 81] [Impact Index Per Article: 2.5] [Indexed: 12/26/2022] Open
Abstract
Virtually complete sequences (1,467 bp) of the proline permease gene (putP) and complete sequences (416 to 422 bp) of the control region of the proline utilization operon were determined for 16 strains of Salmonella, representing all eight subspecies, and 13 strains of Escherichia coli recovered from natural populations. Strains of Salmonella and E. coli differed, on average, at 16.3% of putP nucleotide sites and 17.5% of control region sites; the average difference between strains was much larger for Salmonella strains (4.6% of putP sites and 3.4% of control region sites) than for E. coli (2.4 and 0.9%, respectively). There was no difference in the distribution of polymorphic amino acid positions between the membrane-spanning and loop regions of the permease molecule, and rates of synonymous nucleotide substitution were virtually the same for the two domains. Statistical analysis yielded evidence of three probable cases of intragenic recombination, including the acquisition of a large segment of putP by strains of Salmonella subspecies VII from an unidentified source, the exchange of a 21-bp segment between two strains of E. coli, and the acquisition by one strain of E. coli of a cluster of 14 unique polymorphic control region sites from an unknown donor. An evolutionary tree for the putP and control region sequences was generally concordant with a tree for the gapA gene and a tree based on multilocus enzyme electrophoresis, thus providing evidence that for neither gene nor for enzyme genes in general has recombination occurred at rates sufficiently high or over regions sufficiently large to completely obscure phylogenetic relationships dependent on mutational divergence. It is suggested that the recombination rate varies among genes in relation to functional type, being highest for genes encoding cell surface and other proteins for which there is an adaptive advantage in structural diversity.
Affiliation(s)
- K Nelson
- Institute of Molecular Evolutionary Genetics, Mueller Laboratory, Pennsylvania State University, University Park 16802
38
Sharp PM, Kelleher JE, Daniel AS, Cowan GM, Murray NE. Roles of selection and recombination in the evolution of type I restriction-modification systems in enterobacteria. Proc Natl Acad Sci U S A 1992; 89:9836-40. [PMID: 1409708 PMCID: PMC50228 DOI: 10.1073/pnas.89.20.9836] [Citation(s) in RCA: 43] [Impact Index Per Article: 1.3] [Indexed: 12/26/2022] Open
Abstract
Restriction-modification systems can protect bacteria against viral infection. Sequences of the hsdM gene, encoding one of the three subunits of type I restriction-modification systems, have been determined for four strains of enterobacteria. Comparison with the known sequences of EcoK and EcoR124 indicates that all are homologous, though they fall into three families (exemplified by EcoK, EcoA, and EcoR124), the first two of which are apparently allelic. The extent of amino acid sequence identity between EcoK and EcoA is so low that the genes encoding them might be better termed pseudoalleles; this almost certainly reflects genetic exchange among highly divergent species. Within the EcoK family the ratio of intra- to interspecific divergence is very high. The extent of divergence between the genes from Escherichia coli K-12 and Salmonella typhimurium LT2 is similar to that for other genes with the same level of codon usage bias. In contrast, intraspecific divergence (between E. coli strains B and K-12) is extremely high and may reflect the action of frequency-dependent selection mediated by bacteriophages. There is also evidence of lateral transfer of a short sequence between E. coli and S. typhimurium.
Affiliation(s)
- P M Sharp
- Department of Genetics, Trinity College, Dublin, Ireland
39
Abstract
Some genes in prokaryotes consist of a mosaic of regions derived from different ancestors by horizontal gene transfer. A method is described for demonstrating the statistical significance of such mosaic structure and for locating the crossover points separating different regions.
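One way to make the idea of locating a crossover point concrete is a maximum chi-squared scan over the variable sites of two aligned sequences. The sketch below is an illustration under stated assumptions, not a reproduction of the method described in the paper: the 0/1 input encoding, the function names (`chi2_2x2`, `max_chi2`, `permutation_p`) and the permutation scheme are all hypothetical choices made for this example.

```python
import random


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0  # degenerate table carries no signal
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def max_chi2(diffs):
    """Scan every cut point k and return (best statistic, best k).

    diffs is a list of 0/1 flags over the variable sites of an alignment
    (assumed encoding): 1 means the two compared sequences differ there.
    A cut at k splits the sites into diffs[:k] and diffs[k:].
    """
    n, total = len(diffs), sum(diffs)
    best, best_k, left = 0.0, None, 0
    for k in range(1, n):
        left += diffs[k - 1]
        a, b = left, k - left            # differences / matches left of the cut
        c = total - left                 # differences right of the cut
        d = (n - k) - c                  # matches right of the cut
        stat = chi2_2x2(a, b, c, d)
        if stat > best:
            best, best_k = stat, k
    return best, best_k


def permutation_p(diffs, trials=1000, seed=0):
    """Estimate significance by shuffling site order (hypothetical scheme)."""
    rng = random.Random(seed)
    observed, k = max_chi2(diffs)
    shuffled = list(diffs)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if max_chi2(shuffled)[0] >= observed:
            hits += 1
    return observed, k, (hits + 1) / (trials + 1)


# Toy input: identical over the first ten sites, different over the last ten.
toy = [0] * 10 + [1] * 10
stat, cut, p = permutation_p(toy)
print(stat, cut, p)
```

Shuffling the site order keeps the number of differences fixed while destroying their spatial arrangement, which is why it serves here as the null model for "no mosaic structure".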
Affiliation(s)
- J M Smith
- School of Biological Science, University of Sussex, Brighton, England
40
Clark AG, Kao TH. Excess nonsynonymous substitution of shared polymorphic sites among self-incompatibility alleles of Solanaceae. Proc Natl Acad Sci U S A 1991; 88:9823-7. [PMID: 1946408 PMCID: PMC52813 DOI: 10.1073/pnas.88.21.9823] [Citation(s) in RCA: 80] [Impact Index Per Article: 2.4] [Indexed: 12/29/2022] Open
Abstract
The function of the self-incompatibility locus (S locus) of many plant species dictates that natural selection will favor high levels of protein diversity. Pairwise sequence comparisons between S alleles from four species of Solanaceae reveal remarkably high sequence diversity and evidence for shared polymorphism. The level of amino acid constraint was found to be significantly heterogeneous among different regions of the gene, with some regions being highly constrained and others appearing to be virtually unconstrained. In some regions of the protein, there was an excess of nonsynonymous over synonymous substitution, consistent with the strong diversifying selection that must operate on this locus. These hypervariable regions are candidates for the sites that determine functional allelic identity. Simple contingency table tests show that sites that have polymorphism shared between species have more nonsynonymous substitution than polymorphic sites that do not exhibit shared polymorphism. This is consistent with the idea that adaptive evolution favoring amino acid replacement is occurring at sites with shared polymorphism. Tests of clustered polymorphism reveal that an unusually low rate of recombination must be occurring in this locus, allowing very ancient alleles to preserve their identity.
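The kind of contingency table test mentioned in the abstract can be illustrated with a 2x2 chi-squared test of shared versus non-shared polymorphic sites against nonsynonymous versus synonymous substitutions. This is a minimal sketch with made-up counts: the numbers, the table layout and the function name `chi2_test_2x2` are assumptions for illustration, not data or code from the paper.

```python
import math


def chi2_test_2x2(table):
    """Return (chi-squared statistic, p-value) for a 2x2 table, 1 d.f.

    table = [[a, b], [c, d]]; no continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / denom
    # For one degree of freedom the upper-tail probability is erfc(sqrt(x/2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts (NOT from the paper):
#                 nonsynonymous  synonymous
# shared sites          30           10
# non-shared sites      20           40
counts = [[30, 10], [20, 40]]
stat, p_value = chi2_test_2x2(counts)
print(f"chi2 = {stat:.3f}, p = {p_value:.2e}")
```

A small p-value in such a table would indicate that the nonsynonymous fraction differs between the two classes of site; it says nothing by itself about the direction, which has to be read from the counts.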
Affiliation(s)
- A G Clark
- Institute of Molecular Evolutionary Genetics, Pennsylvania State University, University Park 16802
41
Nelson K, Whittam TS, Selander RK. Nucleotide polymorphism and evolution in the glyceraldehyde-3-phosphate dehydrogenase gene (gapA) in natural populations of Salmonella and Escherichia coli. Proc Natl Acad Sci U S A 1991; 88:6667-71. [PMID: 1862091 PMCID: PMC52149 DOI: 10.1073/pnas.88.15.6667] [Citation(s) in RCA: 103] [Impact Index Per Article: 3.1] [Indexed: 12/29/2022] Open
Abstract
Nucleotide sequences of the gapA gene, encoding the glycolytic enzyme glyceraldehyde-3-phosphate dehydrogenase, were determined for 16 strains of Salmonella and 13 strains of Escherichia coli recovered from natural populations. Pairs of sequences from strains representing the eight serovar groups of Salmonella differed, on average, at 3.8% of nucleotide sites and 1.1% of inferred amino acids, and comparable values for E. coli were an order of magnitude smaller (0.2% and 0.1%, respectively). The rate of substitution at synonymous sites was significantly higher for codons specifying the catalytic domain of the enzyme than for those encoding the NAD(+)-binding domain, but the nonsynonymous substitution rate showed the opposite relationship. For Salmonella, statistical tests for nonrandom clustering of polymorphic sites failed to provide evidence that intragenic recombination or gene conversion has contributed to the generation of allelic diversity. The topology of a tree constructed from the gapA sequences was generally similar to that of phylogenetic trees of the strains based on multilocus enzyme electrophoresis, but the level of divergence of gapA in Salmonella group V from other Salmonella and E. coli strains is much greater than that indicated by DNA hybridization for the genome as a whole.
Affiliation(s)
- K Nelson
- Institute of Molecular Evolutionary Genetics, Pennsylvania State University, University Park 16802
42
Kuhner M, Watts S, Klitz W, Thomson G, Goodenow RS. Gene conversion in the evolution of both the H-2 and Qa class I genes of the murine major histocompatibility complex. Genetics 1990; 126:1115-26. [PMID: 2076814 PMCID: PMC1204274 DOI: 10.1093/genetics/126.4.1115] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.2] [Indexed: 12/30/2022] Open
Abstract
In order to better understand the role of gene conversion in the evolution of the class I gene family of the major histocompatibility complex (MHC), we have used a computer algorithm to detect clustered sequence similarities among 24 class I DNA sequences from the H-2, Qa, and Tla regions of the murine MHC. Thirty-four statistically significant clusters were detected; individual analysis of the clusters suggested at least 25 past gene conversion or recombination events. These clusters are comparable in size to the conversions observed in the spontaneously occurring H-2Kbm and H-2Kkm2 mutations, and are distributed throughout all exons of the class I gene. Thus, gene conversion does not appear to be restricted to the regions of the class I gene encoding their antigen-presentation function. Moreover, both the highly polymorphic H-2 loci and the relatively monomorphic Qa and Tla loci appear to have participated as donors and recipients in conversion events. If gene conversion is not limited to the highly polymorphic loci of the MHC, then another factor, presumably natural selection, must be responsible for maintaining the observed differences in level of variation.
Affiliation(s)
- M Kuhner
- Department of Integrative Biology, University of California, Berkeley 94720
43
Yuhki N, O'Brien SJ. DNA recombination and natural selection pressure sustain genetic sequence diversity of the feline MHC class I genes. J Exp Med 1990; 172:621-30. [PMID: 1695669 PMCID: PMC2188339 DOI: 10.1084/jem.172.2.621] [Citation(s) in RCA: 29] [Impact Index Per Article: 0.9] [Indexed: 12/28/2022] Open
Abstract
Sequence comparisons of seven distinct MHC class I cDNA clones revealed that feline class I molecules have a remarkable similarity to human HLA genes in their organization of functional domains as well as in the nonrandom partitioning of genetic variability according to the functional constraints ascribed to different regions of the MHC molecule. The distribution of the pattern of sequence polymorphism in the cat as compared with genetic diversity of human and mouse class I genes provides evidence for four coordinate factors that contribute to the origin and sustenance of abundant allele diversity that characterizes the MHC in the species. These include: (a) a gradual accumulation of spontaneous mutational substitution over evolutionary time; (b) selection against mutational divergence in regions of the class I molecule involved in T cell receptor interaction and also in certain regions that interact with common features of antigens; (c) positive selection pressure in favor of persistence of polymorphism and heterozygosity at 57 nucleotide residues that comprise the antigen recognition site; and (d) periodic intragenic (interallelic) and intergenic recombination within the class I genes. We describe a highly conserved 23-bp nucleotide sequence within the coding region of the first alpha-helix that separates two relatively polymorphic segments located in the alpha 1 domain that may act as a template or "hot spot" for homologous recombination between class I alleles.
Affiliation(s)
- N Yuhki
- Laboratory of Viral Carcinogenesis, National Cancer Institute, Frederick, Maryland 21701
44
DuBose RF, Dykhuizen DE, Hartl DL. Genetic exchange among natural isolates of bacteria: recombination within the phoA gene of Escherichia coli. Proc Natl Acad Sci U S A 1988; 85:7036-40. [PMID: 3045828 PMCID: PMC282115 DOI: 10.1073/pnas.85.18.7036] [Citation(s) in RCA: 89] [Impact Index Per Article: 2.5] [Indexed: 01/03/2023] Open
Abstract
An 1871-nucleotide region including the phoA gene (the structural gene encoding alkaline phosphatase, EC 3.1.3.1) was cloned and sequenced from eight naturally occurring strains of Escherichia coli. Alignment with the sequence from E. coli K-12 made apparent that there were 87 polymorphic nucleotide sites, of which 42 were informative for phylogenetic analysis. Maximum parsimony analysis revealed six equally parsimonious trees with a consistency index of 0.80. Of the 42 informative sites, 22 were inconsistent with each of the maximum parsimony trees. The spatial distribution of the inconsistent sites was highly nonrandom in a manner implying that intragenic recombination has played a major role in determining the evolutionary history of the nine alleles. The implication is that different segments of the phoA gene have different phylogenetic histories.
Affiliation(s)
- R F DuBose
- Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110