1
Vollger MR, Dishuck PC, Harvey WT, DeWitt WS, Guitart X, Goldberg ME, Rozanski AN, Lucas J, Asri M, Munson KM, Lewis AP, Hoekzema K, Logsdon GA, Porubsky D, Paten B, Harris K, Hsieh P, Eichler EE. Increased mutation and gene conversion within human segmental duplications. Nature 2023; 617:325-334. [PMID: 37165237 PMCID: PMC10172114 DOI: 10.1038/s41586-023-05895-y] [Citation(s) in RCA: 54] [Impact Index Per Article: 27.0] [Received: 07/06/2022] [Accepted: 02/28/2023] [Indexed: 05/12/2023]
Abstract
Single-nucleotide variants (SNVs) in segmental duplications (SDs) have not been systematically assessed because of the limitations of mapping short-read sequencing data [1,2]. Here we constructed 1:1 unambiguous alignments spanning high-identity SDs across 102 human haplotypes and compared the pattern of SNVs between unique and duplicated regions [3,4]. We find that human SNVs are elevated 60% in SDs compared to unique regions and estimate that at least 23% of this increase is due to interlocus gene conversion (IGC) with up to 4.3 megabase pairs of SD sequence converted on average per human haplotype. We develop a genome-wide map of IGC donors and acceptors, including 498 acceptor and 454 donor hotspots affecting the exons of about 800 protein-coding genes. These include 171 genes that have 'relocated' on average 1.61 megabase pairs in a subset of human haplotypes. Using a coalescent framework, we show that SD regions are slightly evolutionarily older when compared to unique sequences, probably owing to IGC. SNVs in SDs, however, show a distinct mutational spectrum: a 27.1% increase in transversions that convert cytosine to guanine or the reverse across all triplet contexts and a 7.6% reduction in the frequency of CpG-associated mutations when compared to unique DNA. We reason that these distinct mutational properties help to maintain an overall higher GC content of SD DNA compared to that of unique DNA, probably driven by GC-biased conversion between paralogous sequences [5,6].
Affiliation(s)
- Mitchell R Vollger
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Division of Medical Genetics, University of Washington School of Medicine, Seattle, WA, USA
- Philip C Dishuck
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- William T Harvey
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- William S DeWitt
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
  - Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
- Xavi Guitart
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Michael E Goldberg
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Allison N Rozanski
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Julian Lucas
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Mobin Asri
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Katherine M Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Alexandra P Lewis
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Kendra Hoekzema
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Glennis A Logsdon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Benedict Paten
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Kelley Harris
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- PingHsun Hsieh
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA

2
Sakamoto T, Innan H. Muller's ratchet of the Y chromosome with gene conversion. Genetics 2022; 220:iyab204. [PMID: 34791206 PMCID: PMC8733426 DOI: 10.1093/genetics/iyab204] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/28/2021] [Accepted: 10/28/2021] [Indexed: 11/13/2022] Open
Abstract
Muller's ratchet is a process in which deleterious mutations are fixed irreversibly in the absence of recombination. The degeneration of the Y chromosome, and the gradual loss of its genes, can be explained by Muller's ratchet. However, most theories consider single-copy genes, and may not be applicable to Y chromosomes, which have a number of duplicated genes in many species, which are probably undergoing concerted evolution by gene conversion. We developed a model of Muller's ratchet to explore the evolution of the Y chromosome. The model assumes a nonrecombining chromosome with both single-copy and duplicated genes. We used analytical and simulation approaches to obtain the rate of gene loss in this model, with special attention to the role of gene conversion. Homogenization by gene conversion makes both duplicated copies either mutated or intact. The former promotes the ratchet, and the latter retards, and we ask which of these counteracting forces dominates under which conditions. We found that the effect of gene conversion is complex, and depends upon the fitness effect of gene duplication. When duplication has no effect on fitness, gene conversion accelerates the ratchet of both single-copy and duplicated genes. If duplication has an additive fitness effect, the ratchet of single-copy genes is accelerated by gene duplication, regardless of the gene conversion rate, whereas gene conversion slows the degeneration of duplicated genes. Our results suggest that the evolution of the Y chromosome involves several parameters, including the fitness effect of gene duplication by increasing dosage and gene conversion rate.
Affiliation(s)
- Takahiro Sakamoto
  - Department of Evolutionary Studies of Biosystems, SOKENDAI, The Graduate University for Advanced Studies, Hayama, Kanagawa 240-0193, Japan
- Hideki Innan
  - Department of Evolutionary Studies of Biosystems, SOKENDAI, The Graduate University for Advanced Studies, Hayama, Kanagawa 240-0193, Japan

3
Fawcett JA, Innan H. The Role of Gene Conversion between Transposable Elements in Rewiring Regulatory Networks. Genome Biol Evol 2020; 11:1723-1729. [PMID: 31209488 PMCID: PMC6598467 DOI: 10.1093/gbe/evz124] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Accepted: 06/11/2019] [Indexed: 12/23/2022] Open
Abstract
Nature has found many ways to utilize transposable elements (TEs) throughout evolution. Many molecular and cellular processes depend on DNA-binding proteins recognizing hundreds or thousands of similar DNA motifs dispersed throughout the genome that are often provided by TEs. It has been suggested that TEs play an important role in the evolution of such systems, in particular, the rewiring of gene regulatory networks. One mechanism that can further enhance the rewiring of regulatory networks is nonallelic gene conversion between copies of TEs. Here, we will first review evidence for nonallelic gene conversion in TEs. Then, we will illustrate the benefits nonallelic gene conversion provides in rewiring regulatory networks. For instance, nonallelic gene conversion between TE copies offers an alternative mechanism to spread beneficial mutations that improve the network, it allows multiple mutations to be combined and transferred together, and it allows natural selection to work efficiently in spreading beneficial mutations and removing disadvantageous mutations. Future studies examining the role of nonallelic gene conversion in the evolution of TEs should help us to better understand how TEs have contributed to evolution.

4
MacQueen A, Tian D, Chang W, Holub E, Kreitman M, Bergelson J. Population Genetics of the Highly Polymorphic RPP8 Gene Family. Genes (Basel) 2019; 10:E691. [PMID: 31500388 PMCID: PMC6771003 DOI: 10.3390/genes10090691] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 07/29/2019] [Revised: 08/31/2019] [Accepted: 09/03/2019] [Indexed: 02/06/2023] Open
Abstract
Plant nucleotide-binding domain and leucine-rich repeat containing (NLR) genes provide some of the most extreme examples of polymorphism in eukaryotic genomes, rivalling even the vertebrate major histocompatibility complex. Surprisingly, this is also true in Arabidopsis thaliana, a predominantly selfing species with low heterozygosity. Here, we investigate how gene duplication and intergenic exchange contribute to this extraordinary variation. RPP8 is a three-locus system that is configured chromosomally as either a direct-repeat tandem duplication or as a single copy locus, plus a locus 2 Mb distant. We sequenced 48 RPP8 alleles from 37 accessions of A. thaliana and 12 RPP8 alleles from Arabidopsis lyrata to investigate the patterns of interlocus shared variation. The tandem duplicates display fixed differences and share less variation with each other than either shares with the distant paralog. A high level of shared polymorphism among alleles at one of the tandem duplicates, the single-copy locus and the distal locus, must involve both classical crossing over and intergenic gene conversion. Despite these polymorphism-enhancing mechanisms, the observed nucleotide diversity could not be replicated under neutral forward-in-time simulations. Only by adding balancing selection to the simulations do they approach the level of polymorphism observed at RPP8. In this NLR gene triad, genetic architecture, gene function and selection all combine to generate diversity.
Affiliation(s)
- Alice MacQueen
  - Integrative Biology, The University of Texas at Austin, Austin, TX 78712, USA
- Dacheng Tian
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing 210008, China
- Wenhan Chang
  - Department of Ecology & Evolution, The University of Chicago, Chicago, IL 60637, USA
- Eric Holub
  - School of Life Sciences, Wellesbourne Innovation Campus, University of Warwick, Wellesbourne CV35 9EF, UK
- Martin Kreitman
  - Department of Ecology & Evolution, The University of Chicago, Chicago, IL 60637, USA
- Joy Bergelson
  - Department of Ecology & Evolution, The University of Chicago, Chicago, IL 60637, USA

5
Hartasánchez DA, Brasó-Vives M, Heredia-Genestar JM, Pybus M, Navarro A. Effect of Collapsed Duplications on Diversity Estimates: What to Expect. Genome Biol Evol 2018; 10:2899-2905. [PMID: 30364947 PMCID: PMC6239678 DOI: 10.1093/gbe/evy223] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Accepted: 10/08/2018] [Indexed: 12/19/2022] Open
Abstract
The study of segmental duplications (SDs) and copy-number variants (CNVs) is of great importance in the fields of genomics and evolution. However, SDs and CNVs are usually excluded from genome-wide scans for natural selection. Because of high identity between copies, SDs and CNVs that are not included in reference genomes are prone to be collapsed (that is, mistakenly aligned to the same region) when aligning sequence data from single individuals to the reference. Such collapsed duplications are additionally challenging because concerted evolution between duplications alters their site frequency spectrum and linkage disequilibrium patterns. To investigate the potential effect of collapsed duplications upon natural selection scans, we obtained expectations for four summary statistics from simulations of duplications evolving under a range of interlocus gene conversion and crossover rates. We confirm that summary statistics traditionally used to detect the action of natural selection on DNA sequences cannot be applied to SDs and CNVs since in some cases values for known duplications mimic selective signatures. As a proof of concept of the pervasiveness of collapsed duplications, we analyzed data from the 1000 Genomes Project. We find that, within regions identified as variable in copy number, diversity between individuals with the duplication is consistently higher than between individuals without the duplication. Furthermore, the frequency of single nucleotide variants (SNVs) deviating from Hardy-Weinberg Equilibrium is higher in individuals with the duplication, which strongly suggests that higher diversity is a consequence of collapsed duplications and incorrect evaluation of SNVs within these CNV regions.
Affiliation(s)
- Diego A Hartasánchez
  - Institute of Evolutionary Biology (Universitat Pompeu Fabra - CSIC), PRBB, Barcelona, Catalonia, Spain
  - Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
  - Laboratoire de Biométrie et Biologie Évolutive UMR 5558, Université de Lyon, Université Lyon 1, CNRS, Villeurbanne, France
- Marina Brasó-Vives
  - Institute of Evolutionary Biology (Universitat Pompeu Fabra - CSIC), PRBB, Barcelona, Catalonia, Spain
  - Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Jose Maria Heredia-Genestar
  - Institute of Evolutionary Biology (Universitat Pompeu Fabra - CSIC), PRBB, Barcelona, Catalonia, Spain
  - Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Marc Pybus
  - Institute of Evolutionary Biology (Universitat Pompeu Fabra - CSIC), PRBB, Barcelona, Catalonia, Spain
  - Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Arcadi Navarro
  - Institute of Evolutionary Biology (Universitat Pompeu Fabra - CSIC), PRBB, Barcelona, Catalonia, Spain
  - Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
  - National Institute for Bioinformatics (INB), Barcelona, Catalonia, Spain
  - Centre for Genomic Regulation (CRG), Barcelona, Catalonia, Spain

6
Rogers J, Fishberg A, Youngs N, Wu YC. Reconciliation feasibility in the presence of gene duplication, loss, and coalescence with multiple individuals per species. BMC Bioinformatics 2017; 18:292. [PMID: 28583091 PMCID: PMC5460407 DOI: 10.1186/s12859-017-1701-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 12/27/2016] [Accepted: 05/22/2017] [Indexed: 11/20/2022] Open
Abstract
BACKGROUND: In phylogenetics, we often seek to reconcile gene trees with species trees within the framework of an evolutionary model. While the most popular models for eukaryotic species allow for only gene duplication and gene loss or only multispecies coalescence, recent work has combined these phenomena through a reconciliation structure, the labeled coalescent tree (LCT), that simultaneously describes the duplication-loss and coalescent history of a gene family. However, the LCT makes the simplifying assumption that only one individual is sampled per species whereas, with advances in gene sequencing, we now have access to multiple samples per species. RESULTS: We demonstrate that with these additional samples, there exist gene tree topologies that are impossible to reconcile with any species tree. In particular, the multiple samples enforce new constraints on the placement of duplications within a valid reconciliation. To model these constraints, we extend the LCT to a new structure, the partially labeled coalescent tree (PLCT) and demonstrate how to use the PLCT to evaluate the feasibility of a gene tree topology. We apply our algorithm to two clades of apes and flies to characterize possible sources of infeasibility. CONCLUSION: Going forward, we believe that this model represents a first step towards understanding reconciliations in duplication-loss-coalescence models with multiple samples per species.
Affiliation(s)
- Jennifer Rogers
  - Department of Computer Science, Harvey Mudd College, Claremont, 91711, California, USA
- Andrew Fishberg
  - Department of Computer Science, Harvey Mudd College, Claremont, 91711, California, USA
- Nora Youngs
  - Department of Mathematics, Harvey Mudd College, Claremont, 91711, California, USA
  - Current Address: Department of Mathematics and Statistics, Colby College, Waterville, 04901, Maine, USA
- Yi-Chieh Wu
  - Department of Computer Science, Harvey Mudd College, Claremont, 91711, California, USA

7
Sun XQ, Li DH, Xue JY, Yang SH, Zhang YM, Li MM, Hang YY. Insertion DNA Accelerates Meiotic Interchromosomal Recombination in Arabidopsis thaliana. Mol Biol Evol 2016; 33:2044-53. [PMID: 27189569 DOI: 10.1093/molbev/msw087] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/26/2022] Open
Abstract
Nucleotide insertions/deletions are ubiquitous in eukaryotic genomes, and the resulting hemizygous (unpaired) DNA has significant, heritable effects on adjacent DNA. However, little is known about the genetic behavior of insertion DNA. Here, we describe a binary transgenic system to study the behavior of insertion DNA during meiosis. Transgenic Arabidopsis lines were generated to carry two different defective reporter genes on nonhomologous chromosomes, designated as "recipient" and "donor" lines. Double hemizygous plants (harboring unpaired DNA) were produced by crossing between the recipient and the donor, and double homozygous lines (harboring paired DNA) via self-pollination. The transfer of the donor's unmutated sequence to the recipient generated a functional β-glucuronidase gene, which could be visualized by histochemical staining and corroborated by polymerase chain reaction amplification and sequencing. More than 673 million seedlings were screened, and the results showed that meiotic ectopic recombination in the hemizygous lines occurred at a frequency >6.49-fold higher than that in the homozygous lines. Gene conversion might have been exclusively or predominantly responsible for the gene correction events. The direct measurement of ectopic recombination events provided evidence that an insertion, in the absence of an allelic counterpart, could scan the entire genome for homologous counterparts with which to pair. Furthermore, the unpaired (hemizygous) architectures could accelerate ectopic recombination between itself and interchromosomal counterparts. We suggest that the ectopic recombination accelerated by hemizygous architectures may be a general mechanism for interchromosomal recombination through ubiquitously dispersed repeat sequences in plants, ultimately contributing to genetic renovation and eukaryotic evolution.
Affiliation(s)
- Xiao-Qin Sun
  - Jiangsu Key Laboratory for the Research and Utilization of Plant Resources, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing, China
- Ding-Hong Li
  - Jiangsu Key Laboratory for the Research and Utilization of Plant Resources, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing, China
- Jia-Yu Xue
  - Jiangsu Key Laboratory for the Research and Utilization of Plant Resources, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing, China
- Si-Hai Yang
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
- Yan-Mei Zhang
  - Jiangsu Key Laboratory for the Research and Utilization of Plant Resources, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing, China
- Mi-Mi Li
  - Jiangsu Key Laboratory for the Research and Utilization of Plant Resources, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing, China
- Yue-Yu Hang
  - Jiangsu Key Laboratory for the Research and Utilization of Plant Resources, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing, China

8
Sánchez-Ramírez S, Tulloss RE, Guzmán-Dávalos L, Cifuentes-Blanco J, Valenzuela R, Estrada-Torres A, Ruán-Soto F, Díaz-Moreno R, Hernández-Rico N, Torres-Gómez M, León H, Moncalvo JM. In and out of refugia: historical patterns of diversity and demography in the North American Caesar's mushroom species complex. Mol Ecol 2015; 24:5938-56. [PMID: 26465233 DOI: 10.1111/mec.13413] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 04/28/2015] [Revised: 10/05/2015] [Accepted: 10/06/2015] [Indexed: 11/30/2022]
Abstract
Some of the effects of past climate dynamics on plant and animal diversity make-up have been relatively well studied, but to a lesser extent in fungi. Pleistocene refugia are thought to harbour high biological diversity (i.e. phylogenetic lineages and genetic diversity), mainly as a product of increased reproductive isolation and allele conservation. In addition, high extinction rates and genetic erosion are expected in previously glaciated regions. Some of the consequences of past climate dynamics might involve changes in range and population size that can result in divergence and incipient or cryptic speciation. Many of these dynamic processes and patterns can be inferred through phylogenetic and coalescent methods. In this study, we first delimit species within a group of closely related edible ectomycorrhizal Amanita from North America (the American Caesar's mushrooms species complex) using multilocus coalescent-based approaches; and then address questions related to effects of Pleistocene climate change on the diversity and genetics of the group. Our study includes extensive geographical sampling throughout the distribution range, and DNA sequences from three nuclear protein-coding genes. Results reveal cryptic diversity and high speciation rates in refugia. Population sizes and expansions seem to be larger at midrange latitudes (Mexican highlands and SE USA). Range shifts are proportional to population size expansions, which were overall more common during the Pleistocene. This study documents responses to past climate change in fungi and also highlights the applicability of the multispecies coalescent in comparative phylogeographical analyses and diversity assessments that include ancestral species.
Affiliation(s)
- Santiago Sánchez-Ramírez
  - Department of Natural History, Royal Ontario Museum, 100 Queen's Park, Toronto, ON, M5S 2C6, Canada
  - Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcocks, Toronto, ON, M5S 3B2, Canada
- Laura Guzmán-Dávalos
  - Departamento de Botánica y Zoología, Universidad de Guadalajara, Zapopan, 45101, México
- Joaquín Cifuentes-Blanco
  - Facultad de Ciencias, Departamento de Biología Comparada, UNAM, Ciudad Universitaria, México City, 04510, México
- Ricardo Valenzuela
  - Escuela Nacional de Ciencias Biológicas, Instituto Politécnico Nacional, México City, 11340, México
- Arturo Estrada-Torres
  - Centro de Investigación en Ciencias Biológicas, Universidad Autónoma de Tlaxcala, Tlaxcala, 90122, México
- Felipe Ruán-Soto
  - Facultad de Ciencias Biológicas, Universidad de Ciencias y Artes de Chiapas, Tuxtla Gutiérrez, 29039, Mexico
- Raúl Díaz-Moreno
  - Instituto de Silvicultura e Industria de la Madera, Universidad Juárez del Estado de Durango, Durango, 34120, México
- Nallely Hernández-Rico
  - Laboratorio de Etnobiología, Centro de Investigaciones Biológicas, Universidad Autónoma del Estado de Hidalgo, Pachuca, México
- Mariano Torres-Gómez
  - Centro de Investigaciones en Ecosistemas CIEco, Antigua carretera a Pátzcuaro # 8701, Col. Ex-Hacienda de San José de La Huerta, Morelia, 58190, México
- Hugo León
  - Coleccion Etnomicológica "Dr. Teófilo Herrera Suárez", Instituto Tecnológico del Valle de Oaxaca, Xoxocotlán, 71230, México
- Jean-Marc Moncalvo
  - Department of Natural History, Royal Ontario Museum, 100 Queen's Park, Toronto, ON, M5S 2C6, Canada
  - Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcocks, Toronto, ON, M5S 3B2, Canada

9
Interplay of interlocus gene conversion and crossover in segmental duplications under a neutral scenario. G3-GENES GENOMES GENETICS 2014; 4:1479-89. [PMID: 24906640 PMCID: PMC4132178 DOI: 10.1534/g3.114.012435] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 01/08/2023]
Abstract
Interlocus gene conversion is a major evolutionary force that drives the concerted evolution of duplicated genomic regions. Theoretical models have successfully addressed the effects of interlocus gene conversion and the importance of crossover in the evolutionary fate of gene families and duplications but have not considered complex recombination scenarios, such as the presence of hotspots. To study the interplay between interlocus gene conversion and crossover, we have developed a forward-time simulator that allows the exploration of a wide range of interlocus gene conversion rates under different crossover models. Using it, we have analyzed patterns of nucleotide variation and linkage disequilibrium within and between duplicate regions, focusing on a neutral scenario with constant population size and validating our results with the existing theoretical models. We show that the interaction of gene conversion and crossover is nontrivial and that the location of crossover junctions is a fundamental determinant of levels of variation and linkage disequilibrium in duplicated regions. We also show that if crossover activity between duplications is strong enough, recurrent interlocus gene conversion events can break linkage disequilibrium within duplicates. Given the complex nature of interlocus gene conversion and crossover, we provide a framework to explore their interplay to help increase knowledge on molecular evolution within segmental duplications under more complex scenarios, such as demographic changes or natural selection.

10
Both positive and negative selection pressures contribute to the polymorphism pattern of the duplicated human CYP21A2 gene. PLoS One 2013; 8:e81977. [PMID: 24312389 PMCID: PMC3843699 DOI: 10.1371/journal.pone.0081977] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 06/23/2013] [Accepted: 10/20/2013] [Indexed: 11/19/2022] Open
Abstract
The human steroid 21-hydroxylase gene (CYP21A2) participates in cortisol and aldosterone biosynthesis, and resides together with its paralogous (duplicated) pseudogene in a multiallelic copy number variation (CNV), called RCCX CNV. Concerted evolution caused by non-allelic gene conversion has been described in great ape CYP21 genes, and the same conversion activity is responsible for a serious genetic disorder of CYP21A2, congenital adrenal hyperplasia (CAH). In the current study, 33 CYP21A2 haplotype variants encoding 6 protein variants were determined from a European population. CYP21A2 was shown to be one of the most diverse human genes (HHe=0.949), but the diversity of intron 2 was greater still. Contrary to previous findings, the evolution of intron 2 did not follow concerted evolution, although the remaining part of the gene did. Fixed sites (different fixed alleles of sites in human CYP21 paralogues) significantly accumulated in intron 2, indicating that the excess of fixed sites was connected to the lack of effective non-allelic conversion and concerted evolution. Furthermore, positive selection was presumably focused on intron 2, and possibly associated with the previous genetic features. However, the positive selection detected by several neutrality tests was discerned along the whole gene. In addition, the clear signature of negative selection was observed in the coding sequence. The maintenance of the CYP21 enzyme function is critical, and could lead to negative selection, whereas the presumed gene regulation altering steroid hormone levels via intron 2 might help fast adaptation, which broadly characterizes the genes of human CNVs responding to the environment.

11
Hanikenne M, Kroymann J, Trampczynska A, Bernal M, Motte P, Clemens S, Krämer U. Hard selective sweep and ectopic gene conversion in a gene cluster affording environmental adaptation. PLoS Genet 2013; 9:e1003707. [PMID: 23990800 PMCID: PMC3749932 DOI: 10.1371/journal.pgen.1003707] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.8] [Received: 02/01/2013] [Accepted: 06/22/2013] [Indexed: 12/27/2022] Open
Abstract
Among the rare colonizers of heavy-metal rich toxic soils, Arabidopsis halleri is a compelling model extremophile, physiologically distinct from its sister species A. lyrata, and A. thaliana. Naturally selected metal hypertolerance and extraordinarily high leaf metal accumulation in A. halleri both require Heavy Metal ATPase4 (HMA4) encoding a PIB-type ATPase that pumps Zn²⁺ and Cd²⁺ out of specific cell types. Strongly enhanced HMA4 expression results from a combination of gene copy number expansion and cis-regulatory modifications, when compared to A. thaliana. These findings were based on a single accession of A. halleri. Few studies have addressed nucleotide sequence polymorphism at loci known to govern adaptations. We thus sequenced 13 DNA segments across the HMA4 genomic region of multiple A. halleri individuals from diverse habitats. Compared to control loci flanking the three tandem HMA4 gene copies, a gradual depletion of nucleotide sequence diversity and an excess of low-frequency polymorphisms are hallmarks of positive selection in HMA4 promoter regions, culminating at HMA4-3. The accompanying hard selective sweep is segmentally eclipsed as a consequence of recurrent ectopic gene conversion among HMA4 protein-coding sequences, resulting in their concerted evolution. Thus, HMA4 coding sequences exhibit a network-like genealogy and locally enhanced nucleotide sequence diversity within each copy, accompanied by lowered sequence divergence between paralogs in any given individual. Quantitative PCR corroborated that, across A. halleri, three genomic HMA4 copies generate overall 20- to 130-fold higher transcript levels than in A. thaliana. Together, our observations constitute an unexpectedly complex profile of polymorphism resulting from natural selection for increased gene product dosage. We propose that these findings are paradigmatic of a category of multi-copy genes from a broad range of organisms. Our results emphasize that enhanced gene product dosage, in addition to neo- and sub-functionalization, can account for the genomic maintenance of gene duplicates underlying environmental adaptation.
Affiliation(s)
- Marc Hanikenne
- Functional Genomics and Plant Molecular Imaging, Center for Protein Engineering (CIP), Department of Life Sciences, University of Liège, Liège, Belgium
- Juergen Kroymann
- Laboratoire d'Ecologie, Systématique et Evolution, Université Paris-Sud/CNRS, Orsay, France
- María Bernal
- Department of Plant Physiology, Ruhr University Bochum, Bochum, Germany
- Patrick Motte
- Functional Genomics and Plant Molecular Imaging, Center for Protein Engineering (CIP), Department of Life Sciences, University of Liège, Liège, Belgium
- Stephan Clemens
- Department of Plant Physiology, University of Bayreuth, Bayreuth, Germany
- Ute Krämer
- Department of Plant Physiology, Ruhr University Bochum, Bochum, Germany
|
12
|
Hörger AC, Ilyas M, Stephan W, Tellier A, van der Hoorn RAL, Rose LE. Balancing selection at the tomato RCR3 Guardee gene family maintains variation in strength of pathogen defense. PLoS Genet 2012; 8:e1002813. [PMID: 22829777 PMCID: PMC3400550 DOI: 10.1371/journal.pgen.1002813] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.8] [Received: 01/10/2012] [Accepted: 05/21/2012] [Indexed: 12/31/2022]
Abstract
Coevolution between hosts and pathogens is thought to occur between interacting molecules of both species. This results in the maintenance of genetic diversity at pathogen antigens (or so-called effectors) and host resistance genes such as the major histocompatibility complex (MHC) in mammals or resistance (R) genes in plants. In plant-pathogen interactions, the current paradigm posits that a specific defense response is activated upon recognition of pathogen effectors via interaction with their corresponding R proteins. According to the "Guard-Hypothesis," R proteins (the "guards") can sense modification of target molecules in the host (the "guardees") by pathogen effectors and subsequently trigger the defense response. Multiple studies have reported high genetic diversity at R genes maintained by balancing selection. In contrast, little is known about the evolutionary mechanisms shaping the guardee, which may be subject to contrasting evolutionary forces. Here we show that the evolution of the guardee RCR3 is characterized by gene duplication, frequent gene conversion, and balancing selection in the wild tomato species Solanum peruvianum. Investigating the functional characteristics of 54 natural variants through in vitro and in planta assays, we detected differences in recognition of the pathogen effector through interaction with the guardee, as well as substantial variation in the strength of the defense response. This variation is maintained by balancing selection at each copy of the RCR3 gene. Our analyses pinpoint three amino acid polymorphisms with key functional consequences for the coevolution between the guardee (RCR3) and its guard (Cf-2). We conclude that, in addition to coevolution at the "guardee-effector" interface for pathogen recognition, natural selection acts on the "guard-guardee" interface. 
Guardee evolution may be governed by a counterbalance between improved activation in the presence and prevention of auto-immune responses in the absence of the corresponding pathogen.
Affiliation(s)
- Anja C Hörger
- Section of Evolutionary Biology, Department of Biology II, University of Munich, LMU, Planegg-Martinsried, Germany.
|
13
|
Mboup M, Fischer I, Lainer H, Stephan W. Trans-species polymorphism and allele-specific expression in the CBF gene family of wild tomatoes. Mol Biol Evol 2012; 29:3641-52. [PMID: 22787283 DOI: 10.1093/molbev/mss176] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Indexed: 12/18/2022]
Abstract
Abiotic stresses such as drought, extreme temperatures, and salinity have a strong impact on plant adaptation. They act as selective forces on plant physiology and morphology. These selective pressures leave characteristic footprints that can be detected at the DNA sequence level using population genetic tools. On the basis of a candidate gene approach, we investigated signatures of adaptation in two wild tomato species, Solanum peruvianum and S. chilense. These species are native to western South America and constitute a model system for studying adaptation, due to their ability to colonize diverse habitats and the available genetic resources. We have determined the selective forces acting on the C-repeat binding factor (CBF) gene family, which consists of three genes, and is known to be involved in tolerance to abiotic stresses, in particular in cold tolerance. We also analyzed the expression pattern of these genes after drought and cold stresses. We found that CBF3 evolves under very strong purifying selection, CBF2 is under balancing selection in some populations of both species (S. peruvianum/Quicacha and S. chilense/Nazca) maintaining a trans-species polymorphism, and CBF1 is a pseudogene. In contrast to previous studies of cultivated tomatoes showing that only CBF1 was cold induced, we found that all three CBF genes are cold induced in wild tomatoes. All three genes are also drought induced. CBF2 exhibits an allele-specific expression pattern associated with the trans-species polymorphism.
Affiliation(s)
- Mamadou Mboup
- Section of Evolutionary Biology, Department of Biology II, University of Munich, Planegg-Martinsried, Germany.
|
14
|
Rasmussen MD, Kellis M. Unified modeling of gene duplication, loss, and coalescence using a locus tree. Genome Res 2012; 22:755-65. [PMID: 22271778 DOI: 10.1101/gr.123901.111] [Citation(s) in RCA: 97] [Impact Index Per Article: 7.5] [Indexed: 11/24/2022]
Abstract
Gene phylogenies provide a rich source of information about the way evolution shapes genomes, populations, and phenotypes. In addition to substitutions, evolutionary events such as gene duplication and loss (as well as horizontal transfer) play a major role in gene evolution, and many phylogenetic models have been developed in order to reconstruct and study these events. However, these models typically make the simplifying assumption that population-related effects such as incomplete lineage sorting (ILS) are negligible. While this assumption may have been reasonable in some settings, it has become increasingly problematic as increased genome sequencing has led to denser phylogenies, where effects such as ILS are more prominent. To address this challenge, we present a new probabilistic model, DLCoal, that defines gene duplication and loss in a population setting, such that coalescence and ILS can be directly addressed. Interestingly, this model implies that in addition to the usual gene tree and species tree, there exists a third tree, the locus tree, which will likely have many applications. Using this model, we develop the first general reconciliation method that accurately infers gene duplications and losses in the presence of ILS, and we show its improved inference of orthologs, paralogs, duplications, and losses for a variety of clades, including flies, fungi, and primates. Also, our simulations show that gene duplications increase the frequency of ILS, further illustrating the importance of a joint model. Going forward, we believe that this unified model can offer insights to questions in both phylogenetics and population genetics.
Affiliation(s)
- Matthew D Rasmussen
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
|
15
|
Llaurens V, McMullan M, van Oosterhout C. Cryptic MHC Polymorphism Revealed but Not Explained by Selection on the Class IIB Peptide-Binding Region. Mol Biol Evol 2012; 29:1631-44. [DOI: 10.1093/molbev/mss012] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 01/22/2023]
|
16
|
Abstract
We develop a coalescent-based simulation tool to generate patterns of single nucleotide polymorphisms (SNPs) in a wide region encompassing both the original and duplicated genes. Selection on the new duplicated copy and interlocus gene conversion between the two copies are incorporated. This simulation enables us to explore how selection on duplicated copies affects the pattern of SNPs. The fixation of an advantageous duplicated copy causes a strong reduction in polymorphism not only in the duplicated copy but also in its flanking regions, which is a typical signature of a selective sweep by positive selection. After fixation, polymorphism gradually increases by accumulating neutral mutations and eventually reaches the equilibrium value if there is no gene conversion. When gene conversion is active, the number of SNPs in the duplicated copy quickly increases by transferring SNPs from the original copy; therefore, the time when we can recognize the signature of selection is decreased. Because this effect of gene conversion is restricted only to the duplicated region, more power to detect selection is expected if a flanking region to the duplicated copy is used.
|
17
|
Ko WY, Kaercher KA, Giombini E, Marcatili P, Froment A, Ibrahim M, Lema G, Nyambo TB, Omar SA, Wambebe C, Ranciaro A, Hirbo JB, Tishkoff SA. Effects of natural selection and gene conversion on the evolution of human glycophorins coding for MNS blood polymorphisms in malaria-endemic African populations. Am J Hum Genet 2011; 88:741-754. [PMID: 21664997 DOI: 10.1016/j.ajhg.2011.05.005] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.4] [Received: 02/25/2011] [Revised: 04/22/2011] [Accepted: 05/05/2011] [Indexed: 11/17/2022]
Abstract
Malaria has been a very strong selection pressure in recent human evolution, particularly in Africa. Of the one million deaths per year due to malaria, more than 90% are in sub-Saharan Africa, a region with high levels of genetic variation and population substructure. However, there have been few studies of nucleotide variation at genetic loci that are relevant to malaria susceptibility across geographically and genetically diverse ethnic groups in Africa. Invasion of erythrocytes by Plasmodium falciparum parasites is central to the pathology of malaria. Glycophorin A (GYPA) and B (GYPB), which determine MN and Ss blood types, are two major receptors that are expressed on erythrocyte surfaces and interact with parasite ligands. We analyzed nucleotide diversity of the glycophorin gene family in 15 African populations with different levels of malaria exposure. High levels of nucleotide diversity and gene conversion were found at these genes. We observed divergent patterns of genetic variation between these duplicated genes and between different extracellular domains of GYPA. Specifically, we identified fixed adaptive changes at exons 3-4 of GYPA. By contrast, we observed an allele frequency spectrum skewed toward a significant excess of intermediate-frequency alleles at GYPA exon 2 in many populations; the degree of spectrum distortion is correlated with malaria exposure, possibly because of the joint effects of gene conversion and balancing selection. We also identified a haplotype causing three amino acid changes in the extracellular domain of glycophorin B. This haplotype might have evolved adaptively in five populations with high exposure to malaria.
Affiliation(s)
- Wen-Ya Ko
- Department of Genetics and Biology, School of Medicine and School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA
- Kristin A Kaercher
- Department of Genetics and Biology, School of Medicine and School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA
- Emanuela Giombini
- Department of Biochemical Sciences "Rossi Fanelli" University of Rome "La Sapienza" P.ale Aldo Moro, 5 - 00185 Rome, Italy
- Paolo Marcatili
- Department of Biochemical Sciences "Rossi Fanelli" University of Rome "La Sapienza" P.ale Aldo Moro, 5 - 00185 Rome, Italy
- Alain Froment
- UMR 208, Institut de Recherche pour le Développement, Muséum National d'Histoire Naturelle, Musée de l'Homme, 75116 Paris, France
- Muntaser Ibrahim
- Department of Molecular Biology, Institute of Endemic Diseases, University of Khartoum, 15-Khartoum, Sudan
- Godfrey Lema
- Department of Biochemistry, Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
- Thomas B Nyambo
- Department of Biochemistry, Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
- Sabah A Omar
- Kenya Medical Research Institute, Center for Biotechnology Research and Development, 54840-00200 Nairobi, Kenya
- Alessia Ranciaro
- Department of Genetics and Biology, School of Medicine and School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA
- Jibril B Hirbo
- Department of Genetics and Biology, School of Medicine and School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA
- Sarah A Tishkoff
- Department of Genetics and Biology, School of Medicine and School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA.
|
18
|
Fischer I, Camus-Kulandaivelu L, Allal F, Stephan W. Adaptation to drought in two wild tomato species: the evolution of the Asr gene family. New Phytol 2011; 190:1032-1044. [PMID: 21323928 DOI: 10.1111/j.1469-8137.2011.03648.x] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Indexed: 05/23/2023]
Abstract
Wild tomato species are a valuable system in which to study local adaptation to drought: they grow in diverse environments ranging from mesic to extremely arid conditions. Here, we investigate the evolution of members of the Asr (ABA/water stress/ripening induced) gene family, which have been reported to be involved in the water stress response. We analysed molecular variation in the Asr gene family in populations of two closely related species, Solanum chilense and Solanum peruvianum. We concluded that Asr1 has evolved under strong purifying selection. In contrast to previous reports, we did not detect evidence for positive selection at Asr2. However, Asr4 shows patterns consistent with local adaptation in an S. chilense population that lives in an extremely dry environment. We also discovered a new member of the gene family, Asr5. Our results show that the Asr genes constitute a dynamic gene family and provide an excellent example of tandemly arrayed genes that are of importance in adaptation. Taking the potential distribution of the species into account, it appears that S. peruvianum can cope with a great variety of environmental conditions without undergoing local adaptation, whereas S. chilense undergoes local adaptation more frequently.
Affiliation(s)
- Iris Fischer
- Section of Evolutionary Biology, Department of Biology II, University of Munich (LMU), Grosshaderner Strasse 2, 82152 Planegg-Martinsried, Germany
- Létizia Camus-Kulandaivelu
- CIRAD, Biological System Department - Research Unit 39 'Genetic Diversity and Breeding of Forest Tree Species', Campus international de Baillarguet TA A-39/C, 34398 Montpellier Cedex 5, France
- François Allal
- CIRAD, Biological System Department - Research Unit 39 'Genetic Diversity and Breeding of Forest Tree Species', Campus international de Baillarguet TA A-39/C, 34398 Montpellier Cedex 5, France
- Wolfgang Stephan
- Section of Evolutionary Biology, Department of Biology II, University of Munich (LMU), Grosshaderner Strasse 2, 82152 Planegg-Martinsried, Germany
|
19
|
The Rate and Tract Length of Gene Conversion between Duplicated Genes. Genes (Basel) 2011; 2:313-31. [PMID: 24710193 PMCID: PMC3924818 DOI: 10.3390/genes2020313] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Received: 02/17/2011] [Revised: 03/11/2011] [Accepted: 03/17/2011] [Indexed: 11/26/2022]
Abstract
Interlocus gene conversion occurs such that a certain length of DNA fragment is non-reciprocally transferred (copied and pasted) between paralogous regions. To understand the rate and tract length of gene conversion, there are two major approaches. One is based on mutation-accumulation experiments, and the other uses natural DNA sequence variation. In this review, we overview the two major approaches and discuss their advantages and disadvantages. In addition, to demonstrate the importance of statistical analysis of empirical and evolutionary data for estimating tract length, we apply a maximum likelihood method to several data sets.
|
20
|
Fawcett JA, Innan H. Neutral and non-neutral evolution of duplicated genes with gene conversion. Genes (Basel) 2011; 2:191-209. [PMID: 24710144 PMCID: PMC3924837 DOI: 10.3390/genes2010191] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 12/30/2010] [Revised: 01/20/2011] [Accepted: 02/12/2011] [Indexed: 01/11/2023]
Abstract
Gene conversion is one of the major mutational mechanisms involved in the DNA sequence evolution of duplicated genes. It contributes to creating unique patterns of DNA polymorphism within species and divergence between species. A typical pattern is so-called concerted evolution, in which the divergence between duplicates remains low for a long time because of frequent exchanges of DNA fragments. In addition, gene conversion affects the DNA evolution of duplicates in various ways, especially when selection operates. Here, we review theoretical models to understand the evolution of duplicates in both neutral and non-neutral cases. We also explain how these theories contribute to interpreting real polymorphism and divergence data by using some intriguing examples.
Affiliation(s)
- Jeffrey A Fawcett
- Graduate University for Advanced Studies, Hayama, Kanagawa 240-0193, Japan.
- Hideki Innan
- Graduate University for Advanced Studies, Hayama, Kanagawa 240-0193, Japan.
|
21
|
Diversity-enhancing selection acts on a female reproductive protease family in four subspecies of Drosophila mojavensis. Genetics 2011; 187:865-76. [PMID: 21212232 DOI: 10.1534/genetics.110.124743] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 12/30/2022]
Abstract
Protein components of the Drosophila male ejaculate are critical modulators of reproductive success, several of which are known to evolve rapidly. Recent evidence of adaptive evolution in female reproductive tract proteins suggests this pattern may reflect sexual selection at the molecular level. Here we explore the evolutionary dynamics of a five-paralog gene family of female reproductive proteases within geographically isolated subspecies of Drosophila mojavensis. Remarkably, four of five paralogs show exceptionally low differentiation between subspecies and unusually structured haplotypes that suggest the retention of old polymorphisms. These gene genealogies are accompanied by deviations from neutrality consistent with diversifying selection. While diversifying selection has been observed among the reproductive molecules of mammals and marine invertebrates, our study provides the first evidence of this selective regime in any Drosophila reproductive protein, male or female.
|
22
|
Genomic and Population-Level Effects of Gene Conversion in Caenorhabditis Paralogs. Genes (Basel) 2010; 1:452-68. [PMID: 24710096 PMCID: PMC3966223 DOI: 10.3390/genes1030452] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 10/25/2010] [Revised: 11/22/2010] [Accepted: 12/06/2010] [Indexed: 11/17/2022]
|
23
|
Rane HS, Smith JM, Bergthorsson U, Katju V. Gene conversion and DNA sequence polymorphism in the sex-determination gene fog-2 and its paralog ftr-1 in Caenorhabditis elegans. Mol Biol Evol 2010; 27:1561-9. [PMID: 20133352 DOI: 10.1093/molbev/msq039] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 12/24/2022]
Abstract
Gene conversion, a form of concerted evolution, bears enormous potential to shape the trajectory of sequence and functional divergence of gene paralogs subsequent to duplication events. fog-2, a sex-determination gene unique to Caenorhabditis elegans and implicated in the origin of hermaphroditism in this species, resulted from the duplication of ftr-1, an upstream gene of unknown function. Synonymous sequence divergence in regions of fog-2 and ftr-1 (excluding recent gene conversion tracts) suggests that the duplication occurred 46 million generations ago. Gene conversion between fog-2 and ftr-1 was previously discovered in experimental fog-2 knockout lines of C. elegans, whereby hermaphroditism was restored in mutant obligately outcrossing male-female populations. We analyzed DNA-sequence variation in fog-2 and ftr-1 within 40 isolates of C. elegans from diverse geographic locations in order to evaluate the contribution of gene conversion to genetic variation in the two gene paralogs. The analysis shows that gene conversion contributes significantly to DNA-sequence diversity in fog-2 and ftr-1 (22% and 34%, respectively) and may have the potential to alter sexual phenotypes in natural populations. A radical amino acid change in a conserved region of the F-box domain of fog-2 was found in natural isolates of C. elegans with significantly lower fecundity. We hypothesize that the lowered fecundity is due to reduced masculinization and less sperm production and that amino acid replacement substitutions and gene conversion in fog-2 may contribute significantly to variation in the degree of inbreeding and outcrossing in natural populations.
Affiliation(s)
- Hallie S Rane
- Department of Biology, University of New Mexico, NM, USA
|
24
|
The evolution of gene duplications: classifying and distinguishing between models. Nat Rev Genet 2010; 11:97-108. [PMID: 20051986 DOI: 10.1038/nrg2689] [Citation(s) in RCA: 907] [Impact Index Per Article: 60.5] [Indexed: 11/08/2022]
Abstract
Gene duplications and their subsequent divergence play an important part in the evolution of novel gene functions. Several models for the emergence, maintenance and evolution of gene copies have been proposed. However, a clear consensus on how gene duplications are fixed and maintained in genomes is lacking. Here, we present a comprehensive classification of the models that are relevant to all stages of the evolution of gene duplications. Each model predicts a unique combination of evolutionary dynamics and functional properties. Setting out these predictions is an important step towards identifying the main mechanisms that are involved in the evolution of gene duplications.
|
25
|
Abstract
Interlocus gene conversion can homogenize DNA sequences of duplicated regions with high homology. Such nonvertical events sometimes cause a misleading evolutionary interpretation of data when the effect of gene conversion is ignored. To avoid this problem, it is crucial to test the data for the presence of gene conversion. Here, we performed extensive simulations to compare four major methods to detect gene conversion. One might expect that the power increases with increase of the gene conversion rate. However, we found this is true for only two methods. For the other two, limited power is expected when gene conversion is too frequent. We suggest using multiple methods to minimize the chance of missing the footprint of gene conversion.
|
26
|
Takuno S, Innan H. Selection to maintain paralogous amino acid differences under the pressure of gene conversion in the heat-shock protein genes in yeast. Mol Biol Evol 2009; 26:2655-9. [PMID: 19745001 DOI: 10.1093/molbev/msp211] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022]
Abstract
A genome scan for the signatures of selection for paralogous functional amino acid differences was performed with yeast genomes. This recently developed method makes it possible to localize the target sites of selection under the pressure of gene conversion. We found that two gene pairs have strong signatures of selection. The two pairs of duplicated genes happened to be heat shock genes (Ssa1/Ssa2 and Ssb1/Ssb2), which have similar protein structures to each other, although the amino acid sequence identity between Ssa and Ssb is not high (approximately 60%). Interestingly, the two gene pairs exhibit signatures of selection at almost identical positions within the substrate-binding domain beta. Because this domain specifies the substrate polypeptides, it is presumed that functional divergence may be advantageous in this domain. Evolutionary analysis demonstrated that the observed divergence in the two gene pairs has been maintained in many yeast species independently, suggesting long-term operation of strong selection.
|
27
|
Population genetic models of duplicated genes. Genetica 2009; 137:19-37. [PMID: 19266289 DOI: 10.1007/s10709-009-9355-1] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.1] [Received: 11/13/2008] [Accepted: 12/28/2008] [Indexed: 01/08/2023]
Abstract
Various population genetic models of duplicated genes are introduced. The problems covered in this review include the fixation process of a duplicated copy, copy number polymorphism, the fates of duplicated genes and single nucleotide polymorphism in duplicated genes. Because of increasing evidence for concerted evolution by gene conversion, this review introduces recently developed gene conversion models. In the first half, models assuming independent evolution of duplicated genes are introduced, and then the effect of gene conversion is considered in the second half.
|
28
|
Duplication, selection and gene conversion in a Drosophila mojavensis female reproductive protein family. Genetics 2009; 181:1451-65. [PMID: 19204376 DOI: 10.1534/genetics.108.099044] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Indexed: 01/22/2023]
Abstract
Protein components of the Drosophila male ejaculate, several of which evolve rapidly, are critical modulators of reproductive success. Recent studies of female reproductive tract proteins indicate they also are extremely divergent between species, suggesting that reproductive molecules may coevolve between the sexes. Our current understanding of intersexual coevolution, however, is severely limited by the paucity of genetic and evolutionary studies on the female molecules involved. Physiological evidence of ejaculate-female coadaptation, paired with a promiscuous mating system, makes Drosophila mojavensis an exciting model system in which to study the evolution of reproductive proteins. Here we explore the evolutionary dynamics of a five-paralog gene family of female reproductive proteases within populations of D. mojavensis and throughout the repleta species group. We show that the proteins have experienced ongoing gene duplication and adaptive evolution and further exhibit dynamic patterns of pseudogenation, copy number variation, gene conversion, and selection within geographically isolated populations of D. mojavensis. The integration of these patterns in a single gene family has never before been documented in a reproductive protein.
|
29
|
Doyle JJ, Flagel LE, Paterson AH, Rapp RA, Soltis DE, Soltis PS, Wendel JF. Evolutionary genetics of genome merger and doubling in plants. Annu Rev Genet 2009; 42:443-61. [PMID: 18983261 DOI: 10.1146/annurev.genet.42.110807.091524] [Citation(s) in RCA: 439] [Impact Index Per Article: 27.4] [Indexed: 11/09/2022]
Abstract
Polyploidy is a common mode of evolution in flowering plants. The profound effects of polyploidy on gene expression appear to be caused more by hybridity than by genome doubling. Epigenetic mechanisms underlying genome-wide changes in expression are as yet poorly understood; only methylation has received much study, and its importance varies among polyploids. Genetic diploidization begins with the earliest responses to genome merger and doubling; less is known about chromosomal diploidization. Polyploidy duplicates every gene in the genome, providing the raw material for divergence or partitioning of function in homoeologous copies. Preferential retention or loss of genes occurs in a wide range of taxa, suggesting that there is an underlying set of principles governing the fates of duplicated genes. Further studies are required for general patterns to be elucidated, involving different plant families, kinds of polyploidy, and polyploids of different ages.
Affiliation(s)
- Jeff J Doyle
- Department of Plant Biology, Cornell University, Ithaca, New York 14850, USA.
|
30
|
Hollox EJ, Barber JCK, Brookes AJ, Armour JAL. Defensins and the dynamic genome: what we can learn from structural variation at human chromosome band 8p23.1. Genome Res 2009; 18:1686-97. [PMID: 18974263 DOI: 10.1101/gr.080945.108] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.4] [Indexed: 12/30/2022]
Abstract
Over the past four years, genome-wide studies have uncovered numerous examples of structural variation in the human genome. This includes structural variation that changes copy number, such as deletion and duplication, and structural variation that does not change copy number, such as orientation and positional polymorphism. One region that contains all these types of variation spans the chromosome band 8p23.1. This region has been studied in some depth, and the focus of this review is to examine our current understanding of the variation of this region. We also consider whether this region is a good model for other structurally variable regions in the genome and what the implications of this variation are for clinical studies. Finally, we discuss the bioinformatics challenges raised, discuss the evolution of the region, and suggest some future priorities for structural variation research.
Affiliation(s)
- Edward J Hollox
- Department of Genetics, University of Leicester, Leicester LE1 7RH, United Kingdom.
|
31
|
Osada N, Innan H. Duplication and gene conversion in the Drosophila melanogaster genome. PLoS Genet 2008; 4:e1000305. [PMID: 19079581 PMCID: PMC2588116 DOI: 10.1371/journal.pgen.1000305] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.4] [Received: 08/12/2008] [Accepted: 11/12/2008] [Indexed: 11/18/2022]
Abstract
Using the genomic sequences of Drosophila melanogaster subgroup, the pattern of gene duplications was investigated with special attention to interlocus gene conversion. Our fine-scale analysis with careful visual inspections enabled accurate identification of a number of duplicated blocks (genomic regions). The orthologous parts of those duplicated blocks were also identified in the D. simulans and D. sechellia genomes, by which we were able to clearly classify the duplicated blocks into post- and pre-speciation blocks. We found 31 post-speciation duplicated genes, from which the rate of gene duplication (from one copy to two copies) is estimated to be 1.0 × 10^-9 per single-copy gene per year. The role of interlocus gene conversion was observed in several respects in our data: (1) synonymous divergence between a duplicated pair is overall very low. Consequently, the gene duplication rate would be seriously overestimated by counting duplicated genes with low divergence; (2) the sizes of young duplicated blocks are generally large. We postulate that the degeneration of gene conversion around the edges could explain the shrinkage of "identifiable" duplicated regions; and (3) elevated paralogous divergence is observed around the edges in many duplicated blocks, supporting our gene conversion-degeneration model. Our analysis demonstrated that gene conversion between duplicated regions is a common and genome-wide phenomenon in the Drosophila genomes, and that its role should be especially significant in the early stages of duplicated genes. Based on a population genetic prediction, we applied a new genome-scan method to test for signatures of selection for neofunctionalization and found a strong signature in a pair of transporter genes.
Affiliation(s)
- Naoki Osada
- National Institute of Biomedical Innovation, Osaka, Japan
- Graduate University for Advanced Studies, Hayama, Japan
- Hideki Innan
- National Institute of Biomedical Innovation, Osaka, Japan
- Graduate University for Advanced Studies, Hayama, Japan
|
32
|
Abstract
The effect of directional selection on the fixation process of a single mutation that spreads in a multigene family by gene conversion is investigated. A simple two-locus model with two alleles, A and a, is first considered in a random-mating diploid population with size N. There are four haplotypes, AA, Aa, aA, and aa, and selection works on the number of alleles A in a diploid (i = 0, 1, 2, 3, 4). Because gene conversion is allowed between the two loci, when the mutation rate is very low, either AA or aa will fix in the population eventually. We consider a situation where a single mutant, A, arises in one locus when a is fixed in both loci. Then, we derive the fixation probability analytically, and the fixation time is investigated by simulations. It is found that gene conversion has an effect to increase the "effective" population size, so that weak selection works more efficiently in a multigene family. With these results, we discuss the effect of gene conversion on the rate of molecular evolution in a multigene family undergoing concerted evolution. We also argue about the applicability of the theoretical results to models of multigene families with more than two loci.
|
33
|
Abstract
Interlocus gene conversion is considered a crucial mechanism for generating novel combinations of polymorphisms in duplicated genes. The importance of gene conversion between duplicated genes has been recognized in the major histocompatibility complex and self-incompatibility genes, which are likely subject to diversifying selection. To theoretically understand the potential role of gene conversion in such situations, forward simulations are performed in various two-locus models. The results show that gene conversion could significantly increase the number of haplotypes when diversifying selection works on both loci. We find that the tract length of gene conversion is an important factor to determine the efficacy of gene conversion: shorter tract lengths can more effectively generate novel haplotypes given the gene conversion rate per site is the same. Similar results are also obtained when one of the duplicated genes is assumed to be a pseudogene. It is suggested that a duplicated gene, even after being silenced, will contribute to increasing the variability in the other locus through gene conversion. Consequently, the fixation probability and longevity of duplicated genes increase under the presence of gene conversion. On the basis of these findings, we propose a new scenario for the preservation of a duplicated gene: when the original donor gene is under diversifying selection, a duplicated copy can be preserved by gene conversion even after it is pseudogenized.
|
34
|
Cornman RS, Willis JH. Extensive gene amplification and concerted evolution within the CPR family of cuticular proteins in mosquitoes. Insect Biochem Mol Biol 2008; 38:661-76. [PMID: 18510978 PMCID: PMC4276373 DOI: 10.1016/j.ibmb.2008.04.001] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Received: 02/26/2008] [Revised: 03/27/2008] [Accepted: 04/03/2008] [Indexed: 05/03/2023]
Abstract
Annotation of the Anopheles gambiae genome has revealed a large increase in the number of genes encoding cuticular proteins with the Rebers and Riddiford Consensus (the CPR gene family) relative to Drosophila melanogaster. This increase reflects an expansion of the RR-2 group of CPR genes, particularly the amplification of sets of highly similar paralogs. Patterns of nucleotide variation indicate that extensive concerted evolution is occurring within these clusters. The pattern of concerted evolution is complex, however, as sequence similarity within clusters is uncorrelated with gene order and orientation, and no comparable clusters occur within similarly compact arrays of the RR-1 group in mosquitoes or in either group in D. melanogaster. The dearth of pseudogenes suggests that sequence clusters are maintained by selection for high gene-copy number, perhaps due to selection for high expression rates. This hypothesis is consistent with the apparently parallel evolution of compact gene architectures within sequence clusters relative to single-copy genes. We show that RR-2 proteins from sequence-cluster genes have complex repeats and extreme amino-acid compositions relative to single-copy CPR proteins in An. gambiae, and that the amino-acid composition of the N-terminal and C-terminal sequence flanking the chitin-binding consensus region evolves in a correlated fashion.
Affiliation(s)
- R Scott Cornman
- Department of Cellular Biology, University of Georgia, Athens, GA 30602
- Judith H Willis
- Department of Cellular Biology, University of Georgia, Athens, GA 30602
|
35
|
Sjödin P, Hedman H, Kruskopf Österberg M, Gustafsson S, Lagercrantz U, Lascoux M. Polymorphism and Divergence at Three Duplicate Genes in Brassica nigra. J Mol Evol 2008; 66:581-90. [DOI: 10.1007/s00239-008-9108-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 10/31/2006] [Revised: 02/29/2008] [Accepted: 04/09/2008] [Indexed: 10/22/2022]
|
36
|
Evidence that strong positive selection drives neofunctionalization in the tandemly duplicated polyhomeotic genes in Drosophila. Proc Natl Acad Sci U S A 2008; 105:5447-52. [PMID: 18381818 DOI: 10.1073/pnas.0710892105] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.9] [Indexed: 11/18/2022]
Abstract
The polyhomeotic (ph) locus in Drosophila melanogaster consists of the two tandemly duplicated genes ph-d (distal) and ph-p (proximal). They code for transcriptional repressors belonging to the Polycomb group proteins, which regulate homeotic genes and hundreds of other loci. Although the duplication of ph occurred at least 25 million to 30 million years ago, both copies are very similar to each other at both the DNA and the protein levels, probably because of the action of frequent gene conversion. Despite this homogenizing force, differential regulation of both transcriptional units suggests that the functions of the duplicates have begun to diverge. Here, we provide evidence that this functional divergence is driven by positive selection. Based on resequencing of an approximately 30-kb region around the ph locus in an African sample of D. melanogaster X chromosomes, we identified a selective sweep, estimated its age and the strength of selection, and mapped the target of selection to a narrow interval of the ph-p gene. This noncoding region contains a large intron with several regulatory elements that are absent in the ph-d duplicate. Our results suggest that neofunctionalization has been achieved in the Drosophila ph genes through the action of strong positive selection and the inactivation of gene conversion in part of the gene.
|
37
|
Abstract
Neofunctionalization occurs when a neofunctionalized allele is fixed in one of duplicated genes. This is a simple fixation process if duplicated genes accumulate mutations independently. However, the process is very complicated when duplicated genes undergo concerted evolution by gene conversion. Our simulations demonstrate that the process could be described with three distinct stages. First, a newly arisen neofunctionalized allele increases in frequency by selection, but gene conversion prevents its complete fixation. These two factors (selection and gene conversion) that work in opposite directions create an equilibrium, and the time during which the frequency of the neofunctionalized allele drifts around the equilibrium value is called the temporal equilibrium stage. During this temporal equilibrium stage, it is possible that gene conversion is inactivated by mutations, which allow the complete fixation of the neofunctionalized allele. And then, permanent neofunctionalization is achieved. This article develops basic population genetics theories on the process to permanent neofunctionalization under the pressure of gene conversion. We obtain the probability and time that the frequency of a newly arisen neofunctionalized allele reaches the equilibrium value. It is also found that during the temporal equilibrium stage, selection exhibits strong signature in the divergence in the DNA sequences between the duplicated genes. The spatial distribution of the divergence likely has a peak around the site targeted by selection. We provide an analytical expression of the pattern of divergence and apply it to the human red- and green-opsin genes. The theoretical prediction well fits the data when we assume that selection is operating for the two amino acid differences in exon 5, which are believed to account for the major part of the functional difference between the red and green opsins.
|
38
|
Abstract
When a microsatellite locus is duplicated in a diploid organism, a single pair of PCR primers may amplify as many as four distinct alleles. To study the evolution of a duplicated microsatellite, we consider a coalescent model with symmetric stepwise mutation. Conditional on the time of duplication and a mutation rate, both in a model of completely unlinked loci and in a model of completely linked loci, we compute the probabilities for a sampled diploid individual to amplify one, two, three, or four distinct alleles with one pair of microsatellite PCR primers. These probabilities are then studied to examine the nature of their dependence on the duplication time and the mutation rate. The mutation rate is observed to have a stronger effect than the duplication time on the four probabilities, and the unlinked and linked cases are seen to behave similarly. Our results can be useful for helping to interpret genetic variation at microsatellite loci in species with a very recent history of gene and genome duplication.
|
39
|
Thornton KR. The neutral coalescent process for recent gene duplications and copy-number variants. Genetics 2007; 177:987-1000. [PMID: 17720930 PMCID: PMC2034660 DOI: 10.1534/genetics.107.074948] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/18/2022]
Abstract
I describe a method for simulating samples from gene families of size two under a neutral coalescent process, for the case where the duplicate gene either has fixed recently in the population or is still segregating. When a duplicate locus has recently fixed by genetic drift, diversity in the new gene is expected to be reduced, and an excess of rare alleles is expected, relative to the predictions of the standard coalescent model. The expected patterns of polymorphism in segregating duplicates ("copy-number variants") depend both on the frequency of the duplicate in the sample and on the rate of crossing over between the two loci. When the crossover rate between the ancestral gene and the copy-number variant is low, the expected pattern of variability in the ancestral gene will be similar to the predictions of models of either balancing or positive selection, if the frequency of the duplicate in the sample is intermediate or high, respectively. Simulations are used to investigate the effect of crossing over between loci, and gene conversion between the duplicate loci, on levels of variability and the site-frequency spectrum.
Affiliation(s)
- Kevin R Thornton
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697, USA.
|
40
|
Storz JF, Baze M, Waite JL, Hoffmann FG, Opazo JC, Hayes JP. Complex signatures of selection and gene conversion in the duplicated globin genes of house mice. Genetics 2007; 177:481-500. [PMID: 17660536 PMCID: PMC2013706 DOI: 10.1534/genetics.107.078550] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Indexed: 11/18/2022]
Abstract
Results of electrophoretic surveys have suggested that hemoglobin polymorphism may be maintained by balancing selection in natural populations of house mice, Mus musculus. Here we report a survey of nucleotide variation in the adult globin genes of house mice from South America. We surveyed nucleotide polymorphism in two closely linked alpha-globin paralogs and two closely linked beta-globin paralogs to test whether patterns of variation are consistent with a model of long-term balancing selection. Surprisingly high levels of nucleotide polymorphism at the two beta-globin paralogs were attributable to the segregation of two highly divergent haplotypes, Hbbs (which carries two identical beta-globin paralogs) and Hbbd (which carries two functionally divergent beta-globin paralogs). Interparalog gene conversion on the Hbbs haplotype has produced a highly unusual situation in which the two paralogs are more similar to one another than either one is to its allelic counterpart on the Hbbd haplotype. Levels of nucleotide polymorphism and linkage disequilibrium at the two beta-globin paralogs suggest a complex history of diversity-enhancing selection that may be responsible for long-term maintenance of alternative protein alleles. The alternative two-locus beta-globin haplotypes are associated with pronounced differences in intraerythrocyte glutathione and nitric oxide metabolism, suggesting a possible mechanism for selection on hemoglobin function.
Affiliation(s)
- Jay F Storz
- School of Biological Sciences, University of Nebraska, Lincoln, Nebraska 68588, USA.
|
41
|
Kondrashov FA, Gurbich TA, Vlasov PK. Selection for functional uniformity of tuf duplicates in gamma-proteobacteria. Trends Genet 2007; 23:215-8. [PMID: 17383049 DOI: 10.1016/j.tig.2007.03.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 08/28/2006] [Revised: 02/14/2007] [Accepted: 03/07/2007] [Indexed: 10/23/2022]
Abstract
Having an extra copy of a gene is thought to provide some functional redundancy, which results in a higher rate of evolution in duplicated genes. In this article, we estimate the impact of gene duplication on the selection of tuf paralogs, and we find that in the absence of gene conversion, tuf paralogs have evolved significantly slower than when gene conversion has been a factor in their evolution. Thus, tuf gene copies evolve under a selective pressure that ensures their functional uniformity, and gene conversion reduces selection against amino acid substitutions that affect the function of the encoded protein, EF-Tu.
Affiliation(s)
- Fyodor A Kondrashov
- Section on Ecology, Behavior and Evolution, Division of Biological Sciences, University of California at San Diego, 2218 Muir Biology Building, La Jolla, CA 92093, USA.
|
42
|
Hallast P, Rull K, Laan M. The evolution and genomic landscape of CGB1 and CGB2 genes. Mol Cell Endocrinol 2007; 260-262:2-11. [PMID: 17055150 PMCID: PMC2599907 DOI: 10.1016/j.mce.2005.11.049] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Received: 08/12/2005] [Accepted: 11/28/2005] [Indexed: 10/31/2022]
Abstract
The origin of completely novel proteins is a significant question in evolution. The luteinizing hormone (LHB)/chorionic gonadotropin (CGB) gene cluster in humans contains a candidate example of this process. Two genes in this cluster (CGB1 and CGB2) exhibit nucleotide sequence similarity with the other LHB/CGB genes, but as a result of frameshifting are predicted to encode a completely novel protein. Our analysis of these genes from humans and related primates indicates a recent origin in the lineage specific to humans and African great apes. While the function of these genes is not yet known, they are strongly conserved between human and chimpanzee and exhibit three-fold lower diversity than LHB across human populations with no mutations that would disrupt the coding sequence. The 5'-upstream region of CGB1/2 contains most of the promoter sequence of hCGbeta plus a novel region proximal to the putative transcription start site. In silico prediction of putative transcription factor binding sites supports the hypothesis that CGB1 and CGB2 gene products are expressed in, and may contribute to, implantation and placental development.
Affiliation(s)
- Pille Hallast
- Department of Biotechnology, Institute of Molecular and Cell Biology, University of Tartu, Riia 23, 51010 Tartu, Estonia
- Kristiina Rull
- Department of Biotechnology, Institute of Molecular and Cell Biology, University of Tartu, Riia 23, 51010 Tartu, Estonia
- Department of Obstetrics and Gynecology, University of Tartu, Estonia
- Maris Laan
- Department of Biotechnology, Institute of Molecular and Cell Biology, University of Tartu, Riia 23, 51010 Tartu, Estonia
|
43
|
Arguello JR, Chen Y, Yang S, Wang W, Long M. Origination of an X-linked testes chimeric gene by illegitimate recombination in Drosophila. PLoS Genet 2006; 2:e77. [PMID: 16715176 PMCID: PMC1463047 DOI: 10.1371/journal.pgen.0020077] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.4] [Received: 01/26/2006] [Accepted: 04/05/2006] [Indexed: 12/02/2022]
Abstract
The formation of chimeric gene structures provides important routes by which novel proteins and functions are introduced into genomes. Signatures of these events have been identified in organisms from wide phylogenic distributions. However, the ability to characterize the early phases of these evolutionary processes has been difficult due to the ancient age of the genes or to the limitations of strictly computational approaches. While examples involving retrotransposition exist, our understanding of chimeric genes originating via illegitimate recombination is limited to speculations based on ancient genes or transfection experiments. Here we report a case of a young chimeric gene that has originated by illegitimate recombination in Drosophila. This gene was created within the last 2–3 million years, prior to the speciation of Drosophila simulans, Drosophila sechellia, and Drosophila mauritiana. The duplication, which involved the Bällchen gene on Chromosome 3R, was partial, removing substantial 3′ coding sequence. Subsequent to the duplication onto the X chromosome, intergenic sequence was recruited into the protein-coding region creating a chimeric peptide with ~ 33 new amino acid residues. In addition, a novel intron-containing 5′ UTR and novel 3′ UTR evolved. We further found that this new X-linked gene has evolved testes-specific expression. Following speciation of the D. simulans complex, this novel gene evolved lineage-specifically with evidence for positive selection acting along the D. simulans branch. Illegitimate recombination, the non-homologous recombination that occurs between DNA sequences with few or no identical nucleotides, is a general phenomenon that has been known to cause many medically important deleterious changes. However, little is known about the positive side of such a process. For example, little is known about its relative role in the origin of new gene functions that confer increased fitness to species. 
This work contributes to the understanding of the significance of this process. Here the authors report on a young chimeric gene that has originated by illegitimate recombination in Drosophila. The term “chimeric gene” refers to gene structures—both coding and noncoding—which have been generated from distinct parental loci. This chimeric gene was created within the last 2–3 million years, prior to the speciation of Drosophila simulans, Drosophila sechellia, and Drosophila mauritiana. A gene on Chromosome 3R was duplicated onto the X chromosome and recruited intergenic sequence, creating a chimeric peptide. It was found that this new X-linked gene has evolved testes-specific expression. Following speciation of the D. simulans complex, this novel gene evolved lineage-specifically under positive Darwinian selection.
Affiliation(s)
- J. Roman Arguello
- Committee on Evolutionary Biology, University of Chicago, Chicago, Illinois, United States of America
- Ying Chen
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
- Shuang Yang
- Chinese Academy of Sciences–Max Planck Junior Scientist Group, Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, Kunming, Yunnan, China
- Wen Wang
- Chinese Academy of Sciences–Max Planck Junior Scientist Group, Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, Kunming, Yunnan, China
- To whom correspondence should be addressed.
- Manyuan Long
- Committee on Evolutionary Biology, University of Chicago, Chicago, Illinois, United States of America
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
- To whom correspondence should be addressed.
|
44
|
Abstract
There are a number of polymorphism-based statistical tests of neutrality, but most of them focus on either the amount or the pattern of polymorphism. In this article, a new test called the two-dimensional (2D) test is developed. This test evaluates a pair of summary statistics in a two-dimensional field. One statistic should summarize the pattern of polymorphism, while the other could be a measure of the level of polymorphism. For the latter summary statistic, the polymorphism-divergence ratio is used following the idea of the Hudson-Kreitman-Aguadé (HKA) test. To incorporate the HKA test in the 2D test, a summary statistic-based version of the HKA test is developed such that the polymorphism-divergence ratio at a particular region of interest is examined if it is consistent with the average of those in other independent regions.
Affiliation(s)
- Hideki Innan
- Human Genetics Center, School of Public Health, University of Texas Health Science Center, Houston, Texas 77030, USA.
|
45
|
Chapman BA, Bowers JE, Feltus FA, Paterson AH. Buffering of crucial functions by paleologous duplicated genes may contribute cyclicality to angiosperm genome duplication. Proc Natl Acad Sci U S A 2006; 103:2730-5. [PMID: 16467140 PMCID: PMC1413778 DOI: 10.1073/pnas.0507782103] [Citation(s) in RCA: 133] [Impact Index Per Article: 7.0] [Indexed: 11/18/2022]
Abstract
Genome duplication followed by massive gene loss has permanently shaped the genomes of many higher eukaryotes, particularly angiosperms. It has long been believed that a primary advantage of genome duplication is the opportunity for the evolution of genes with new functions by modification of duplicated genes. If so, then patterns of genetic diversity among strains within taxa might reveal footprints of selection that are consistent with this advantage. Contrary to classical predictions that duplicated genes may be relatively free to acquire unique functionality, we find among both Arabidopsis ecotypes and Oryza subspecies that SNPs encode less radical amino acid changes in genes for which there exists a duplicated copy at a "paleologous" locus than in "singleton" genes. Preferential retention of duplicated genes encoding long complex proteins and their unexpectedly slow divergence (perhaps because of homogenization) suggest that a primary advantage of retaining duplicated paleologs may be the buffering of crucial functions. Functional buffering and functional divergence may represent extremes in the spectrum of duplicated gene fates. Functional buffering may be especially important during "genomic turmoil" immediately after genome duplication but continues to act approximately 60 million years later, and its gradual deterioration may contribute cyclicality to genome duplication in some lineages.
Affiliation(s)
- Brad A. Chapman
- Plant Genome Mapping Laboratory and Department of Plant Biology
- Andrew H. Paterson
- Plant Genome Mapping Laboratory and Departments of Plant Biology, Genetics, and Crop and Soil Science, University of Georgia, Athens, GA 30602
- To whom correspondence should be addressed at: Plant Genome Mapping Laboratory, University of Georgia, 111 Riverbend Road, Athens, GA 30602.
|
46
|
Hallast P, Nagirnaja L, Margus T, Laan M. Segmental duplications and gene conversion: Human luteinizing hormone/chorionic gonadotropin beta gene cluster. Genome Res 2005; 15:1535-46. [PMID: 16251463 PMCID: PMC1310641 DOI: 10.1101/gr.4270505] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.0] [Received: 03/26/2005] [Accepted: 09/06/2005] [Indexed: 11/24/2022]
Abstract
Segmental duplicons (>1 kb) of high sequence similarity (>90%) covering >5% of the human genome are characterized by complex sequence variation. Apart from a few well-characterized regions (MHC, beta-globin), the diversity and linkage disequilibrium (LD) patterns of duplicons and the role of gene conversion in shaping them have been poorly studied. To shed light on these issues, we have re-sequenced the human Luteinizing Hormone/Chorionic Gonadotropin beta (LHB/CGB) cluster (19q13.32) of three population samples (Estonians, Mandenka, and Han). The LHB/CGB cluster consists of seven duplicated genes critical in human reproduction. In the LHB/CGB region, high sequence diversity, concentration of gene-conversion acceptor sites, and strong LD colocalize with peripheral genes, whereas central loci are characterized by lower variation, gene-conversion donor activity, and breakdown of LD between close markers. The data highlight an important role of gene conversion in spreading polymorphisms among duplicon copies and generating LD around them. The directionality of gene-conversion events seems to be determined by the localization of a predicted recombination "hotspot" and "warm spot" in the vicinity of the most active acceptor genes at the periphery of the cluster. The data suggest that enriched crossover activity in direct and inverted segmental repeats is in accordance with the formation of palindromic secondary structures promoting double-strand breaks rather than fixed DNA sequence motifs. Also, this first detailed coverage of sequence diversity and structure of the LHB/CGB gene cluster will pave the way for studying the identified polymorphisms as well as potential genomic rearrangements in association with an individual's reproductive success.
Affiliation(s)
- Pille Hallast
- Institute of Molecular and Cell Biology, University of Tartu, Riia 23, 51010 Tartu, Estonia
|
47
|
Sugino RP, Innan H. Estimating the time to the whole-genome duplication and the duration of concerted evolution via gene conversion in yeast. Genetics 2005; 171:63-9. [PMID: 15972458 PMCID: PMC1456531 DOI: 10.1534/genetics.105.043869] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.3] [Received: 04/06/2005] [Accepted: 06/13/2005] [Indexed: 11/18/2022]
Abstract
A maximum-likelihood (ML) method is developed to estimate the duration of concerted evolution and the time to the whole-genome duplication (WGD) event in baker's yeast (Saccharomyces cerevisiae). The models with concerted evolution fit the data significantly better than the molecular clock model, indicating a crucial role of concerted evolution via gene conversion after gene duplication in yeast. Our ML estimate of the time to the WGD is nearly identical to the time to the speciation event between S. cerevisiae and Kluyveromyces waltii, suggesting that the WGD occurred in very early stages after speciation or the WGD might have been involved in the speciation event.
Affiliation(s)
- Ryuichi P Sugino
- Human Genetics Center, School of Public Health, University of Texas Health Science Center, Houston 77030, USA
|
48
|
Abstract
The gene duplication rate in the yeast genome is estimated without assuming the molecular clock model to be approximately 0.01 to 0.06 per gene per billion years; this rate is two orders of magnitude lower than a previous estimate based on the molecular clock model. This difference is explained by extensive concerted evolution via gene conversion between duplicated genes, which violates the assumption of the molecular clock in the analyses of duplicated genes. The average length of the period of concerted evolution and the gene conversion rate are estimated to be approximately 25 million years and approximately 28 times the mutation rate, respectively.
Affiliation(s)
- Li-Zhi Gao
- Human Genetics Center, School of Public Health, University of Texas Health Science Center, 1200 Herman Pressler, Houston, TX 77030, USA
|
49
|
Abstract
Nonindependent evolution of duplicated genes is called concerted evolution. In this article, we study the evolutionary process of duplicated regions that involves concerted evolution. The model incorporates mutation and gene conversion: the former increases d, the divergence between two duplicated regions, while the latter decreases d. It is demonstrated that the process consists of three phases. Phase I is the time until d reaches its equilibrium value, d(0). In phase II d fluctuates around d(0), and d increases again in phase III. Our simulation results demonstrate that the length of concerted evolution (i.e., phase II) is highly variable, while the lengths of the other two phases are relatively constant. It is also demonstrated that the length of phase II approximately follows an exponential distribution with mean tau, which is a function of many parameters including gene conversion rate and the length of gene conversion tract. On the basis of these findings, we obtain the probability distribution of the level of divergence between a pair of duplicated regions as a function of time, mutation rate, and tau. Finally, we discuss potential problems in genomic data analysis of duplicated genes when it is based on the molecular clock but concerted evolution is common.
Affiliation(s)
- Kosuke M Teshima
- Center for Genome Information, College of Medicine, University of Cincinnati, Cincinnati, Ohio 45267, USA
|
50
|
Verrelli BC, Tishkoff SA. Signatures of selection and gene conversion associated with human color vision variation. Am J Hum Genet 2004; 75:363-75. [PMID: 15252758 PMCID: PMC1182016 DOI: 10.1086/423287] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.7] [Received: 03/08/2004] [Accepted: 06/10/2004] [Indexed: 11/03/2022]
Abstract
Trichromatic color vision in humans results from the combination of red, green, and blue photopigment opsins. Although color vision genes have been the targets of active molecular and psychophysical research on color vision abnormalities, little is known about patterns of normal genetic variation in these genes among global human populations. The current study presents nucleotide sequence analyses and tests of neutrality for a 5.5-kb region of the X-linked long-wave "red" opsin gene (OPN1LW) in 236 individuals from ethnically diverse human populations. Our analysis of the recombination landscape across OPN1LW reveals an unusual haplotype structure associated with amino acid replacement variation in exon 3 that is consistent with gene conversion. Compared with the absence of OPN1LW amino acid replacement fixation since divergence from chimpanzee, the human population exhibits a significant excess of high-frequency OPN1LW replacements. Our results suggest that subtle changes in L-cone opsin wavelength absorption may have been adaptive during human evolution.
Affiliation(s)
- Brian C Verrelli
- Department of Biology, University of Maryland, College Park 20742, USA
|