1
Charlesworth D, Zhang Y, Bergero R, Graham C, Gardner J, Yong L. Using GC Content to Compare Recombination Patterns on the Sex Chromosomes and Autosomes of the Guppy, Poecilia reticulata, and Its Close Outgroup Species. Mol Biol Evol 2021; 37:3550-3562. [PMID: 32697821 DOI: 10.1093/molbev/msaa187] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 12/26/2022] Open
Abstract
Genetic and physical mapping of the guppy (Poecilia reticulata) have shown that recombination patterns differ greatly between males and females. Crossover events occur evenly across the chromosomes in females, but in male meiosis they are restricted to the tip furthest from the centromere of each chromosome, creating very high recombination rates per megabase, as in pseudoautosomal regions of mammalian sex chromosomes. We used GC content to indirectly infer recombination patterns on guppy chromosomes, based on evidence that recombination is associated with GC-biased gene conversion, so that genome regions with high recombination rates should be detectable by high GC content. We used intron sequences and third positions of codons to make comparisons between sequences that are matched, as far as possible, and are all probably under weak selection. Almost all guppy chromosomes, including the sex chromosome (LG12), have very high GC values near their assembly ends, suggesting high recombination rates due to strong crossover localization in male meiosis. Our test does not suggest that the guppy XY pair has stronger crossover localization than the autosomes, or than the homologous chromosome in the close relative, the platyfish (Xiphophorus maculatus). We therefore conclude that the guppy XY pair has not recently undergone an evolutionary change to a different recombination pattern, or reduced its crossover rate, but that the guppy evolved Y-linkage due to acquiring a male-determining factor that also conferred the male crossover pattern. We also identify the centromere ends of guppy chromosomes, which were not determined in the genome assembly.
Affiliation(s)
- Deborah Charlesworth
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Yexin Zhang
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Roberta Bergero
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Chay Graham
  - Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Jim Gardner
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Lengxob Yong
  - Centre for Ecology and Conservation, University of Exeter, Falmouth, Cornwall, United Kingdom
2
Korunes KL, Noor MAF. Pervasive gene conversion in chromosomal inversion heterozygotes. Mol Ecol 2018; 28:1302-1315. [PMID: 30387889 DOI: 10.1111/mec.14921] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 07/31/2018] [Revised: 09/27/2018] [Accepted: 10/22/2018] [Indexed: 12/30/2022]
Abstract
Chromosomal inversions shape recombination landscapes, and species differing by inversions may exhibit reduced gene flow in these regions of the genome. Though single crossovers within inversions are not usually recovered from inversion heterozygotes, the recombination barrier imposed by inversions is nuanced by noncrossover gene conversion. Here, we provide a genomewide empirical analysis of gene conversion rates both within species and in species hybrids. We estimate that gene conversion occurs at a rate of 1 × 10⁻⁵ to 2.5 × 10⁻⁵ converted sites per bp per generation in experimental crosses within Drosophila pseudoobscura and between D. pseudoobscura and its naturally hybridizing sister species D. persimilis. This analysis is the first direct empirical assessment of gene conversion rates within inversions of a species hybrid. Our data show that gene conversion rates in interspecies hybrids are at least as high as within-species estimates of gene conversion rates, and gene conversion occurs regularly within and around inverted regions of species hybrids, even near inversion breakpoints. We also found that several gene conversion events appeared to be mitotic rather than meiotic in origin. Finally, we observed that gene conversion rates are higher in regions of lower local sequence divergence, yet our observed gene conversion rates in more divergent inverted regions were at least as high as in less divergent collinear regions. Given our observed high rates of gene conversion despite the sequence differentiation between species, especially in inverted regions, gene conversion has the potential to reduce the efficacy of inversions as barriers to recombination over evolutionary time.
3
Chudasama DY, Aladag Z, Felicien MI, Hall M, Beeson J, Asadi N, Gidron Y, Karteris E, Anikin VB. Prognostic value of the DNA integrity index in patients with malignant lung tumors. Oncotarget 2018; 9:21281-21288. [PMID: 29765538 PMCID: PMC5940399 DOI: 10.18632/oncotarget.25086] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 08/12/2017] [Accepted: 03/06/2018] [Indexed: 11/25/2022] Open
Abstract
Introduction: Lung cancer survival remains poor in the western world due to late presentation in most cases, leading to difficulty of treatment in these advanced and metastatic patients. Therefore, the development of a robust biomarker for prognosis and to monitor treatment response and relapse would be of great benefit. The use of Alu repeats and DNA Integrity Index has been shown to hold both diagnostic and prognostic value, and as it is obtained from the plasma of patients, it can serve as a non-invasive tool for routine monitoring. This study evaluates the efficiency of this technique in malignant lung cancer patients. Methods: Plasma samples were collected from 48 patients, consisting of 29 lung cancer patients and 19 non-cancer controls. Alu repeat ratio and confounders were measured. Results: Observations showed a higher Alu repeat ratio amongst the cancer group compared to controls (p=0.035), mean Alu ratio 0.38 (range 0.01-0.93) and 0.22 (0.007-0.44) respectively, ROC curve analysis AUC 0.61 (p=0.22). Analysis by staging was more promising, whereby a higher DNA Integrity Index was seen in advanced cases compared to both early stage and controls, p<0.0001; AUC: 0.92 (p=0.0002) and p=0.0006; AUC: 0.88 (p=0.0007) respectively; however, no significant difference was observed in the early stage compared to controls. Short term survival data also showed a DNA Integrity Index of >0.5 to be associated with poorer overall survival (p=0.03). Conclusion: The results of this study show a potential use of Alu repeat ratios for prognostic purposes in the advanced setting for lung cancer patients.
Affiliation(s)
- Dimple Y Chudasama
  - Division of Thoracic Surgery, The Royal Brompton & Harefield NHS Foundation Trust, Harefield Hospital, London, UK; Division of Biosciences, Brunel University London, London, UK
- Zeynep Aladag
  - Division of Biosciences, Brunel University London, London, UK
- Marcia Hall
  - Division of Biosciences, Brunel University London, London, UK
- Julie Beeson
  - Division of Thoracic Surgery, The Royal Brompton & Harefield NHS Foundation Trust, Harefield Hospital, London, UK
- Nizar Asadi
  - Division of Thoracic Surgery, The Royal Brompton & Harefield NHS Foundation Trust, Harefield Hospital, London, UK
- Yori Gidron
  - Scalab, Lille University, Oncolille, Lille, France
- Vladimir B Anikin
  - Division of Thoracic Surgery, The Royal Brompton & Harefield NHS Foundation Trust, Harefield Hospital, London, UK
4
Statistical Methods for Identifying Sequence Motifs Affecting Point Mutations. Genetics 2016; 205:843-856. [PMID: 27974498 DOI: 10.1534/genetics.116.195677] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 09/14/2016] [Accepted: 12/01/2016] [Indexed: 11/18/2022] Open
Abstract
Mutation processes differ between types of point mutation, genomic locations, cells, and biological species. For some point mutations, specific neighboring bases are known to be mechanistically influential. Beyond these cases, numerous questions remain unresolved, including: what are the sequence motifs that affect point mutations? How large are the motifs? Are they strand symmetric? And, do they vary between samples? We present new log-linear models that allow explicit examination of these questions, along with sequence logo style visualization to enable identifying specific motifs. We demonstrate the performance of these methods by analyzing mutation processes in human germline and malignant melanoma. We recapitulate the known CpG effect, and identify novel motifs, including a highly significant motif associated with A→G mutations. We show that major effects of neighbors on germline mutation lie within [Formula: see text] of the mutating base. Models are also presented for contrasting the entire mutation spectra (the distribution of the different point mutations). We show the spectra vary significantly between autosomes and X-chromosome, with a difference in T→C transition dominating. Analyses of malignant melanoma confirmed reported characteristic features of this cancer, including statistically significant strand asymmetry, and markedly different neighboring influences. The methods we present are made freely available as a Python library https://bitbucket.org/pycogent3/mutationmotif.
5
Korunes KL, Noor MAF. Gene conversion and linkage: effects on genome evolution and speciation. Mol Ecol 2016; 26:351-364. [DOI: 10.1111/mec.13736] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Received: 01/15/2016] [Revised: 06/07/2016] [Accepted: 06/22/2016] [Indexed: 12/12/2022]
6
Figuet E, Nabholz B, Bonneau M, Mas Carrio E, Nadachowska-Brzyska K, Ellegren H, Galtier N. Life History Traits, Protein Evolution, and the Nearly Neutral Theory in Amniotes. Mol Biol Evol 2016; 33:1517-27. [DOI: 10.1093/molbev/msw033] [Citation(s) in RCA: 58] [Impact Index Per Article: 7.3] [Indexed: 12/14/2022] Open
7
Mugal CF, Weber CC, Ellegren H. GC-biased gene conversion links the recombination landscape and demography to genomic base composition. Bioessays 2015; 37:1317-26. [DOI: 10.1002/bies.201500058] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Indexed: 12/22/2022]
Affiliation(s)
- Carina F. Mugal
  - Department of Evolutionary Biology; Evolutionary Biology Centre; Uppsala University; Uppsala Sweden
- Claudia C. Weber
  - Department of Evolutionary Biology; Evolutionary Biology Centre; Uppsala University; Uppsala Sweden
  - Department of Biology; Center for Computational Genetics and Genomics; Temple University; Philadelphia PA USA
- Hans Ellegren
  - Department of Evolutionary Biology; Evolutionary Biology Centre; Uppsala University; Uppsala Sweden
8
Knief U, Schielzeth H, Ellegren H, Kempenaers B, Forstmeier W. A prezygotic transmission distorter acting equally in female and male zebra finches Taeniopygia guttata. Mol Ecol 2015; 24:3846-59. [DOI: 10.1111/mec.13281] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 03/18/2015] [Revised: 06/13/2015] [Accepted: 06/17/2015] [Indexed: 12/25/2022]
Affiliation(s)
- Ulrich Knief
  - Department of Behavioural Ecology and Evolutionary Genetics; Max Planck Institute for Ornithology; Eberhard-Gwinner-Str. 82319 Seewiesen Germany
- Holger Schielzeth
  - Department of Evolutionary Biology; Bielefeld University; Morgenbreede 45 33615 Bielefeld Germany
- Hans Ellegren
  - Department of Evolutionary Biology; Uppsala University; Norbyvägen 18D 752 36 Uppsala Sweden
- Bart Kempenaers
  - Department of Behavioural Ecology and Evolutionary Genetics; Max Planck Institute for Ornithology; Eberhard-Gwinner-Str. 82319 Seewiesen Germany
- Wolfgang Forstmeier
  - Department of Behavioural Ecology and Evolutionary Genetics; Max Planck Institute for Ornithology; Eberhard-Gwinner-Str. 82319 Seewiesen Germany
9
Figuet E, Ballenghien M, Romiguier J, Galtier N. Biased gene conversion and GC-content evolution in the coding sequences of reptiles and vertebrates. Genome Biol Evol 2014; 7:240-50. [PMID: 25527834 PMCID: PMC4316630 DOI: 10.1093/gbe/evu277] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Indexed: 12/15/2022] Open
Abstract
Mammalian and avian genomes are characterized by a substantial spatial heterogeneity of GC-content, which is often interpreted as reflecting the effect of local GC-biased gene conversion (gBGC), a meiotic repair bias that favors G and C over A and T alleles in high-recombining genomic regions. Surprisingly, the first fully sequenced nonavian sauropsid (i.e., reptile), the green anole Anolis carolinensis, revealed a highly homogeneous genomic GC-content landscape, suggesting the possibility that gBGC might not be at work in this lineage. Here, we analyze GC-content evolution at third-codon positions (GC3) in 44 vertebrates species, including eight newly sequenced transcriptomes, with a specific focus on nonavian sauropsids. We report that reptiles, including the green anole, have a genome-wide distribution of GC3 similar to that of mammals and birds, and we infer a strong GC3-heterogeneity to be already present in the tetrapod ancestor. We further show that the dynamic of coding sequence GC-content is largely governed by karyotypic features in vertebrates, notably in the green anole, in agreement with the gBGC hypothesis. The discrepancy between third-codon positions and noncoding DNA regarding GC-content dynamics in the green anole could not be explained by the activity of transposable elements or selection on codon usage. This analysis highlights the unique value of third-codon positions as an insertion/deletion-free marker of nucleotide substitution biases that ultimately affect the evolution of proteins.
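As an illustration of the GC3 statistic mentioned in this abstract, the following is a minimal sketch of how GC-content at third-codon positions could be computed for a single coding sequence. It is not the authors' pipeline: the function name `gc3` and the example sequence are hypothetical, and the sketch assumes an in-frame coding sequence, skips ambiguous bases, and makes no special provision for stop codons.

```python
def gc3(cds: str) -> float:
    """Fraction of G or C among third-codon positions of an in-frame CDS.

    Hypothetical helper for illustration only; assumes the sequence starts
    at codon position 1 and ignores any trailing partial codon.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    # Third positions are indices 2, 5, 8, ...; skip ambiguous bases such as N.
    thirds = [seq[i] for i in range(2, usable, 3) if seq[i] in "ACGT"]
    if not thirds:
        raise ValueError("no unambiguous third-codon positions found")
    return sum(base in "GC" for base in thirds) / len(thirds)


# Codons: ATG GCC AAG TTT GGC -> third positions G, C, G, T, C
example = "ATGGCCAAGTTTGGC"
print(gc3(example))
```

Comparing such per-gene values across species or chromosomes is the kind of analysis the abstract describes; a real study would also need alignment and orthology filtering, which this sketch omits.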
Affiliation(s)
- Emeric Figuet
  - CNRS, Université Montpellier 2, UMR 5554, Institut des Sciences de l'Evolution de Montpellier, France
- Marion Ballenghien
  - CNRS, Université Montpellier 2, UMR 5554, Institut des Sciences de l'Evolution de Montpellier, France
- Jonathan Romiguier
  - CNRS, Université Montpellier 2, UMR 5554, Institut des Sciences de l'Evolution de Montpellier, France; Department of Ecology and Evolution, Biophore, University of Lausanne, Switzerland
- Nicolas Galtier
  - CNRS, Université Montpellier 2, UMR 5554, Institut des Sciences de l'Evolution de Montpellier, France
10
Weber CC, Boussau B, Romiguier J, Jarvis ED, Ellegren H. Evidence for GC-biased gene conversion as a driver of between-lineage differences in avian base composition. Genome Biol 2014; 15:549. [PMID: 25496599 PMCID: PMC4290106 DOI: 10.1186/s13059-014-0549-1] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Received: 04/08/2014] [Accepted: 11/19/2014] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND While effective population size (Ne) and life history traits such as generation time are known to impact substitution rates, their potential effects on base composition evolution are less well understood. GC content increases with decreasing body mass in mammals, consistent with recombination-associated GC biased gene conversion (gBGC) more strongly impacting these lineages. However, shifts in chromosomal architecture and recombination landscapes between species may complicate the interpretation of these results. In birds, interchromosomal rearrangements are rare and the recombination landscape is conserved, suggesting that this group is well suited to assess the impact of life history on base composition. RESULTS Employing data from 45 newly and 3 previously sequenced avian genomes covering a broad range of taxa, we found that lineages with large populations and short generations exhibit higher GC content. The effect extends to both coding and non-coding sites, indicating that it is not due to selection on codon usage. Consistent with recombination driving base composition, GC content and heterogeneity were positively correlated with the rate of recombination. Moreover, we observed ongoing increases in GC in the majority of lineages. CONCLUSIONS Our results provide evidence that gBGC may drive patterns of nucleotide composition in avian genomes and are consistent with more effective gBGC in large populations and a greater number of meioses per unit time; that is, a shorter generation time. Thus, in accord with theoretical predictions, base composition evolution is substantially modulated by species life history.
Affiliation(s)
- Claudia C Weber
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, SE-752 36 Uppsala, Sweden
- Bastien Boussau
  - Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, Université Lyon 1, CNRS, UMR5558 Villeurbanne, France
- Erich D Jarvis
  - Department of Neurobiology, Howard Hughes Medical Institute, Duke University Medical Center, Durham, NC USA
- Hans Ellegren
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, SE-752 36 Uppsala, Sweden
11
Robinson MC, Stone EA, Singh ND. Population genomic analysis reveals no evidence for GC-biased gene conversion in Drosophila melanogaster. Mol Biol Evol 2013; 31:425-33. [PMID: 24214536 DOI: 10.1093/molbev/mst220] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Indexed: 12/19/2022] Open
Abstract
Gene conversion is the nonreciprocal exchange of genetic material between homologous chromosomes. Multiple lines of evidence from a variety of taxa strongly suggest that gene conversion events are biased toward GC-bearing alleles. However, in Drosophila, the data have largely been indirect and unclear, with some studies supporting the predictions of a GC-biased gene conversion model and other data showing contradictory findings. Here, we test whether gene conversion events are GC-biased in Drosophila melanogaster using whole-genome polymorphism and divergence data. Our results provide no support for GC-biased gene conversion and thus suggest that this process is unlikely to significantly contribute to patterns of polymorphism and divergence in this system.
Affiliation(s)
- Matthew C Robinson
  - Department of Biological Sciences, Program in Genetics, North Carolina State University
12
Segmenting the human genome based on states of neutral genetic divergence. Proc Natl Acad Sci U S A 2013; 110:14699-704. [PMID: 23959903 DOI: 10.1073/pnas.1221792110] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 01/21/2023] Open
Abstract
Many studies have demonstrated that divergence levels generated by different mutation types vary and covary across the human genome. To improve our still-incomplete understanding of the mechanistic basis of this phenomenon, we analyze several mutation types simultaneously, anchoring their variation to specific regions of the genome. Using hidden Markov models on insertion, deletion, nucleotide substitution, and microsatellite divergence estimates inferred from human-orangutan alignments of neutrally evolving genomic sequences, we segment the human genome into regions corresponding to different divergence states--each uniquely characterized by specific combinations of divergence levels. We then parsed the mutagenic contributions of various biochemical processes associating divergence states with a broad range of genomic landscape features. We find that high divergence states inhabit guanine- and cytosine (GC)-rich, highly recombining subtelomeric regions; low divergence states cover inner parts of autosomes; chromosome X forms its own state with lowest divergence; and a state of elevated microsatellite mutability is interspersed across the genome. These general trends are mirrored in human diversity data from the 1000 Genomes Project, and departures from them highlight the evolutionary history of primate chromosomes. We also find that genes and noncoding functional marks [annotations from the Encyclopedia of DNA Elements (ENCODE)] are concentrated in high divergence states. Our results provide a powerful tool for biomedical data analysis: segmentations can be used to screen personal genome variants--including those associated with cancer and other diseases--and to improve computational predictions of noncoding functional elements.
13
Cagliani R, Guerini FR, Rubio-Acero R, Baglio F, Forni D, Agliardi C, Griffanti L, Fumagalli M, Pozzoli U, Riva S, Calabrese E, Sikora M, Casals F, Comi GP, Bresolin N, Cáceres M, Clerici M, Sironi M. Long-standing balancing selection in the THBS4 gene: influence on sex-specific brain expression and gray matter volumes in Alzheimer disease. Hum Mutat 2013; 34:743-53. [PMID: 23420636 DOI: 10.1002/humu.22301] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 11/13/2012] [Accepted: 02/01/2013] [Indexed: 01/08/2023]
Abstract
The THBS4 gene encodes a glycoprotein involved in inflammatory responses and synaptogenesis. THBS4 is expressed at higher levels in the brain of humans compared with nonhuman primates, and the protein accumulates in β-amyloid plaques. We analyzed THBS4 genetic variability in humans and show that two haplotypes (hap1 and hap2) are maintained by balancing selection and modulate THBS4 expression in lymphocytes. Indeed, the balancing selection region covers a predicted transcriptional enhancer. In humans, but not in macaques and chimpanzees, THBS4 brain expression increases with age, and variants in the balancing selection region interact with sex in influencing THBS4 expression (p_interaction = 0.038), with hap1 homozygous females showing lowest expression. In Alzheimer disease (AD) patients, significant interactions between sex and THBS4 genotype were detected for peripheral gray matter (p_interaction = 0.014) and total gray matter (p_interaction = 0.012) volumes. Similarly to the gene expression results, the interaction is mainly mediated by hap1 homozygous AD females, who show reduced volumes. Thus, the balancing selection target in THBS4 is likely represented by one or more variants that regulate tissue-specific and sex-specific gene expression. The selection signature associated with THBS4 might not be related to AD pathogenesis, but rather to inflammatory responses.
14
Lartillot N. Interaction between selection and biased gene conversion in mammalian protein-coding sequence evolution revealed by a phylogenetic covariance analysis. Mol Biol Evol 2012; 30:356-68. [PMID: 23024185 DOI: 10.1093/molbev/mss231] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Indexed: 12/21/2022] Open
Abstract
According to the nearly-neutral model, variation in long-term effective population size among species should result in correlated variation in the ratio of nonsynonymous over synonymous substitution rates (dN/dS). Previous empirical investigations in mammals have been consistent with this prediction, suggesting an important role for nearly-neutral effects on protein-coding sequence evolution. GC-biased gene conversion (gBGC), on the other hand, is increasingly recognized as a major evolutionary force shaping genome nucleotide composition. When sufficiently strong compared with random drift, gBGC may significantly interfere with a nearly-neutral regime and impact dN/dS in a complex manner. Here, we investigate the phylogenetic correlations between dN/dS, the equilibrium GC composition (GC*), and several life-history and karyotypic traits in placental mammals. We show that the equilibrium GC composition decreases with body mass and increases with the number of chromosomes, suggesting a modulation of the strength of biased gene conversion due to changes in effective population size and genome-wide recombination rate. The variation in dN/dS is complex and only partially fits the prediction of the nearly-neutral theory. However, specifically restricting estimation of the dN/dS ratio on GC-conservative transversions, which are immune from gBGC, results in correlations that are more compatible with a nearly-neutral interpretation. Our investigation indicates the presence of complex interactions between selection and biased gene conversion and suggests that further mechanistic development is warranted, to tease out mutation, selection, drift, and conversion.
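For orientation, the equilibrium GC composition (GC*) referred to in this abstract can be written down for the simplest two-state case. The expression below is a textbook sketch, not the model used in the paper: it assumes a single substitution rate $u$ from A/T to G/C and a single rate $v$ from G/C to A/T (both symbols are introduced here for illustration), each constant over time.

```latex
% Two-state sketch: f is the GC fraction, u the AT->GC rate, v the GC->AT rate.
\[
  \frac{df}{dt} = u\,(1 - f) - v\,f
  \qquad\Longrightarrow\qquad
  \mathrm{GC}^{*} = \frac{u}{u + v}
\]
```

Under these assumptions, any process that raises $u$ relative to $v$, as GC-biased gene conversion is proposed to do, pushes GC* upward, which is the direction of the body-mass and chromosome-number correlations the abstract reports.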
Affiliation(s)
- Nicolas Lartillot
  - Centre Robert-Cedergren pour la Bioinformatique, Département de Biochimie, Université de Montréal, Québec, Canada.
15
Popa A, Samollow P, Gautier C, Mouchiroud D. The sex-specific impact of meiotic recombination on nucleotide composition. Genome Biol Evol 2012; 4:412-22. [PMID: 22417915 PMCID: PMC3318449 DOI: 10.1093/gbe/evs023] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022] Open
Abstract
Meiotic recombination is an important evolutionary force shaping the nucleotide landscape of genomes. For most vertebrates, the frequency of recombination varies slightly or considerably between the sexes (heterochiasmy). In humans, male, rather than female, recombination rate has been found to be more highly correlated with the guanine and cytosine (GC) content across the genome. In the present study, we review the results in human and extend the examination of the evolutionary impact of heterochiasmy beyond primates to include four additional eutherian mammals (mouse, dog, pig, and sheep), a metatherian mammal (opossum), and a bird (chicken). Specifically, we compared sex-specific recombination rates (RRs) with nucleotide substitution patterns evaluated in transposable elements. Our results, based on a comparative approach, reveal a great diversity in the relationship between heterochiasmy and nucleotide composition. We find that the stronger male impact on this relationship is a conserved feature of human, mouse, dog, and sheep. In contrast, variation in genomic GC content in pig and opossum is more strongly correlated with female, rather than male, RR. Moreover, we show that the sex-differential impact of recombination is mainly driven by the chromosomal localization of recombination events. Independent of sex, the higher the RR in a genomic region and the longer this recombination activity is conserved in time, the stronger the bias in nucleotide substitution pattern, through such mechanisms as biased gene conversion. Over time, this bias will increase the local GC content of the region.
16
Axelsson E, Webster MT, Ratnakumar A, Ponting CP, Lindblad-Toh K. Death of PRDM9 coincides with stabilization of the recombination landscape in the dog genome. Genome Res 2012; 22:51-63. [PMID: 22006216 PMCID: PMC3246206 DOI: 10.1101/gr.124123.111] [Citation(s) in RCA: 93] [Impact Index Per Article: 7.8] [Received: 03/31/2011] [Accepted: 10/05/2011] [Indexed: 11/25/2022]
Abstract
Analysis of diverse eukaryotes has revealed that recombination events cluster in discrete genomic locations known as hotspots. In humans, a zinc-finger protein, PRDM9, is believed to initiate recombination in >40% of hotspots by binding to a specific DNA sequence motif. However, the PRDM9 coding sequence is disrupted in the dog genome assembly, raising questions regarding the nature and control of recombination in dogs. By analyzing the sequences of PRDM9 orthologs in a number of dog breeds and several carnivores, we show here that this gene was inactivated early in canid evolution. We next use patterns of linkage disequilibrium using more than 170,000 SNP markers typed in almost 500 dogs to estimate the recombination rates in the dog genome using a coalescent-based approach. Broad-scale recombination rates show good correspondence with an existing linkage-based map. Significant variation in recombination rate is observed on the fine scale, and we are able to detect over 4000 recombination hotspots with high confidence. In contrast to human hotspots, 40% of canine hotspots are characterized by a distinct peak in GC content. A comparative genomic analysis indicates that these peaks are present also as weaker peaks in the panda, suggesting that the hotspots have been continually reinforced by accelerated and strongly GC biased nucleotide substitutions, consistent with the long-term action of biased gene conversion on the dog lineage. These results are consistent with the loss of PRDM9 in canids, resulting in a greater evolutionary stability of recombination hotspots. The genetic determinants of recombination hotspots in the dog genome may thus reflect a fundamental process of relevance to diverse animal species.
Affiliation(s)
- Erik Axelsson
  - Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 75237 Uppsala, Sweden
- Matthew T. Webster
  - Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 75237 Uppsala, Sweden
- Abhirami Ratnakumar
  - Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 75237 Uppsala, Sweden
- Chris P. Ponting
  - MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3QX, United Kingdom
- Kerstin Lindblad-Toh
  - Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 75237 Uppsala, Sweden
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, Massachusetts 02139, USA
17
Late replicating domains are highly recombining in females but have low male recombination rates: implications for isochore evolution. PLoS One 2011; 6:e24480. [PMID: 21949720 PMCID: PMC3176772 DOI: 10.1371/journal.pone.0024480] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 06/06/2011] [Accepted: 08/11/2011] [Indexed: 01/01/2023] Open
Abstract
In mammals, sequences that are either late replicating or highly recombining have high rates of evolution at putatively neutral sites. As early replicating domains and highly recombining domains both tend to be GC rich, we a priori expect these two variables to covary. If so, the relative contribution of either of these variables to the local neutral substitution rate might have been wrongly estimated owing to covariance with the other. Against our expectations, we find that sex-averaged recombination rates show little or no correlation with replication timing, suggesting that they are independent determinants of substitution rates. However, this result masks significant sex-specific complexity: late replicating domains tend to have high recombination rates in females but low recombination rates in males. That these trends are antagonistic explains why sex-averaged recombination is not correlated with replication timing. This unexpected result has several important implications. First, although both male and female recombination rates covary significantly with intronic substitution rates, the magnitude of this correlation is moderately underestimated for male recombination and slightly overestimated for female recombination, owing to covariance with replication timing. Second, the result could explain why male recombination is strongly correlated with GC content but female recombination is not. If, to explain the correlation between GC content and replication timing, we suppose that late replication forces reduced GC content, then GC promotion by biased gene conversion during female recombination is partly countered by the antagonistic effect of later replicating sequence tending to increase AT content. Indeed, the strength of the correlation between female recombination rate and local GC content is more than doubled by control for replication timing.
Our results underpin the need to consider sex-specific recombination rates and potential covariates in analysis of GC content and rates of evolution.
|
18
|
Nabholz B, Künstner A, Wang R, Jarvis ED, Ellegren H. Dynamic evolution of base composition: causes and consequences in avian phylogenomics. Mol Biol Evol 2011; 28:2197-210. [PMID: 21393604 PMCID: PMC3144382 DOI: 10.1093/molbev/msr047] [Citation(s) in RCA: 67] [Impact Index Per Article: 5.2] [Indexed: 12/19/2022] Open
Abstract
Resolving the phylogenetic relationships among birds is a classical problem in systematics, and this is particularly so when it comes to understanding the relationships among Neoaves. Previous phylogenetic inference of birds has been limited to mitochondrial genomes or a few nuclear genes. Here, we apply deep brain transcriptome sequencing of nine bird species (several passerines, hummingbirds, dove, parrot, and emu), using next-generation sequencing technology to understand features of transcriptome evolution in birds and how this affects phylogenetic inference, and combine with data from two bird species using first generation technology. The phylogenomic data matrix comprises 1,995 genes and a total of 0.77 Mb of exonic sequence. First, we find an unexpected heterogeneity in the evolution of base composition among avian lineages. There is a pronounced increase in guanine + cytosine (GC) content in the third codon position in several independent lineages, with the strongest effect seen in passerines. Second, we evaluate the effect of GC content variation on phylogenetic reconstruction. We find important inconsistencies between the topologies obtained with or without taking GC variation into account, each supporting different conclusions of past studies and also influencing hypotheses on the evolution of the trait of vocal learning. Third, we demonstrate a link between GC content evolution and recombination rate and, focusing on the zebra finch lineage, find that recombination seems to drive GC content. Although we cannot reveal the causal relationships, this observation is consistent with the model of GC-biased gene conversion. Finally, we use this unparalleled amount of avian sequence data to study the rate of molecular evolution, calibrated by fossil evidence and augmented with data from alligator transcriptome sequencing. 
There is a 2- to 3-fold variation in substitution rate among lineages with passerines being the most rapidly evolving and ratites the slowest. This study illustrates the potential of next-generation sequencing for phylogenomic studies but also the pitfalls when using genome-wide data with heterogeneous base composition.
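The third-codon-position GC content (GC3) discussed in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `gc3` and the toy sequence are assumptions made for illustration, and the input is assumed to be an in-frame coding sequence.

```python
def gc3(cds: str) -> float:
    """Fraction of G or C among third codon positions of an in-frame coding sequence."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    thirds = [seq[i] for i in range(2, usable, 3)]
    called = [b for b in thirds if b in "ACGT"]  # skip ambiguous bases such as N
    if not called:
        raise ValueError("no unambiguous third-position bases")
    return sum(b in "GC" for b in called) / len(called)


# Hypothetical toy sequence: codons ATG GCA AAG TTT have third positions G, A, G, T.
print(gc3("ATGGCAAAGTTT"))
```

Comparing this value across lineages, as the abstract describes, would additionally require orthologous alignments, which this sketch does not cover.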
Affiliation(s)
- Benoit Nabholz
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
- Axel Künstner
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
- Rui Wang
  - Department of Neurobiology, Howard Hughes Medical Institute, Duke University Medical Center, Durham
- Erich D. Jarvis
  - Department of Neurobiology, Howard Hughes Medical Institute, Duke University Medical Center, Durham
- Hans Ellegren
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
|
19
|
Katzman S, Capra JA, Haussler D, Pollard KS. Ongoing GC-biased evolution is widespread in the human genome and enriched near recombination hot spots. Genome Biol Evol 2011; 3:614-26. [PMID: 21697099 PMCID: PMC3157837 DOI: 10.1093/gbe/evr058] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.8] [Indexed: 01/03/2023] Open
Abstract
Fast evolving regions of many metazoan genomes show a bias toward substitutions that change weak (A,T) into strong (G,C) base pairs. Single-nucleotide polymorphisms (SNPs) do not share this pattern, suggesting that it results from biased fixation rather than biased mutation. Supporting this hypothesis, analyses of polymorphism in specific regions of the human genome have identified a positive correlation between weak to strong (W→S) SNPs and derived allele frequency (DAF), suggesting that SNPs become increasingly GC biased over time, especially in regions of high recombination. Using polymorphism data generated by the 1000 Genomes Project from 179 individuals from 4 human populations, we evaluated the extent and distribution of ongoing GC-biased evolution in the human genome. We quantified GC fixation bias by comparing the DAFs of W→S mutations and S→W mutations using a Mann-Whitney U test. Genome-wide, W→S SNPs have significantly higher DAFs than S→W SNPs. This pattern is widespread across the human genome but varies in magnitude along the chromosomes. We found extreme GC-biased evolution in neighborhoods of recombination hot spots, a significant correlation between GC bias and recombination rate, and an inverse correlation between GC bias and chromosome arm length. These findings demonstrate the presence of ongoing fixation bias favoring G and C alleles throughout the human genome and suggest that the bias is caused by a recombination-associated process, such as GC-biased gene conversion.
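The comparison of derived allele frequencies between W→S and S→W SNPs described above rests on the Mann-Whitney U statistic. The following is a minimal pure-Python sketch of that statistic only, assuming average ranks for ties; the function name `mann_whitney_u` and the frequency values are hypothetical, and a p-value would in practice come from a statistics library.

```python
def mann_whitney_u(x, y):
    """U statistic for sample x versus sample y, using average ranks for ties."""
    # Pool both samples, remembering which group (0 = x, 1 = y) each value came from.
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n_x = len(x)
    return rank_sum_x - n_x * (n_x + 1) / 2


# Hypothetical derived allele frequencies, for illustration only.
daf_ws = [0.30, 0.55, 0.80, 0.90]   # weak-to-strong SNPs
daf_sw = [0.05, 0.10, 0.55, 0.20]   # strong-to-weak SNPs
u = mann_whitney_u(daf_ws, daf_sw)
# u / (n_x * n_y) estimates the probability that a W->S DAF exceeds an S->W DAF.
print(u, u / (len(daf_ws) * len(daf_sw)))
```

A value of `u / (n_x * n_y)` above 0.5 would point in the direction the abstract reports, with W→S alleles tending toward higher frequencies.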
Affiliation(s)
- Sol Katzman
  - Center for Biomolecular Science and Engineering, University of California, Santa Cruz, USA
|
20
|
Capra JA, Pollard KS. Substitution patterns are GC-biased in divergent sequences across the metazoans. Genome Biol Evol 2011; 3:516-27. [PMID: 21670083 PMCID: PMC3138425 DOI: 10.1093/gbe/evr051] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Indexed: 12/19/2022] Open
Abstract
The fastest-evolving regions in the human and chimpanzee genomes show a remarkable excess of weak (A,T) to strong (G,C) nucleotide substitutions since divergence from their common ancestor. We investigated the phylogenetic extent and possible causes of this weak to strong (W→S) bias in divergent sequences (BDS) using recently sequenced genomes and recombination maps from eight trios of eukaryotic species. To quantify evidence for BDS, we inferred substitution histories using an efficient maximum likelihood approach with a context-dependent evolutionary model. We then annotated all lineage-specific substitutions in terms of W→S bias and density on the chromosomes. Finally, we used the inferred substitutions to calculate a BDS score—a log odds ratio between substitution type and density—and assessed its statistical significance with Fisher's exact test. Applying this approach, we found significant BDS in the coding and noncoding sequence of human, mouse, dog, stickleback, fruit fly, and worm. We also observed a significant lack of W→S BDS in chicken and yeast. The BDS score varies between species and across the chromosomes within each species. It is most strongly correlated with different genomic features in different species, but a strong correlation with recombination rates is found in several species. Our results demonstrate that a W→S substitution bias in fast-evolving sequences is a widespread phenomenon. The patterns of BDS observed suggest that a recombination-associated process, such as GC-biased gene conversion, is involved in the production of the bias in many species, but the strength of the BDS likely depends on many factors, including genome stability, variability in recombination rate over time and across the genome, the frequency of meiosis, and the amount of outcrossing in each species.
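The abstract describes the BDS score as a log odds ratio assessed with Fisher's exact test. The sketch below shows those two generic calculations on a 2×2 table. How the authors define the table cells is not stated in the abstract, so the layout used here (substitution type by density class) and the function names are assumptions.

```python
from math import comb, log


def log_odds_ratio(a, b, c, d):
    """Natural-log odds ratio for the 2x2 table [[a, b], [c, d]].

    Fails on zero cells; any pseudocount correction would be a further assumption.
    """
    return log((a * d) / (b * c))


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value: P(top-left cell >= a) with margins fixed."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    total = comb(n, col1)
    # Sum hypergeometric probabilities for tables at least as extreme as observed.
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / total


# Hypothetical counts: rows = W->S / other substitutions, columns = dense / sparse regions.
print(log_odds_ratio(3, 1, 1, 3), fisher_exact_greater(3, 1, 1, 3))
```

A positive log odds ratio with a small p-value would indicate that W→S substitutions are over-represented in the dense class under this assumed layout.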
Affiliation(s)
- John A. Capra
  - Gladstone Institutes, University of California, San Francisco
- Katherine S. Pollard
  - Gladstone Institutes, University of California, San Francisco
  - Division of Biostatistics & Institute for Human Genetics, University of California, San Francisco
  - Corresponding author: E-mail:
|
21
|
Clément Y, Arndt PF. Substitution patterns are under different influences in primates and rodents. Genome Biol Evol 2011; 3:236-45. [PMID: 21339508 PMCID: PMC3068003 DOI: 10.1093/gbe/evr011] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 11/13/2022] Open
Abstract
There are large-scale variations of the GC-content along mammalian chromosomes that have been called isochore structures. Primates and rodents have different isochore structures, which suggests that these lineages exhibit different modes of GC-content evolution. It has been shown that, in the human lineage, GC-biased gene conversion (gBGC), a neutral process associated with meiotic recombination, acts on GC-content evolution by influencing A or T to G or C substitution rates. We computed genome-wide substitution patterns in the mouse lineage from multiple alignments and compared them with substitution patterns in the human lineage. We found that in the mouse lineage, gBGC is active but weaker than in the human lineage and that male-specific recombination better predicts GC-content evolution than female-specific recombination. Furthermore, we were able to show that G or C to A or T substitution rates are predicted by a combination of different factors in both lineages. A or T to G or C substitution rates are most strongly predicted by meiotic recombination in the human lineage but by CpG odds ratio (the observed CpG frequency normalized by the expected CpG frequency) in the mouse lineage, suggesting that substitution patterns are under different influences in primates and rodents.
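The CpG odds ratio defined in the abstract (observed CpG frequency normalized by the expected CpG frequency) can be sketched as follows. The normalization shown, dinucleotide frequency over the product of C and G base frequencies, is one common convention and an assumption here; the function name `cpg_odds_ratio` is likewise hypothetical.

```python
def cpg_odds_ratio(seq: str) -> float:
    """Observed CpG dinucleotide frequency divided by the frequency expected from C and G content."""
    s = seq.upper()
    n = len(s)
    c, g = s.count("C"), s.count("G")
    if n < 2 or c == 0 or g == 0:
        raise ValueError("sequence too short or lacking C or G")
    cpg = s.count("CG")               # "CG" cannot overlap itself, so count() is exact
    observed = cpg / (n - 1)          # n - 1 dinucleotide positions
    expected = (c / n) * (g / n)      # independence of neighbouring bases
    return observed / expected


# Hypothetical toy sequence with two CpG sites.
print(cpg_odds_ratio("ACGTCGGA"))
```

Values well below 1 indicate CpG depletion, the pattern usually attributed to the high mutability of methylated CpG sites.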
Affiliation(s)
- Yves Clément
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany.
|
22
|
Li M, Chen SS. The tendency to recreate ancestral CG dinucleotides in the human genome. BMC Evol Biol 2011; 11:3. [PMID: 21208429 PMCID: PMC3025853 DOI: 10.1186/1471-2148-11-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 07/01/2010] [Accepted: 01/05/2011] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND The CG dinucleotides are known to be deficient in the human genome, due to a high mutation rate from 5-methylated CG to TG and its complementary pair CA. Meanwhile, many cellular functions rely on these CG dinucleotides, such as gene expression controlled by cytosine methylation status. Thus, CG dinucleotides that provide essential functional substrates should be retained in genomes. How these two conflicting processes regarding the fate of CG dinucleotides (i.e., a high mutation rate destroying CG dinucleotides vs. functional processes that require their preservation) are reconciled remains an unsolved question. RESULTS By analyzing the mutation and frequency spectrum of newly derived alleles in the human genome, a tendency towards generating more CGs was observed, which was mainly contributed by an excess number of mutations from CA/TG to CG. Simultaneously, we found a fixation preference for CGs derived from TG/CA rather than CGs generated by other dinucleotides. These tendencies were observed both in intergenic and genic regions. An analysis of Integrated Extended Haplotype Homozygosity provided no evidence of selection for newly derived CGs. CONCLUSIONS Ancestral CG dinucleotides that were subsequently lost by mutation tend to be recreated in the human genome, as indicated by a biased mutation and fixation pattern favoring new CGs that derived from TG/CA.
Affiliation(s)
- Mingkun Li
  - CAS-MPG Partner Institute of Computational Biology, Shanghai Institutes of Biological Sciences, Chinese Academy of Sciences, 200000 Shanghai, PR China.
|
23
|
Ratnakumar A, Mousset S, Glémin S, Berglund J, Galtier N, Duret L, Webster MT. Detecting positive selection within genomes: the problem of biased gene conversion. Philos Trans R Soc Lond B Biol Sci 2010; 365:2571-80. [PMID: 20643747 DOI: 10.1098/rstb.2010.0007] [Citation(s) in RCA: 116] [Impact Index Per Article: 8.3] [Indexed: 01/21/2023] Open
Abstract
The identification of loci influenced by positive selection is a major goal of evolutionary genetics. A popular approach is to perform scans of alignments on a genome-wide scale in order to find regions evolving at accelerated rates on a particular branch of a phylogenetic tree. However, positive selection is not the only process that can lead to accelerated evolution. Notably, GC-biased gene conversion (gBGC) is a recombination-associated process that results in the biased fixation of G and C nucleotides. This process can potentially generate bursts of nucleotide substitutions within hotspots of meiotic recombination. Here, we analyse the results of a scan for positive selection on genes on branches across the primate phylogeny. We show that genes identified as targets of positive selection have a significant tendency to exhibit the genomic signature of gBGC. Using a maximum-likelihood framework, we estimate that more than 20 per cent of cases of significantly elevated non-synonymous to synonymous substitution rates ratio (d(N)/d(S)), particularly in shorter branches, could be due to gBGC. We demonstrate that in some cases, gBGC can lead to very high d(N)/d(S) (more than 2). Our results indicate that gBGC significantly affects the evolution of coding sequences in primates, often leading to patterns of evolution that can be mistaken for positive selection.
Affiliation(s)
- Abhirami Ratnakumar
  - Department of Medical Biochemistry and Microbiology, Uppsala University, Box 582, 751 23 Uppsala, Sweden
|
24
|
Fumagalli M, Cagliani R, Riva S, Pozzoli U, Biasin M, Piacentini L, Comi GP, Bresolin N, Clerici M, Sironi M. Population genetics of IFIH1: ancient population structure, local selection, and implications for susceptibility to type 1 diabetes. Mol Biol Evol 2010; 27:2555-66. [PMID: 20538742 DOI: 10.1093/molbev/msq141] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Indexed: 02/06/2023] Open
Abstract
The human interferon induced with helicase C domain 1 (IFIH1) gene encodes a sensor of double-strand RNA involved in innate immunity against viruses, indicating that this gene is a likely target of virus-driven selective pressure. Notably, IFIH1 also plays a role in autoimmunity, as common and rare polymorphisms in this gene have been associated with type 1 diabetes (T1D). We analyzed the evolutionary history of IFIH1 in human populations. Results herein suggest that two major IFIH1 haplotype clades originated from ancestral population structure (or balancing selection) in the African continent and that local selective pressures have acted on the gene. Specifically, directional selection in Europe and Asia resulted in the spread of a common IFIH1 haplotype carrying a derived His460 allele. This variant changes a highly conserved arginine residue in the helicase domain, possibly conferring altered specificity in viral recognition. An alternative common haplotype has swept to high frequency in South Americans as a result of recent positive selection. Previous studies suggested that a portion of risk alleles for autoimmune diseases could have been maintained in humans as they conferred a selective advantage against infections. This is not the case for IFIH1, as population genetic differentiation and haplotype analyses indicated that the T1D susceptibility alleles behaved as neutral or nearly neutral polymorphisms. Our findings suggest that variants in IFIH1 confer different susceptibility to diverse viral infections and provide insight into the relationship between adaptation to past infection and predisposition to autoimmunity in modern populations.
Affiliation(s)
- Matteo Fumagalli
  - Bioinformatic Lab, Scientific Institute Istituto di Ricovero e Cura a Carattere Scientifico E. Medea, Bosisio Parini, Lecco, Italy
|
25
|
Contrasting GC-content dynamics across 33 mammalian genomes: relationship with life-history traits and chromosome sizes. Genome Res 2010; 20:1001-9. [PMID: 20530252 DOI: 10.1101/gr.104372.109] [Citation(s) in RCA: 148] [Impact Index Per Article: 10.6] [Indexed: 12/20/2022]
Abstract
The origin, evolution, and functional relevance of genomic variations in GC content are a long-debated topic, especially in mammals. Most of the existing literature, however, has focused on a small number of model species and/or limited sequence data sets. We analyzed more than 1000 orthologous genes in 33 fully sequenced mammalian genomes, reconstructed their ancestral isochore organization in the maximum likelihood framework, and explored the evolution of third-codon position GC content in representatives of 16 orders and 27 families. We showed that the previously reported erosion of GC-rich isochores is not a general trend. Several species (e.g., shrew, microbat, tenrec, rabbit) have independently undergone a marked increase in GC content, with a widening gap between the GC-poorest and GC-richest classes of genes. The intensively studied apes and (especially) murids do not reflect the general placental pattern. We correlated GC-content evolution with species life-history traits and cytology. Significant effects of body mass and genome size were detected, with each being consistent with the GC-biased gene conversion model.
|
26
|
Aleshin A, Zhi D. Recombination-associated sequence homogenization of neighboring Alu elements: signature of nonallelic gene conversion. Mol Biol Evol 2010; 27:2300-11. [PMID: 20453015 DOI: 10.1093/molbev/msq116] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/13/2022] Open
Abstract
Recently, researchers have begun to recognize that, in order to establish neutral models for disease association and evolutionary genomics studies, it is crucial to have a clear understanding of the genomic impact of nonallelic gene conversion. Drawing on previous successes in characterizing this phenomenon over protein-coding gene families, we undertook a computational analysis of neighboring Alu sequences in the genome scale. For this purpose, we developed adjusted comutation rate (aCMR), a novel statistical method measuring the excess number of identical point mutations shared by adjacent Alu sequences, vis-à-vis random pairs. Using aCMR, we uncovered a remarkable genome-wide sequence homogenization of neighboring Alus, with the strongest signal observed in the pseudoautosomal regions of the X and Y chromosomes. The magnitude of sequence homogenization between Alu pairs is greater with shorter interlocus distance, higher sequence identity, and parallel orientation. Moreover, shared substitutions show a strong directionality toward GC nucleotides, with multiple substitutions tending to cluster within the Alu sequence. Taken together, these observed recombination-associated sequence homogenization patterns are best explained by frequent ubiquitous gene conversion events between neighboring Alus. We believe that these observations help to illuminate the nature and impact of the enigmatic phenomenon of gene conversion.
Affiliation(s)
- Alexey Aleshin
  - Department of Medicine, Division of Hematology, Oncology, David Geffen School of Medicine, University of California, Los Angeles, USA
|
27
|
Backström N, Forstmeier W, Schielzeth H, Mellenius H, Nam K, Bolund E, Webster MT, Öst T, Schneider M, Kempenaers B, Ellegren H. The recombination landscape of the zebra finch Taeniopygia guttata genome. Genome Res 2010; 20:485-95. [PMID: 20357052 PMCID: PMC2847751 DOI: 10.1101/gr.101410.109] [Citation(s) in RCA: 168] [Impact Index Per Article: 12.0] [Received: 10/04/2009] [Accepted: 12/02/2009] [Indexed: 12/18/2022]
Abstract
Understanding the causes and consequences of variation in the rate of recombination is essential since this parameter is considered to affect levels of genetic variability, the efficacy of selection, and the design of association and linkage mapping studies. However, there is limited knowledge about the factors governing recombination rate variation. We genotyped 1920 single nucleotide polymorphisms in a multigeneration pedigree of more than 1000 zebra finches (Taeniopygia guttata) to develop a genetic linkage map, and then we used these map data together with the recently available draft genome sequence of the zebra finch to estimate recombination rates in 1 Mb intervals across the genome. The average zebra finch recombination rate (1.5 cM/Mb) is higher than in humans, but significantly lower than in chicken. The local rates of recombination in chicken and zebra finch were only weakly correlated, demonstrating evolutionary turnover of the recombination landscape in birds. The distribution of recombination events was heavily biased toward ends of chromosomes, with a stronger telomere effect than so far seen in any organism. In fact, the recombination rate was as low as 0.1 cM/Mb in intervals up to 100 Mb long in the middle of the larger chromosomes. We found a positive correlation between recombination rate and GC content, as well as GC-rich sequence motifs. Levels of linkage disequilibrium (LD) were significantly higher in regions of low recombination, showing that heterogeneity in recombination rates has left a footprint on the genomic landscape of LD in zebra finch populations.
Affiliation(s)
- Niclas Backström
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36 Uppsala, Sweden
- Wolfgang Forstmeier
  - Max Planck Institute for Ornithology, Department of Behavioural Ecology and Evolutionary Genetics, 82319 Seewiesen, Germany
- Holger Schielzeth
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36 Uppsala, Sweden
  - Max Planck Institute for Ornithology, Department of Behavioural Ecology and Evolutionary Genetics, 82319 Seewiesen, Germany
- Harriet Mellenius
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36 Uppsala, Sweden
- Kiwoong Nam
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36 Uppsala, Sweden
- Elisabeth Bolund
  - Max Planck Institute for Ornithology, Department of Behavioural Ecology and Evolutionary Genetics, 82319 Seewiesen, Germany
- Matthew T. Webster
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36 Uppsala, Sweden
- Torbjörn Öst
  - Molecular Medicine, Department of Medical Sciences, University Hospital, SE-751 85 Uppsala, Sweden
- Melanie Schneider
  - Max Planck Institute for Ornithology, Department of Behavioural Ecology and Evolutionary Genetics, 82319 Seewiesen, Germany
- Bart Kempenaers
  - Max Planck Institute for Ornithology, Department of Behavioural Ecology and Evolutionary Genetics, 82319 Seewiesen, Germany
- Hans Ellegren
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36 Uppsala, Sweden
|
28
|
Navarro-Costa P, Gonçalves J, Plancha CE. The AZFc region of the Y chromosome: at the crossroads between genetic diversity and male infertility. Hum Reprod Update 2010; 16:525-42. [PMID: 20304777 PMCID: PMC2918367 DOI: 10.1093/humupd/dmq005] [Citation(s) in RCA: 99] [Impact Index Per Article: 7.1] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND The three azoospermia factor (AZF) regions of the Y chromosome represent genomic niches for spermatogenesis genes. Yet, the most distal region, AZFc, is a major generator of large-scale variation in the human genome. Determining to what extent this variability affects spermatogenesis is a highly contentious topic in human reproduction. METHODS In this review, an extensive characterization of the molecular mechanisms responsible for AZFc genotypical variation is undertaken. Such data are complemented with the assessment of the clinical consequences for male fertility imputable to the different AZFc variants. For this, a critical re-evaluation of 23 association studies was performed in order to extract unifying conclusions by curtailing methodological heterogeneities. RESULTS Intrachromosomal homologous recombination mechanisms, either crossover or non-crossover based, are the main drivers for AZFc genetic diversity. In particular, rearrangements affecting gene dosage are the most likely to introduce phenotypical disruptions in the spermatogenic profile. In the specific cases of partial AZFc deletions, both the actual existence and the severity of the spermatogenic defect are dependent on the evolutionary background of the Y chromosome. CONCLUSIONS AZFc is one of the most genetically dynamic regions in the human genome. This property may serve as counter against the genetic degeneracy associated with the lack of a meiotic partner. However, such strategy comes at a price: some rearrangements represent a risk factor or a de-facto causative agent of spermatogenic disruption. Interestingly, this precarious balance is modulated, among other yet unknown factors, by the evolutionary history of the Y chromosome.
Affiliation(s)
- Paulo Navarro-Costa
  - Instituto de Medicina Molecular, Faculdade de Medicina de Lisboa, Lisboa, Portugal.
|
29
|
Duret L, Galtier N. Biased gene conversion and the evolution of mammalian genomic landscapes. Annu Rev Genomics Hum Genet 2009; 10:285-311. [PMID: 19630562 DOI: 10.1146/annurev-genom-082908-150001] [Citation(s) in RCA: 468] [Impact Index Per Article: 31.2] [Indexed: 11/09/2022]
Abstract
Recombination is typically thought of as a symmetrical process resulting in large-scale reciprocal genetic exchanges between homologous chromosomes. Recombination events, however, are also accompanied by short-scale, unidirectional exchanges known as gene conversion in the neighborhood of the initiating double-strand break. A large body of evidence suggests that gene conversion is GC-biased in many eukaryotes, including mammals and humans. AT/GC heterozygotes produce more GC- than AT-gametes, thus conferring a population advantage to GC-alleles in high-recombining regions. This apparently unimportant feature of our molecular machinery has major evolutionary consequences. Structurally, GC-biased gene conversion explains the spatial distribution of GC-content in mammalian genomes, the so-called isochore structure. Functionally, GC-biased gene conversion promotes the segregation and fixation of deleterious AT→GC mutations, thus increasing our genomic mutation load. Here we review the recent evidence for a GC-biased gene conversion process in mammals, and its consequences for genomic landscapes, molecular evolution, and human functional genomics.
Affiliation(s)
- Laurent Duret
  - Université de Lyon 1, CNRS, UMR5558, Laboratoire de Biométrie et Biologie Evolutive, F-69622, Villeurbanne, France.
|
30
|
Qi YJ, Qiu WY. Symmetry Analysis of an X-palindrome in Human and Chimpanzee. Chinese J Chem Phys 2009. [DOI: 10.1088/1674-0068/22/04/401-405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
|
31
|
GC content and recombination: reassessing the causal effects for the Saccharomyces cerevisiae genome. Genetics 2009; 183:31-8. [PMID: 19546316 DOI: 10.1534/genetics.109.105049] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.2] [Indexed: 11/18/2022] Open
Abstract
Recombination plays a crucial role in the evolution of genomes. Among many chromosomal features, GC content is one of the most prominent variables that appear to be highly correlated with recombination. However, it is not yet clear (1) whether recombination drives GC content (as proposed, for example, in the biased gene conversion model) or the converse and (2) what are the length scales for mutual influences between GC content and recombination. Here we have reassessed these questions for the model genome Saccharomyces cerevisiae, for which the most refined recombination data are available. First, we confirmed a strong correlation between recombination rate and GC content at local scales (a few kilobases). Second, on the basis of alignments between S. cerevisiae, S. paradoxus, and S. mikatae sequences, we showed that the inferred AT/GC substitution patterns are not correlated with recombination, indicating that GC content is not driven by recombination in yeast. These results thus suggest that, in S. cerevisiae, recombination is determined either by the GC content or by a third parameter, also affecting the GC content. Third, we observed long-range correlations between GC and recombination for chromosome III (for which such correlations were reported experimentally and were the model for many structural studies). However, similar correlations were not detected in the other chromosomes, thus limiting the generality of the phenomenon. These results pave the way for further analyses aimed at the detailed untangling of drives involved in the evolutionary shaping of the yeast genome.
|
32
|
Pink CJ, Swaminathan SK, Dunham I, Rogers J, Ward A, Hurst LD. Evidence that replication-associated mutation alone does not explain between-chromosome differences in substitution rates. Genome Biol Evol 2009; 1:13-22. [PMID: 20333173 PMCID: PMC2817397 DOI: 10.1093/gbe/evp001] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Accepted: 03/05/2009] [Indexed: 12/12/2022]
Abstract
Since Haldane first noticed an excess of paternally derived mutations, it has been considered that most mutations derive from errors during germ line replication. Miyata et al. (1987) proposed that differences in the rate of neutral evolution on X, Y, and autosome can be employed to measure the extent of this male bias. This commonly applied method assumes replication to be the sole source of between-chromosome variation in substitution rates. We propose a simple test of this assumption: If true, estimates of the male bias should be independent of which two chromosomal classes are compared. Prior evidence from rodents suggested that this might not be true, but conclusions were limited by a lack of rat Y-linked sequence. We therefore sequenced two rat Y-linked bacterial artificial chromosomes and determined evolutionary rate by comparison with mouse. For estimation of rates we consider both introns and synonymous rates. Surprisingly, for both data sets the prediction of congruent estimates of alpha is strongly rejected. Indeed, some comparisons suggest a female bias with autosomes evolving faster than Y-linked sequence. We conclude that the method of Miyata et al. (1987) has the potential to provide incorrect estimates. Correcting the method requires understanding of the other causes of substitution that might differ between chromosomal classes. One possible cause is recombination-associated substitution bias for which we find some evidence. We note that if, as some suggest, this association is dominantly owing to male recombination, the high estimates of alpha seen in birds is to be expected as Z chromosomes recombine in males.
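The congruence test described in this abstract can be illustrated with a short sketch. It assumes the standard neutral expectations usually associated with the Miyata et al. (1987) approach: autosomes spend half of their time in males, X chromosomes one third, and Y chromosomes all of it. The function names and the example ratios are hypothetical and are not taken from the paper.

```python
def expected_ratios(alpha):
    """Expected substitution-rate ratios for a given male-to-female
    mutation rate ratio (alpha), if replication is the only source
    of between-chromosome rate variation."""
    return {
        "Y/A": 2.0 * alpha / (1.0 + alpha),
        "X/A": 2.0 * (2.0 + alpha) / (3.0 * (1.0 + alpha)),
        "Y/X": 3.0 * alpha / (2.0 + alpha),
    }


def alpha_from_y_over_a(ratio):
    """Invert Y/A = 2*alpha / (1 + alpha)."""
    return ratio / (2.0 - ratio)


def alpha_from_x_over_a(ratio):
    """Invert X/A = 2*(2 + alpha) / (3*(1 + alpha))."""
    return (4.0 - 3.0 * ratio) / (3.0 * ratio - 2.0)


def alpha_from_y_over_x(ratio):
    """Invert Y/X = 3*alpha / (2 + alpha)."""
    return 2.0 * ratio / (3.0 - ratio)


if __name__ == "__main__":
    # Illustrative ratios only; if the replication-only assumption held,
    # all three comparisons would return the same alpha.
    ratios = expected_ratios(4.0)
    print(alpha_from_y_over_a(ratios["Y/A"]))
    print(alpha_from_x_over_a(ratios["X/A"]))
    print(alpha_from_y_over_x(ratios["Y/X"]))
```

Under these assumptions, a Y/A ratio below 1 yields an alpha below 1, which corresponds to the apparent female bias mentioned in the abstract.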
Affiliation(s)
- Catherine J Pink
- Department of Biology and Biochemistry, University of Bath, Bath, Somerset, United Kingdom
|
33
|
Abstract
Genes that have experienced accelerated evolutionary rates on the human lineage during recent evolution are candidates for involvement in human-specific adaptations. To determine the forces that cause increased evolutionary rates in certain genes, we analyzed alignments of 10,238 human genes to their orthologues in chimpanzee and macaque. Using a likelihood ratio test, we identified protein-coding sequences with an accelerated rate of base substitutions along the human lineage. Exons evolving at a fast rate in humans have a significant tendency to contain clusters of AT-to-GC (weak-to-strong) biased substitutions. This pattern is also observed in noncoding sequence flanking rapidly evolving exons. Accelerated exons occur in regions with elevated male recombination rates and exhibit an excess of nonsynonymous substitutions relative to the genomic average. We next analyzed genes with significantly elevated ratios of nonsynonymous to synonymous rates of base substitution (dN/dS) along the human lineage, and those with an excess of amino acid replacement substitutions relative to human polymorphism. These genes also show evidence of clusters of weak-to-strong biased substitutions. These findings indicate that a recombination-associated process, such as biased gene conversion (BGC), is driving fixation of GC alleles in the human genome. This process can lead to accelerated evolution in coding sequences and excess amino acid replacement substitutions, thereby generating significant results for tests of positive selection. Regions of the human genome that appear to evolve rapidly may have been under strong positive selection and could contain the genetic changes responsible for the uniqueness of our species. However, neutral (nonadaptive) evolutionary processes can give rise to signals that can be mistaken as signs of selection. 
In this article, we identify coding sequences that have undergone accelerated rates of change in humans, affecting the divergence of the proteins they encode. By analyzing patterns of molecular evolution in these genes and their distribution in the genome, we show that many protein-coding changes in the fastest-changing genes are not a result of selection operating on the genes, but instead result from biased fixation of AT-to-GC mutations. Our findings are consistent with a model of recombination-driven biased gene conversion. This leads to the provocative hypothesis that many of the genetic changes leading to human-specific characters may have been prompted by fixation of deleterious mutations. Natural selection is commonly believed to be the main engine of functional genetic change, but a separate neutral evolutionary process linked to recombination may have contributed significantly to the divergence of human proteins.
|
34
|
Duret L, Galtier N. Comment on "Human-specific gain of function in a developmental enhancer". Science 2009; 323:714; author reply 714. [PMID: 19197042 DOI: 10.1126/science.1165848] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Indexed: 11/02/2022]
Abstract
Prabhakar et al. (Reports, 5 September 2008, p. 1346) argued that the conserved noncoding sequence HACNS1 has undergone positive selection and contributed to human adaptation. However, the pattern of substitution in HACNS1 is more consistent with the neutral process of biased gene conversion (BGC). The reported human-specific gain of function is likely due to the accumulation of deleterious mutations driven by BGC, not positive selection.
Affiliation(s)
- Laurent Duret
- Université de Lyon, Université Lyon 1, CNRS, UMR5558, Laboratoire de Biométrie et Biologie Evolutive, F-69622, Villeurbanne, France.
|
35
|
Freudenberg J, Wang M, Yang Y, Li W. Partial correlation analysis indicates causal relationships between GC-content, exon density and recombination rate in the human genome. BMC Bioinformatics 2009; 10 Suppl 1:S66. [PMID: 19208170 PMCID: PMC2648766 DOI: 10.1186/1471-2105-10-s1-s66] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Indexed: 11/10/2022]
Abstract
BACKGROUND Several features are known to correlate with the GC-content in the human genome, including recombination rate, gene density and distance to telomere. However, by testing for pairwise correlation only, it is impossible to distinguish direct associations from indirect ones and to distinguish between causes and effects. RESULTS We use partial correlations to construct partially directed graphs for the following four variables: GC-content, recombination rate, exon density and distance-to-telomere. Recombination rate and exon density are unconditionally uncorrelated, but become inversely correlated by conditioning on GC-content. This pattern indicates a model where recombination rate and exon density are two independent causes of GC-content variation. CONCLUSION Causal inference and graphical models are useful methods to understand genome evolution and the mechanisms of isochore evolution in the human genome.
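The pattern reported in this abstract (two variables that are uncorrelated overall but become inversely correlated once GC-content is held fixed) can be reproduced with the textbook first-order partial correlation formula. The sketch below is only an illustration: the correlation values are made up, and the paper's own analysis involves four variables and partially directed graphs, which this does not attempt.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation between x and y, controlling for z,
    computed from the three pairwise Pearson correlations."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denom


if __name__ == "__main__":
    # Hypothetical values: recombination rate (x) and exon density (y)
    # are uncorrelated, but each correlates positively with GC-content (z).
    r_rec_exon = 0.0
    r_rec_gc = 0.5
    r_exon_gc = 0.5
    print(partial_correlation(r_rec_exon, r_rec_gc, r_exon_gc))
```

With these illustrative inputs the partial correlation is negative, which is the signature expected when two independent causes share a common effect.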
Affiliation(s)
- Jan Freudenberg
- The Robert S Boas Center for Genomics and Human Genetics, Feinstein Institute for Medical Research, North Shore LIJ Health System, Manhasset, NY 11030, USA.
|
36
|
Galtier N, Duret L, Glémin S, Ranwez V. GC-biased gene conversion promotes the fixation of deleterious amino acid changes in primates. Trends Genet 2009; 25:1-5. [DOI: 10.1016/j.tig.2008.10.011] [Citation(s) in RCA: 153] [Impact Index Per Article: 10.2] [Received: 07/17/2008] [Revised: 10/17/2008] [Accepted: 10/24/2008] [Indexed: 01/22/2023]
|
37
|
Cao J, Wu X, Jin Y. Lower GC-content in editing exons: implications for regulation by molecular characteristics maintained by selection. Gene 2008; 421:14-9. [PMID: 18632225 DOI: 10.1016/j.gene.2008.05.012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 09/14/2007] [Revised: 03/01/2008] [Accepted: 05/21/2008] [Indexed: 01/26/2023]
Abstract
We unexpectedly found that editing exons in the Drosophila synaptotagmin I transcripts have much lower GC3 and GC-content, and higher Gibbs free energy, than other exons; this was further confirmed statistically in 47 other experimentally validated samples. Sequence alignment and Ks and Ka/Ks analyses suggest that rapidly ascending purifying selection occurs in editing exons, which constrains nucleotide divergence. The presence of specific molecular characteristics such as lower GC-content in editing exons implies an unexpected requirement and is likely to direct the occurrence of RNA editing. Thus, relations between molecular characteristics of DNA, RNA editing and purifying selection might be present.
Affiliation(s)
- Jun Cao
- Institute of Biochemistry, College of Life Sciences, Zhejiang University (Zijingang Campus), Hangzhou, Zhejiang, ZJ310058, PR China
|
38
|
Duret L, Arndt PF. The impact of recombination on nucleotide substitutions in the human genome. PLoS Genet 2008; 4:e1000071. [PMID: 18464896 PMCID: PMC2346554 DOI: 10.1371/journal.pgen.1000071] [Citation(s) in RCA: 254] [Impact Index Per Article: 15.9] [Received: 11/05/2007] [Accepted: 04/11/2008] [Indexed: 01/19/2023]
Abstract
Unraveling the evolutionary forces responsible for variations of neutral substitution patterns among taxa or along genomes is a major issue for detecting selection within sequences. Mammalian genomes show large-scale regional variations of GC-content (the isochores), but the substitution processes at the origin of this structure are poorly understood. We analyzed the pattern of neutral substitutions in 1 Gb of primate non-coding regions. We show that the GC-content toward which sequences are evolving is strongly negatively correlated to the distance to telomeres and positively correlated to the rate of crossovers (R² = 47%). This demonstrates that recombination has a major impact on substitution patterns in human, driving the evolution of GC-content. The evolution of GC-content correlates much more strongly with male than with female crossover rate, which rules out selectionist models for the evolution of isochores. This effect of recombination is most probably a consequence of the neutral process of biased gene conversion (BGC) occurring within recombination hotspots. We show that the predictions of this model fit very well with the observed substitution patterns in the human genome. This model notably explains the positive correlation between substitution rate and recombination rate. Theoretical calculations indicate that variations in population size or density in recombination hotspots can have a very strong impact on the evolution of base composition. Furthermore, recombination hotspots can create strong substitution hotspots. This molecular drive affects both coding and non-coding regions. We therefore conclude that along with mutation, selection and drift, BGC is one of the major factors driving genome evolution. Our results also shed light on variations in the rate of crossover relative to non-crossover events, along chromosomes and according to sex, and also on the conservation of hotspot density between human and chimp.
Mammalian genomes show a very strong heterogeneity of base composition along chromosomes (the so-called isochores). The functional significance of these peculiar genomic landscapes is highly debated: do isochores confer some selective advantage, or are they simply the by-product of neutral evolutionary processes? To resolve this issue, we analyzed the pattern of substitution in the human genome by comparison with chimpanzee and macaque. We show that the evolution of base composition (GC-content) is essentially determined by the rate of recombination. This effect appears to be much stronger in male than in female germline, which rules out selective explanations for the evolution of isochores. We show that this impact of recombination is most probably a consequence of the process of biased gene conversion (BGC). This neutral process mimics the action of selection and can induce strong substitution hotspots within recombination hotspots, sometimes leading to the fixation of deleterious mutations. BGC appears to be one of the major factors driving genome evolution. It is therefore essential to take this process into account if we want to be able to interpret genome sequences.
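The "GC-content toward which sequences are evolving" mentioned in this abstract is commonly summarized as an equilibrium value. A minimal sketch follows, assuming a simple two-state model in which each AT site becomes GC at rate u and each GC site becomes AT at rate v; at equilibrium u(1 - GC) = v GC, so GC* = u / (u + v). The function name and the rates are hypothetical, and the model actually fitted in the paper is more detailed.

```python
def equilibrium_gc(at_to_gc_rate, gc_to_at_rate):
    """Equilibrium GC-content under a two-state substitution model.

    at_to_gc_rate: substitution rate per AT site toward GC (u).
    gc_to_at_rate: substitution rate per GC site toward AT (v).
    """
    if at_to_gc_rate < 0 or gc_to_at_rate < 0:
        raise ValueError("rates must be non-negative")
    total = at_to_gc_rate + gc_to_at_rate
    if total == 0:
        raise ValueError("at least one rate must be positive")
    return at_to_gc_rate / total


if __name__ == "__main__":
    # Illustrative rates in arbitrary units.
    low_recombination = equilibrium_gc(1.0, 2.0)
    # A process favouring AT-to-GC fixation (as BGC is proposed to do)
    # raises u relative to v and therefore raises the equilibrium.
    high_recombination = equilibrium_gc(2.0, 2.0)
    print(low_recombination, high_recombination)
```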
Affiliation(s)
- Laurent Duret
- Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, Université Lyon 1, CNRS, UMR 5558, Villeurbanne, France
- Corresponding author
- Peter F. Arndt
- Department for Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Corresponding author
|
39
|
Pozzoli U, Menozzi G, Fumagalli M, Cereda M, Comi GP, Cagliani R, Bresolin N, Sironi M. Both selective and neutral processes drive GC content evolution in the human genome. BMC Evol Biol 2008; 8:99. [PMID: 18371205 PMCID: PMC2292697 DOI: 10.1186/1471-2148-8-99] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.6] [Received: 12/18/2007] [Accepted: 03/27/2008] [Indexed: 11/10/2022]
Abstract
Background Mammalian genomes consist of regions differing in GC content, referred to as isochores or GC-content domains. The scientific debate is still open as to whether such compositional heterogeneity is a selected or neutral trait. Results Here we analyze SNP allele frequencies, retrotransposon insertion polymorphisms (RIPs), as well as fixed substitutions accumulated in the human lineage since its divergence from chimpanzee to indicate that biased gene conversion (BGC) has been playing a role in within-genome GC content variation. Yet, a distinct contribution to GC content evolution is accounted for by a selective process. Accordingly, we searched for independent evidences that GC content distribution does not conform to neutral expectations. Indeed, after correcting for possible biases, we show that intron GC content and size display isochore-specific correlations. Conclusion We consider that the more parsimonious explanation for our results is that GC content is subjected to the action of both weak selection and BGC in the human genome with features such as nucleosome positioning or chromatin conformation possibly representing the final target of selective processes. This view might reconcile previous contrasting findings and add some theoretical background to recent evidences suggesting that GC content domains display different behaviors with respect to highly regulated biological processes such as developmentally-stage related gene expression and programmed replication timing during neural stem cell differentiation.
Affiliation(s)
- Uberto Pozzoli
- Scientific Institute IRCCS E. Medea, Bioinformatic Lab, Via don L. Monza 20, 23842 Bosisio Parini (LC), Italy.
|
40
|
Bush EC, Lahn BT. A genome-wide screen for noncoding elements important in primate evolution. BMC Evol Biol 2008; 8:17. [PMID: 18215302 PMCID: PMC2242780 DOI: 10.1186/1471-2148-8-17] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.4] [Received: 08/20/2007] [Accepted: 01/23/2008] [Indexed: 01/23/2023]
Abstract
Background A major goal in the study of human evolution is to identify key genetic changes which occurred over the course of primate evolution. According to one school of thought, many such changes are likely to be found in noncoding sequence. An approach to identifying these involves comparing multiple genomes to identify conserved regions with an accelerated substitution rate in a particular lineage. Such acceleration could be the result of positive selection. Results Here we develop a likelihood ratio test method to identify such regions. We apply it not only to the human terminal lineage, as has been done in previous studies, but also to a number of other branches in the primate tree. We present the top scoring elements, and compare our results with previous studies. We also present resequencing data from one particular element accelerated on the human lineage. These data indicate that the element lies in a region of low polymorphism in humans, consistent with the possibility of a recent selective sweep. They also show that the AT to GC bias for polymorphism in this region differs dramatically from that for substitutions. Conclusion Our results suggest that screens of this type will be helpful in unraveling the complex set of changes which occurred during primate evolution.
Affiliation(s)
- Eliot C Bush
- Department of Human Genetics and Howard Hughes Medical Institute, University of Chicago, Chicago, Illinois, USA.
|
41
|
Karro JE, Peifer M, Hardison RC, Kollmann M, von Grünberg HH. Exponential decay of GC content detected by strand-symmetric substitution rates influences the evolution of isochore structure. Mol Biol Evol 2007; 25:362-74. [PMID: 18042807 DOI: 10.1093/molbev/msm261] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/14/2022]
Abstract
The distribution of guanine and cytosine nucleotides throughout a genome, or the GC content, is associated with numerous features in mammals; understanding the pattern and evolutionary history of GC content is crucial to our efforts to annotate the genome. The local GC content is decaying toward an equilibrium point, but the causes and rates of this decay, as well as the value of the equilibrium point, remain topics of debate. By comparing the results of 2 methods for estimating local substitution rates, we identify 620 Mb of the human genome in which the rates of the various types of nucleotide substitutions are the same on both strands. These strand-symmetric regions show an exponential decay of local GC content at a pace determined by local substitution rates. DNA segments subjected to higher rates experience disproportionately accelerated decay and are AT rich, whereas segments subjected to lower rates decay more slowly and are GC rich. Although we are unable to draw any conclusions about causal factors, the results support the hypothesis proposed by Khelifi A, Meunier J, Duret L, and Mouchiroud D (2006. GC content evolution of the human and mouse genomes: insights from the study of processed pseudogenes in regions of different recombination rates. J Mol Evol. 62:745-752.) that the isochore structure has been reshaped over time. If rate variation were a determining factor, then the current isochore structure of mammalian genomes could result from the local differences in substitution rates. We predict that under current conditions strand-symmetric portions of the human genome will stabilize at an average GC content of 30% (considerably less than the current 42%), thus confirming that the human genome has not yet reached equilibrium.
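The exponential decay toward an equilibrium described in this abstract can be sketched as a simple relaxation curve, GC(t) = GC* + (GC0 - GC*) exp(-k t). The 42% starting value and 30% equilibrium are the figures quoted in the abstract; the rate constant k, the time units, and the function names are hypothetical, and this generic curve is not the authors' estimation procedure.

```python
import math


def gc_content_at(t, gc_start, gc_equilibrium, rate):
    """GC-content after time t under exponential relaxation toward
    gc_equilibrium, starting from gc_start, with rate constant `rate`."""
    return gc_equilibrium + (gc_start - gc_equilibrium) * math.exp(-rate * t)


def half_time(rate):
    """Time needed to close half of the gap to equilibrium."""
    return math.log(2.0) / rate


if __name__ == "__main__":
    gc_start, gc_eq = 0.42, 0.30  # values quoted in the abstract
    k = 0.01                      # hypothetical rate, arbitrary time units
    for t in (0.0, half_time(k), 10.0 * half_time(k)):
        print(t, gc_content_at(t, gc_start, gc_eq, k))
```

A larger rate constant shortens the half-time, which matches the abstract's statement that segments subjected to higher substitution rates decay faster.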
Affiliation(s)
- J E Karro
- Department of Computer Science and Systems Analysis, Miami University, Ohio, USA.
|
42
|
Kelkar YD, Tyekucheva S, Chiaromonte F, Makova KD. The genome-wide determinants of human and chimpanzee microsatellite evolution. Genome Res 2007; 18:30-8. [PMID: 18032720 DOI: 10.1101/gr.7113408] [Citation(s) in RCA: 174] [Impact Index Per Article: 10.2] [Indexed: 12/24/2022]
Abstract
Mutation rates of microsatellites vary greatly among loci. The causes of this heterogeneity remain largely enigmatic yet are crucial for understanding numerous human neurological diseases and genetic instability in cancer. In this first genome-wide study, the relative contributions of intrinsic features and regional genomic factors to the variation in mutability among orthologous human-chimpanzee microsatellites are investigated with resampling and regression techniques. As a result, we uncover the intricacies of microsatellite mutagenesis as follows. First, intrinsic features (repeat number, length, and motif size), which all influence the probability and rate of slippage, are the strongest predictors of mutability. Second, mutability increases nonuniformly with length, suggesting that processes additional to slippage, such as faulty repair, contribute to mutations. Third, mutability varies among microsatellites with different motif composition likely due to dissimilarities in secondary DNA structure formed by their slippage intermediates. Fourth, mutability of mononucleotide microsatellites is impacted by their location on sex chromosomes vs. autosomes and inside vs. outside of Alu repeats, the former confirming the importance of replication and the latter suggesting a role for gene conversion. Fifth, transcription status and location in a particular isochore do not influence microsatellite mutability. Sixth, compared with intrinsic features, regional genomic factors have only minor effects. Finally, our regression models explain approximately 90% of variation in microsatellite mutability and can generate useful predictions for the studies of human diseases, forensics, and conservation genetics.
Affiliation(s)
- Yogeshwar D Kelkar
- Department of Biology, Penn State University, University Park, Pennsylvania 16802, USA
|
43
|
Biased distributions and decay of long interspersed nuclear elements in the chicken genome. Genetics 2007; 178:573-81. [PMID: 17947446 DOI: 10.1534/genetics.106.061861] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Indexed: 02/06/2023]
Abstract
The genomes of birds are much smaller than mammalian genomes, and transposable elements (TEs) make up only 10% of the chicken genome, compared with the 45% of the human genome. To study the mechanisms that constrain the copy numbers of TEs, and as a consequence the genome size of birds, we analyzed the distributions of LINEs (CR1's) and SINEs (MIRs) on the chicken autosomes and Z chromosome. We show that (1) CR1 repeats are longest on the Z chromosome and their length is negatively correlated with the local GC content; (2) the decay of CR1 elements is highly biased, and the 5'-ends of the insertions are lost much faster than their 3'-ends; (3) the GC distribution of CR1 repeats shows a bimodal pattern with repeats enriched in both AT-rich and GC-rich regions of the genome, but the CR1 families show large differences in their GC distribution; and (4) the few MIRs in the chicken are most abundant in regions with intermediate GC content. Our results indicate that the primary mechanism that removes repeats from the chicken genome is ectopic exchange and that the low abundance of repeats in avian genomes is likely to be the consequence of their high recombination rates.
|
44
|
Dreszer TR, Wall GD, Haussler D, Pollard KS. Biased clustered substitutions in the human genome: the footprints of male-driven biased gene conversion. Genome Res 2007; 17:1420-30. [PMID: 17785536 PMCID: PMC1987345 DOI: 10.1101/gr.6395807] [Citation(s) in RCA: 89] [Impact Index Per Article: 5.2] [Indexed: 11/25/2022]
Abstract
We examined fixed substitutions in the human lineage since divergence from the common ancestor with the chimpanzee, and determined what fraction are AT to GC (weak-to-strong). Substitutions that are densely clustered on the chromosomes show a remarkable excess of weak-to-strong "biased" substitutions. These unexpected biased clustered substitutions (UBCS) are common near the telomeres of all autosomes but not the sex chromosomes. Regions of extreme bias are enriched for genes. Human and chimp orthologous regions show a striking similarity in the shape and magnitude of their respective UBCS maps, suggesting a relatively stable force leads to clustered bias. The strong and stable signal near telomeres may have participated in the evolution of isochores. One exception to the UBCS pattern found in all autosomes is chromosome 2, which shows a UBCS peak midchromosome, mapping to the fusion site of two ancestral chromosomes. This provides evidence that the fusion occurred as recently as 740,000 years ago and no more than approximately 3 million years ago. No biased clustering was found in SNPs, suggesting that clusters of biased substitutions are selected from mutations. UBCS is strongly correlated with male (and not female) recombination rates, which explains the lack of UBCS signal on chromosome X. These observations support the hypothesis that biased gene conversion (BGC), specifically in the male germline, played a significant role in the evolution of the human genome.
MESH Headings
- Animals
- Chromosomes, Human, Pair 2/genetics
- Chromosomes, Human, X/genetics
- Chromosomes, Human, Y/genetics
- Evolution, Molecular
- Female
- Gene Conversion
- Gene Fusion
- Genome, Human
- Humans
- Male
- Models, Genetic
- Pan troglodytes/genetics
- Polymorphism, Single Nucleotide
- Recombination, Genetic
- Sex Characteristics
- Species Specificity
- Telomere/genetics
- Time Factors
Affiliation(s)
- Timothy R. Dreszer
- Department of Biomolecular Engineering, University of California, Santa Cruz, California 95064, USA
- Gregory D. Wall
- Department of Statistics, University of California, Davis, California 95616, USA
- David Haussler
- Department of Biomolecular Engineering, University of California, Santa Cruz, California 95064, USA
- Howard Hughes Medical Institute, University of California, Santa Cruz, California 95064, USA
- Corresponding authors; fax (831) 459-1809, fax (530) 754-9658
- Katherine S. Pollard
- Department of Statistics, University of California, Davis, California 95616, USA
- UC Davis Genome Center, University of California, Davis, California 95616, USA
- Corresponding authors; fax (831) 459-1809, fax (530) 754-9658
|
45
|
Ellegren H. Molecular evolutionary genomics of birds. Cytogenet Genome Res 2007; 117:120-30. [PMID: 17675852 DOI: 10.1159/000103172] [Citation(s) in RCA: 110] [Impact Index Per Article: 6.5] [Received: 08/11/2006] [Accepted: 09/09/2006] [Indexed: 11/19/2022]
Abstract
Insight into the molecular evolution of birds has been offered by the steady accumulation of avian DNA sequence data, recently culminating in the first draft sequence of an avian genome, that of chicken. By studying avian molecular evolution we can learn about adaptations and phenotypic evolution in birds, and also gain an understanding of the similarities and differences between mammalian and avian genomes. In both these lineages, there is pronounced isochore structure with highly variable GC content. However, while mammalian isochores are decaying, they are maintained in the chicken lineage, which is consistent with a biased gene conversion model where the high and variable recombination rate of birds reinforces heterogeneity in GC. In Galliformes, GC is positively correlated with the rate of nucleotide substitution; the mean neutral mutation rate is 0.12-0.15% at each site per million years but this estimate comes with significant local variation in the rate of mutation. Comparative genomics reveals lower dN/dS ratios on micro- compared to macrochromosomes, possibly due to population genetic effects or a non-random distribution of genes with respect to chromosome size. A non-random genomic distribution is shown by genes with sex-biased expression, with male-biased genes over-represented and female-biased genes under-represented on the Z chromosome. A strong effect of selection is evident on the non-recombining W chromosome with high dN/dS ratios and limited polymorphism. Nucleotide diversity in chicken is estimated at 4-5 × 10^-3 which might be seen as surprisingly high given presumed bottlenecks during domestication, but is lower than that recently observed in several natural populations of other species.
Several important aspects of the molecular evolutionary process of birds remain to be understood and it can be anticipated that the upcoming genome sequence of a second bird species, the zebra finch, as well as the integration of data on gene expression, shall further advance our knowledge of avian evolution.
Affiliation(s)
- H Ellegren
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden.
|
46
|
A macaque's-eye view of human insertions and deletions: differences in mechanisms. PLoS Comput Biol 2007; 3:1772-82. [PMID: 17941704 PMCID: PMC1976337 DOI: 10.1371/journal.pcbi.0030176] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.0] [Received: 06/04/2007] [Accepted: 07/26/2007] [Indexed: 11/19/2022]
Abstract
Insertions and deletions (indels) cause numerous genetic diseases and lead to pronounced evolutionary differences among genomes. The macaque sequences provide an opportunity to gain insights into the mechanisms generating these mutations on a genome-wide scale by establishing the polarity of indels occurring in the human lineage since its divergence from the chimpanzee. Here we apply novel regression techniques and multiscale analyses to demonstrate an extensive regional indel rate variation stemming from local fluctuations in divergence, GC content, male and female recombination rates, proximity to telomeres, and other genomic factors. We find that both replication and, surprisingly, recombination are significantly associated with the occurrence of small indels. Intriguingly, the relative inputs of replication versus recombination differ between insertions and deletions, thus the two types of mutations are likely guided in part by distinct mechanisms. Namely, insertions are more strongly associated with factors linked to recombination, while deletions are mostly associated with replication-related features. Indel as a term misleadingly groups the two types of mutations together by their effect on a sequence alignment. However, here we establish that the correct identification of a small gap as an insertion or a deletion (by use of an outgroup) is crucial to determining its mechanism of origin. In addition to providing novel insights into insertion and deletion mutagenesis, these results will assist in gap penalty modeling and eventually lead to more reliable genomic alignments.
|
47
|
Galtier N, Duret L. Adaptation or biased gene conversion? Extending the null hypothesis of molecular evolution. Trends Genet 2007; 23:273-7. [PMID: 17418442 DOI: 10.1016/j.tig.2007.03.011] [Citation(s) in RCA: 165] [Impact Index Per Article: 9.7] [Received: 11/20/2006] [Revised: 03/01/2007] [Accepted: 03/21/2007] [Indexed: 11/26/2022]
Abstract
The analysis of evolutionary rates is a popular approach to characterizing the effect of natural selection at the molecular level. Sequences contributing to species adaptation are expected to evolve faster than nonfunctional sequences because favourable mutations have a higher fixation probability than neutral ones. Such an accelerated rate of evolution might be due to factors other than natural selection, in particular GC-biased gene conversion. This is true of neutral sequences, but also of constrained sequences, which can be illustrated using the mouse Fxy gene. Several criteria can discriminate between the natural selection and biased gene conversion models. These criteria suggest that the recently reported human accelerated regions are most likely the result of biased gene conversion. We argue that these regions, far from contributing to human adaptation, might represent the Achilles' heel of our genome.
Affiliation(s)
- Nicolas Galtier
- CNRS UMR 5554 - Institut des Sciences de l'Evolution, Université Montpellier 2 - CC64, Place E. Bataillon, 34095 Montpellier Cedex, France.
|
48
|
Abstract
Mutation has traditionally been considered a random process, but this paradigm is challenged by recent evidence of divergence rate heterogeneity in different genomic regions. One facet of mutation rate variation is the propensity for genetic change to correlate with the number of germ cell divisions, reflecting the replication-dependent origin of many mutations. Haldane was the first to connect this association of replication and mutation to the difference in the number of cell divisions in oogenesis (low) and spermatogenesis (usually high), and the resulting sex difference in the rate of mutation. The concept of male-biased mutation has been thoroughly analysed in recent years using an evolutionary approach, in which sequence divergence of autosomes and/or sex chromosomes is compared to allow inference about the relative contribution of mothers and fathers in the accumulation of mutations. For instance, assuming that a neutral sequence is analysed, that rate heterogeneity owing to other factors is cancelled out by the investigation of many loci and that the effect of ancestral polymorphism is properly taken into account, the male-to-female mutation rate ratio, α_m, can be solved from the observed difference in rate of X and Y chromosome divergence. The male mutation bias is positively correlated with the relative excess of cell divisions in the male compared to the female germ line, as evidenced by a generation time effect: in mammals, α_m is estimated at approximately 4-6 in primates, approximately 3 in carnivores and approximately 2 in small rodents. Another life-history correlate is sexual selection: when there is intense sperm competition among males, increased sperm production will be associated with a larger number of mitotic cell divisions in spermatogenesis and hence an increase in α_m.
Male-biased mutation has implications for important aspects of evolutionary biology such as mate choice in relation to mutation load, sexual selection and the maintenance of genetic diversity despite strong directional selection, the tendency for a disproportionately large role of the X (Z) chromosome in post-zygotic isolation, and the evolution of sex.
Affiliation(s)
- Hans Ellegren
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, 752 36 Uppsala, Sweden.
|
49
|
Schmegner C, Hoegel J, Vogel W, Assum G. The rate, not the spectrum, of base pair substitutions changes at a GC-content transition in the human NF1 gene region: implications for the evolution of the mammalian genome structure. Genetics 2006; 175:421-8. [PMID: 17057231 PMCID: PMC1775011 DOI: 10.1534/genetics.106.064386] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022] Open
Abstract
The human genome is composed of long stretches of DNA with distinct GC contents, called isochores or GC-content domains. A boundary between two GC-content domains in the human NF1 gene region is also a boundary between domains of early- and late-replicating sequences and of regions with high and low recombination frequencies. The perfect conservation of the GC-content distribution in this region between human and mouse demonstrates that GC-content stabilizing forces must act regionally on a fine scale at this locus. To further elucidate the nature of these forces, we report here on the spectrum of human SNPs and base pair substitutions between human and chimpanzee. The results show that the mutation rate changes exactly at the GC-content transition zone from low values in the GC-poor sequences to high values in GC-rich ones. The GC content of the GC-poor sequences can be explained by a bias in favor of GC > AT mutations, whereas the GC content of the GC-rich segment may result from a fixation bias in favor of AT > GC substitutions. This fixation bias may be explained by direct selection by the GC content or by biased gene conversion.
|
50
|
Gaffney DJ, Keightley PD. Genomic selective constraints in murid noncoding DNA. PLoS Genet 2006; 2:e204. [PMID: 17166057 PMCID: PMC1657059 DOI: 10.1371/journal.pgen.0020204] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.2] [Received: 06/28/2006] [Accepted: 10/18/2006] [Indexed: 02/04/2023] Open
Abstract
Recent work has suggested that there are many more selectively constrained, functional noncoding than coding sites in mammalian genomes. However, little is known about how selective constraint varies amongst different classes of noncoding DNA. We estimated the magnitude of selective constraint on a large dataset of mouse-rat gene orthologs and their surrounding noncoding DNA. Our analysis indicates that there are more than three times as many selectively constrained, nonrepetitive sites within noncoding DNA as in coding DNA in murids. The majority of these constrained noncoding sites appear to be located within intergenic regions, at distances greater than 5 kilobases from known genes. Our study also shows that in murids, intron length and mean intronic selective constraint are negatively correlated with intron ordinal number. Our results therefore suggest that functional intronic sites tend to accumulate toward the 5′ end of murid genes. Our analysis also reveals that mean number of selectively constrained noncoding sites varies substantially with the function of the adjacent gene. We find that, among others, developmental and neuronal genes are associated with the greatest numbers of putatively functional noncoding sites compared with genes involved in electron transport and a variety of metabolic processes. Combining our estimates of the total number of constrained coding and noncoding bases we calculate that over twice as many deleterious mutations have occurred in intergenic regions as in known genic sequence and that the total genomic deleterious point mutation rate is 0.91 per diploid genome, per generation. This estimated rate is over twice as large as a previous estimate in murids.
Most DNA can typically be divided into two categories: regions that encode the instructions for the assembly of a protein molecule (protein-coding genes) and those that do not (noncoding). Although mammalian genomes are primarily noncoding, relatively little is known about how much of this is functional, where such regions are found in the genome, and what functions they are likely to perform. In this study, the authors investigated the quantity and location of functional noncoding DNA in mice and rats. They estimate that functional noncoding DNA is at least three times as common as coding DNA in rodents, and the majority is located large distances from known protein-coding genes. Putatively functional intronic DNA tends to be clustered towards the gene 5′ end, suggesting that much intronic sequence is instrumental in regulating gene expression. This study also finds that genes involved in development and the nervous system are typically associated with much higher quantities of functional noncoding DNA, suggesting that these genes require more finely tuned control of their expression. One implication of this study is the finding that disease-causing mutations have occurred more frequently in noncoding regions and may have affected gene expression, rather than protein structure.
Affiliation(s)
- Daniel J Gaffney
- Institute of Evolutionary Biology, Ashworth Laboratories, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom.
|