1
Marsh JI, Johri P. Biases in ARG-Based Inference of Historical Population Size in Populations Experiencing Selection. Mol Biol Evol 2024; 41:msae118. [PMID: 38874402 PMCID: PMC11245712 DOI: 10.1093/molbev/msae118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 06/05/2024] [Accepted: 06/11/2024] [Indexed: 06/15/2024] Open
Abstract
Inferring the demographic history of populations provides fundamental insights into species dynamics and is essential for developing a null model to accurately study selective processes. However, background selection and selective sweeps can produce genomic signatures at linked sites that mimic or mask signals associated with historical population size change. While the theoretical biases introduced by the linked effects of selection have been well established, it is unclear whether ancestral recombination graph (ARG)-based approaches to demographic inference in typical empirical analyses are susceptible to misinference due to these effects. To address this, we developed highly realistic forward simulations of human and Drosophila melanogaster populations, including empirically estimated variability of gene density, mutation rates, recombination rates, purifying, and positive selection, across different historical demographic scenarios, to broadly assess the impact of selection on demographic inference using a genealogy-based approach. Our results indicate that the linked effects of selection minimally impact demographic inference for human populations, although it could cause misinference in populations with similar genome architecture and population parameters experiencing more frequent recurrent sweeps. We found that accurate demographic inference of D. melanogaster populations by ARG-based methods is compromised by the presence of pervasive background selection alone, leading to spurious inferences of recent population expansion, which may be further worsened by recurrent sweeps, depending on the proportion and strength of beneficial mutations. Caution and additional testing with species-specific simulations are needed when inferring population history with non-human populations using ARG-based approaches to avoid misinference due to the linked effects of selection.
Affiliation(s)
- Jacob I Marsh
- Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
- Parul Johri
- Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Integrative Program for Biological and Genome Sciences, University of North Carolina, Chapel Hill, NC 27599, USA
2
Luthra I, Jensen C, Chen XE, Salaudeen AL, Rafi AM, de Boer CG. Regulatory activity is the default DNA state in eukaryotes. Nat Struct Mol Biol 2024; 31:559-567. [PMID: 38448573 DOI: 10.1038/s41594-024-01235-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 11/29/2023] [Accepted: 01/29/2024] [Indexed: 03/08/2024]
Abstract
Genomes encode for genes and non-coding DNA, both capable of transcriptional activity. However, unlike canonical genes, many transcripts from non-coding DNA have limited evidence of conservation or function. Here, to determine how much biological noise is expected from non-genic sequences, we quantify the regulatory activity of evolutionarily naive DNA using RNA-seq in yeast and computational predictions in humans. In yeast, more than 99% of naive DNA bases were transcribed. Unlike the evolved transcriptome, naive transcripts frequently overlapped with opposite sense transcripts, suggesting selection favored coherent gene structures in the yeast genome. In humans, regulation-associated chromatin activity is predicted to be common in naive dinucleotide-content-matched randomized DNA. Here, naive and evolved DNA have similar co-occurrence and cell-type specificity of chromatin marks, challenging these as indicators of selection. However, in both yeast and humans, extreme high activities were rare in naive DNA, suggesting they result from selection. Overall, basal regulatory activity seems to be the default, which selection can hone to evolve a function or, if detrimental, repress.
Affiliation(s)
- Ishika Luthra
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Cassandra Jensen
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Xinyi E Chen
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Asfar Lathif Salaudeen
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Abdul Muntakim Rafi
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Carl G de Boer
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
3
Reiner WB, Masao F, Sholts SB, Songita AV, Stanistreet I, Stollhofen H, Taylor RE, Hlusko LJ. OH 83: A new early modern human fossil cranium from the Ndutu beds of Olduvai Gorge, Tanzania. AMERICAN JOURNAL OF PHYSICAL ANTHROPOLOGY 2017; 164:533-545. [PMID: 28786473 DOI: 10.1002/ajpa.23292] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/23/2016] [Revised: 05/02/2017] [Accepted: 07/23/2017] [Indexed: 01/23/2023]
Abstract
OBJECTIVE: Herein we introduce a newly recovered partial calvaria, OH 83, from the upper Ndutu Beds of Olduvai Gorge, Tanzania. We present the geological context of its discovery and a comparative analysis of its morphology, placing OH 83 within the context of our current understanding of the origins and evolution of Homo sapiens. MATERIALS AND METHODS: We comparatively assessed the morphology of OH 83 using quantitative and qualitative data from penecontemporaneous fossils and the W.W. Howells modern human craniometric dataset. RESULTS: OH 83 is geologically dated to ca. 60-32 ka. Its morphology is indicative of an early modern human, falling at the low end of the range of variation for post-orbital cranial breadth, the high end of the range for bifrontal breadth, and near average in frontal length. DISCUSSION: There have been numerous attempts to use cranial anatomy to define the species Homo sapiens and identify it in the fossil record. These efforts have not met wide agreement by the scientific community due, in part, to the mosaic patterns of cranial variation represented by the fossils. The variable, mosaic pattern of trait expression in the crania of Middle and Late Pleistocene fossils implies that morphological modernity did not occur at once. However, OH 83 demonstrates that by ca. 60-32 ka modern humans in Africa included individuals that are at the fairly small and gracile range of modern human cranial variation.
Affiliation(s)
- Whitney B Reiner
- Department of Integrative Biology, University of California Berkeley, MC 3140, Berkeley, California, 94720
- Fidelis Masao
- University of Dar es Salaam, Dar es Salaam, TZ, 35091
- Conservation Olduvai Project, Dar es Salaam, TZ, 35091
- Sabrina B Sholts
- Department of Anthropology, National Museum of Natural History, Smithsonian Institution, Washington, DC, 20560
- Ian Stanistreet
- University of Liverpool, Liverpool, L69 3GP, UK
- The Stone Age Institute, Bloomington, Indiana, 47407
- Harald Stollhofen
- GeoZentrum Nordbayern, Universität Erlangen-Nürnberg, Erlangen, 91054, Germany
- R E Taylor
- University of California Riverside, Riverside, California, 92521
- Leslea J Hlusko
- Department of Integrative Biology, University of California Berkeley, MC 3140, Berkeley, California, 94720
4
Evolution and dispersal of the genus Homo: A landscape approach. J Hum Evol 2015; 87:48-65. [PMID: 26235482 DOI: 10.1016/j.jhevol.2015.07.002] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 12/23/2013] [Revised: 07/05/2015] [Accepted: 07/09/2015] [Indexed: 02/07/2023]
Abstract
The notion of the physical landscape as an arena of ecological interaction and human evolution is a powerful one, but its implementation at larger geographical and temporal scales is hampered by the challenges of reconstructing physical landscape settings in the geologically active regions where the earliest evidence is concentrated. We argue that the inherently dynamic nature of these unstable landscapes has made them important agents of biological change, creating complex topographies capable of selecting for, stimulating, obstructing or accelerating the latent and emerging properties of the human evolutionary trajectory. We use this approach, drawing on the concepts and methods of active tectonics, to develop a new perspective on the origins and dispersal of the Homo genus. We show how complex topography provides an easy evolutionary pathway to full terrestrialisation in the African context, and would have further equipped members of the genus Homo with a suite of adaptive characteristics that facilitated wide-ranging dispersal across ecological and climatic boundaries into Europe and Asia by following pathways of complex topography. We compare this hypothesis with alternative explanations for hominin dispersal, and evaluate it by mapping the distribution of topographic features at varying scales, and comparing the distribution of early Homo sites with the resulting maps and with other environmental variables.
5
6
Ortega VE, Meyers DA. Pharmacogenetics: implications of race and ethnicity on defining genetic profiles for personalized medicine. J Allergy Clin Immunol 2014; 133:16-26. [PMID: 24369795 DOI: 10.1016/j.jaci.2013.10.040] [Citation(s) in RCA: 153] [Impact Index Per Article: 13.9] [Received: 08/29/2013] [Revised: 10/22/2013] [Accepted: 10/23/2013] [Indexed: 01/06/2023]
Abstract
Pharmacogenetics is being used to develop personalized therapies specific to subjects from different ethnic or racial groups. To date, pharmacogenetic studies have been primarily performed in trial cohorts consisting of non-Hispanic white subjects of European descent. A "bottleneck" or collapse of genetic diversity associated with the first human colonization of Europe during the Upper Paleolithic period, followed by the recent mixing of African, European, and Native American ancestries, has resulted in different ethnic groups with varying degrees of genetic diversity. Differences in genetic ancestry might introduce genetic variation, which has the potential to alter the therapeutic efficacy of commonly used asthma therapies, such as β2-adrenergic receptor agonists (β-agonists). Pharmacogenetic studies of admixed ethnic groups have been limited to small candidate gene association studies, of which the best example is the gene coding for the receptor target of β-agonist therapy, the β2-adrenergic receptor (ADRB2). Large consortium-based sequencing studies are using next-generation whole-genome sequencing to provide a diverse genome map of different admixed populations, which can be used for future pharmacogenetic studies. These studies will include candidate gene studies, genome-wide association studies, and whole-genome admixture-based approaches that account for ancestral genetic structure, complex haplotypes, gene-gene interactions, and rare variants to detect and replicate novel pharmacogenetic loci.
Affiliation(s)
- Victor E Ortega
- Center for Genomics and Personalized Medicine, Wake Forest School of Medicine, Winston-Salem, NC
- Deborah A Meyers
- Center for Genomics and Personalized Medicine, Wake Forest School of Medicine, Winston-Salem, NC
7
Blumenstiel JP, Chen X, He M, Bergman CM. An age-of-allele test of neutrality for transposable element insertions. Genetics 2014; 196:523-38. [PMID: 24336751 PMCID: PMC3914624 DOI: 10.1534/genetics.113.158147] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Received: 09/30/2013] [Accepted: 12/06/2013] [Indexed: 01/31/2023] Open
Abstract
How natural selection acts to limit the proliferation of transposable elements (TEs) in genomes has been of interest to evolutionary biologists for many years. To describe TE dynamics in populations, previous studies have used models of transposition-selection equilibrium that assume a constant rate of transposition. However, since TE invasions are known to happen in bursts through time, this assumption may not be reasonable. Here we propose a test of neutrality for TE insertions that does not rely on the assumption of a constant transposition rate. We consider the case of TE insertions that have been ascertained from a single haploid reference genome sequence. By conditioning on the age of an individual TE insertion allele (inferred by the number of unique substitutions that have occurred within the particular TE sequence since insertion), we determine the probability distribution of the insertion allele frequency in a population sample under neutrality. Taking models of varying population size into account, we then evaluate predictions of our model against allele frequency data from 190 retrotransposon insertions sampled from North American and African populations of Drosophila melanogaster. Using this nonequilibrium neutral model, we are able to explain ∼ 80% of the variance in TE insertion allele frequencies based on age alone. Controlling for both nonequilibrium dynamics of transposition and host demography, we provide evidence for negative selection acting against most TEs as well as for positive selection acting on a small subset of TEs. Our work establishes a new framework for the analysis of the evolutionary forces governing large insertion mutations like TEs, gene duplications, or other copy number variants.
Affiliation(s)
- Justin P. Blumenstiel
- Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas 66049
- Xi Chen
- Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas 66049
- Miaomiao He
- Faculty of Life Sciences, University of Manchester, Manchester M21 0RG, United Kingdom
- Casey M. Bergman
- Faculty of Life Sciences, University of Manchester, Manchester M21 0RG, United Kingdom
8
Kumari V, Iyer LR, Roy R, Bhargava V, Panda S, Paul J, Verweij JJ, Clark CG, Bhattacharya A, Bhattacharya S. Genomic distribution of SINEs in Entamoeba histolytica strains: implication for genotyping. BMC Genomics 2013; 14:432. [PMID: 23815468 PMCID: PMC3716655 DOI: 10.1186/1471-2164-14-432] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 11/26/2012] [Accepted: 06/20/2013] [Indexed: 11/01/2022] Open
Abstract
BACKGROUND: The major clinical manifestations of Entamoeba histolytica infection include amebic colitis and liver abscess. However, the majority of infections remain asymptomatic. Earlier reports have shown that some E. histolytica isolates are more virulent than others, suggesting that virulence may be linked to genotype. Here we have looked at the genomic distribution of the retrotransposable short interspersed nuclear elements EhSINE1 and EhSINE2. Due to their mobile nature, some EhSINE copies may occupy different genomic locations among isolates of E. histolytica, possibly affecting adjacent gene expression; this variability in location can be exploited to differentiate strains. RESULTS: We have looked for EhSINE1- and EhSINE2-occupied loci in the genome sequence of Entamoeba histolytica HM-1:IMSS and searched for homologous loci in other strains to determine the insertion status of these elements. A total of 393 EhSINE1 and 119 EhSINE2 loci were analyzed in the available sequenced strains (Rahman, DS4-868, HM1:CA, KU48, KU50, KU27 and MS96-3382). Seventeen loci (13 EhSINE1 and 4 EhSINE2) were identified where an EhSINE1/EhSINE2 sequence was missing from the corresponding locus of other strains. Most of these loci were unoccupied in more than one strain. Some of the loci were analyzed experimentally for SINE occupancy using DNA from strain Rahman. These data helped to correctly assemble the nucleotide sequence at three loci in Rahman. SINE occupancy was also checked at these three loci in 7 other axenically cultivated E. histolytica strains and 16 clinical isolates. Each locus gave a single, specific amplicon with the primer sets used, making this a suitable method for strain typing. Based on presence/absence of SINE and amplification with locus-specific primers, the 23 strains could be divided into eleven genotypes. The results obtained by our method correlated with the data from other typing methods. We also report a bioinformatic analysis of EhSINE2 copies.
CONCLUSIONS: Our results reveal several loci with extensive polymorphism of SINE occupancy among different strains of E. histolytica and prove the principle that the genomic distribution of SINEs is a valid method for typing of E. histolytica strains.
Affiliation(s)
- Vandana Kumari
- School of Environmental Sciences, Jawaharlal Nehru University, New Delhi 110067, India
- Lakshmi Rani Iyer
- School of Life Sciences, Jawaharlal Nehru University, New Delhi 110067, India
- Riti Roy
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
- Varsha Bhargava
- School of Environmental Sciences, Jawaharlal Nehru University, New Delhi 110067, India
- Suchita Panda
- School of Environmental Sciences, Jawaharlal Nehru University, New Delhi 110067, India
- Jaishree Paul
- School of Life Sciences, Jawaharlal Nehru University, New Delhi 110067, India
- Jaco J Verweij
- Laboratory for Medical Microbiology and Immunology, Laboratory for Clinical Pathology, St. Elisabeth Hospital, Tilburg, The Netherlands
- C Graham Clark
- Department of Pathogen Molecular Biology, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
- Alok Bhattacharya
- School of Life Sciences, Jawaharlal Nehru University, New Delhi 110067, India
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
- Sudha Bhattacharya
- School of Environmental Sciences, Jawaharlal Nehru University, New Delhi 110067, India
9
Sargsyan O. Analytical framework for identifying and differentiating recent hitchhiking and severe bottleneck effects from multi-locus DNA sequence data. PLoS One 2012; 7:e37588. [PMID: 22662176 PMCID: PMC3360760 DOI: 10.1371/journal.pone.0037588] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/06/2012] [Accepted: 04/21/2012] [Indexed: 11/19/2022] Open
Abstract
Hitchhiking and severe bottleneck effects have an impact on the dynamics of genetic diversity of a population by inducing homogenization at a single locus and at the genome-wide scale, respectively. As a result, identification and differentiation of the signatures of such events from DNA sequence data at a single locus is challenging. This paper develops an analytical framework for identifying and differentiating recent homogenization events at multiple neutral loci in low recombination regions. The dynamics of genetic diversity at a locus after a recent homogenization event is modeled according to the infinite-sites mutation model and the Wright-Fisher model of reproduction with constant population size. In this setting, I derive analytical expressions for the distribution, mean, and variance of the number of polymorphic sites in a random sample of DNA sequences from a locus affected by a recent homogenization event. Based on this framework, three likelihood-ratio based tests are presented for identifying and differentiating recent homogenization events at multiple loci. Lastly, I apply the framework to two data sets. First, I consider human DNA sequences from four non-coding loci on different chromosomes for inferring evolutionary history of modern human populations. The results suggest, in particular, that recent homogenization events at the loci are identifiable when the effective human population size is 50,000 or greater in contrast to 10,000, and the estimates of the recent homogenization events agree with the "Out of Africa" hypothesis. Second, I use HIV DNA sequences from HIV-1-infected patients to infer the times of HIV seroconversions. The estimates are contrasted with other estimates derived as the mid-time point between the last HIV-negative and first HIV-positive screening tests. The results show that significant discrepancies can exist between the estimates.
Affiliation(s)
- Ori Sargsyan
- Theoretical Biology and Biophysics and Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
10
Beyin A. Upper Pleistocene Human Dispersals out of Africa: A Review of the Current State of the Debate. INTERNATIONAL JOURNAL OF EVOLUTIONARY BIOLOGY 2011; 2011:615094. [PMID: 21716744 PMCID: PMC3119552 DOI: 10.4061/2011/615094] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Received: 10/15/2010] [Revised: 01/22/2011] [Accepted: 02/24/2011] [Indexed: 12/31/2022]
Abstract
Although there is a general consensus on African origin of early modern humans, there is disagreement about how and when they dispersed to Eurasia. This paper reviews genetic and Middle Stone Age/Middle Paleolithic archaeological literature from northeast Africa, Arabia, and the Levant to assess the timing and geographic backgrounds of Upper Pleistocene human colonization of Eurasia. At the center of the discussion lies the question of whether eastern Africa alone was the source of Upper Pleistocene human dispersals into Eurasia or were there other loci of human expansions outside of Africa? The reviewed literature hints at two modes of early modern human colonization of Eurasia in the Upper Pleistocene: (i) from multiple Homo sapiens source populations that had entered Arabia, South Asia, and the Levant prior to and soon after the onset of the Last Interglacial (MIS-5), (ii) from a rapid dispersal out of East Africa via the Southern Route (across the Red Sea basin), dating to ~74–60 kya.
Affiliation(s)
- Amanuel Beyin
- Turkana Basin Institute, Stony Brook University, SBS Building 5th Floor, Stony Brook, NY 11794, USA
11
Huff CD, Xing J, Rogers AR, Witherspoon D, Jorde LB. Mobile elements reveal small population size in the ancient ancestors of Homo sapiens. Proc Natl Acad Sci U S A 2010; 107:2147-52. [PMID: 20133859 PMCID: PMC2836654 DOI: 10.1073/pnas.0909000107] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Indexed: 11/18/2022] Open
Abstract
The genealogies of different genetic loci vary in depth. The deeper the genealogy, the greater the chance that it will include a rare event, such as the insertion of a mobile element. Therefore, the genealogy of a region that contains a mobile element is on average older than that of the rest of the genome. In a simple demographic model, the expected time to most recent common ancestor (TMRCA) is doubled if a rare insertion is present. We test this expectation by examining single nucleotide polymorphisms around polymorphic Alu insertions from two completely sequenced human genomes. The estimated TMRCA for regions containing a polymorphic insertion is two times larger than the genomic average (P << 10^-30), as predicted. Because genealogies that contain polymorphic mobile elements are old, they are shaped largely by the forces of ancient population history and are insensitive to recent demographic events, such as bottlenecks and expansions. Remarkably, the information in just two human DNA sequences provides substantial information about ancient human population size. By comparing the likelihood of various demographic models, we estimate that the effective population size of human ancestors living before 1.2 million years ago was 18,500, and we can reject all models where the ancient effective population size was larger than 26,000. This result implies an unusually small population for a species spread across the entire Old World, particularly in light of the effective population sizes of chimpanzees (21,000) and gorillas (25,000), which each inhabit only one part of a single continent.
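The doubling of the expected TMRCA stated in the abstract can be illustrated with a small simulation. This is a minimal sketch and not the authors' method: it assumes a constant-size population, two sampled sequences per locus, and pairwise coalescence times that are exponentially distributed with mean 1 (in units of 2N generations). The function name `mean_tmrca` and its parameters are hypothetical.

```python
import random


def mean_tmrca(num_loci, seed=0):
    """Return (plain mean TMRCA, mean TMRCA of loci carrying a rare insertion).

    Assumption: pairwise coalescence times are exponential with mean 1,
    measured in units of 2N generations (constant population size).
    """
    rng = random.Random(seed)
    times = [rng.expovariate(1.0) for _ in range(num_loci)]
    plain = sum(times) / num_loci
    # A rare insertion falls on a genealogy with probability proportional to
    # its total branch length (2 * t for two sequences), so loci that carry
    # one form a length-biased sample: weight each time t by t itself.
    weighted = sum(t * t for t in times) / sum(times)
    return plain, weighted


plain, weighted = mean_tmrca(200_000)
ratio = weighted / plain
print(f"mean TMRCA, all loci:            {plain:.3f}")
print(f"mean TMRCA, loci with insertion: {weighted:.3f}")
print(f"ratio:                           {ratio:.3f}")
```

Under these assumptions the ratio should come out close to 2, because for an exponential variable E[T^2]/E[T] = 2 E[T]. More complex demographic models would change the distribution of coalescence times and therefore the ratio.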
Affiliation(s)
- Chad D. Huff
- Department of Human Genetics, Eccles Institute of Human Genetics
- Jinchuan Xing
- Department of Human Genetics, Eccles Institute of Human Genetics
- Alan R. Rogers
- Department of Anthropology, University of Utah, Salt Lake City, UT 84112
- Lynn B. Jorde
- Department of Human Genetics, Eccles Institute of Human Genetics
12
Maydan JS, Lorch A, Edgley ML, Flibotte S, Moerman DG. Copy number variation in the genomes of twelve natural isolates of Caenorhabditis elegans. BMC Genomics 2010; 11:62. [PMID: 20100350 PMCID: PMC2822765 DOI: 10.1186/1471-2164-11-62] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.9] [Received: 09/11/2009] [Accepted: 01/25/2010] [Indexed: 11/23/2022] Open
Abstract
Background: Copy number variation is an important component of genetic variation in higher eukaryotes. The extent of natural copy number variation in C. elegans is unknown outside of 2 highly divergent wild isolates and the canonical N2 Bristol strain. Results: We have used array comparative genomic hybridization (aCGH) to detect copy number variation in the genomes of 12 natural isolates of Caenorhabditis elegans. Deletions relative to the canonical N2 strain are more common in these isolates than duplications, and indels are enriched in multigene families on the autosome arms. Among the strains in our study, the Hawaiian and Madeiran strains (CB4856 and JU258) carry the largest number of deletions, followed by the Vancouver strain (KR314). Overall we detected 510 different deletions affecting 1136 genes, or over 5% of the genes in the canonical N2 genome. The indels we identified had a median length of 2.7 kb. Since many deletions are found in multiple isolates, deletion loci were used as markers to derive an unrooted tree to estimate genetic relatedness among the strains. Conclusion: Copy number variation is extensive in C. elegans, affecting over 5% of the genes in the genome. The deletions we have detected in natural isolates of C. elegans contribute significantly to the number of deletion alleles available to researchers. The relationships between strains are complex and different regions of the genome possess different genealogies due to recombination throughout the natural history of the species, which may not be apparent in studies utilizing smaller numbers of genetic markers.
Affiliation(s)
- Jason S Maydan
- Department of Zoology, University of British Columbia, British Columbia, Canada
13
Abstract
Natural selection on codon usage is a pervasive force that acts on a large variety of prokaryotic and eukaryotic genomes. Despite this, obtaining reliable estimates of selection on codon usage has proved complicated, perhaps due to the fact that the selection coefficients involved are very small. In this work, a population genetics model is used to measure the strength of selected codon usage bias, S, in 10 eukaryotic genomes. It is shown that the strength of selection is closely linked to expression and that reliable estimates of selection coefficients can only be obtained for genes with very similar expression levels. We compare the strength of selected codon usage for orthologous genes across all 10 genomes classified according to expression categories. Fungi genomes present the largest S values (2.24-2.56), whereas multicellular invertebrate and plant genomes present more moderate values (0.61-1.91). The large mammalian genomes (human and mouse) show low S values (0.22-0.51) for the most highly expressed genes. This might not be evidence for selection in these organisms as the technique used here to estimate S does not properly account for nucleotide composition heterogeneity along such genomes. The relationship between estimated S values and empirical estimates of population size is presented here for the first time. It is shown, as theoretically expected, that population size has an important role in the operativity of translational selection.
Affiliation(s)
- Mario dos Reis
- School of Crystallography, Birkbeck College, London, UK
14
Cox MP, Woerner AE, Wall JD, Hammer MF. Intergenic DNA sequences from the human X chromosome reveal high rates of global gene flow. BMC Genet 2008; 9:76. [PMID: 19038041 PMCID: PMC2620354 DOI: 10.1186/1471-2156-9-76] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 07/16/2008] [Accepted: 11/27/2008] [Indexed: 11/13/2022] Open
Abstract
Background: Despite intensive efforts devoted to collecting human polymorphism data, little is known about the role of gene flow in the ancestry of human populations. This is partly because most analyses have applied one of two simple models of population structure, the island model or the splitting model, which make unrealistic biological assumptions. Results: Here, we analyze 98 kb of DNA sequence from 20 independently evolving intergenic regions on the X chromosome in a sample of 90 humans from six globally diverse populations. We employ an isolation-with-migration (IM) model, which assumes that populations split and subsequently exchange migrants, to independently estimate effective population sizes and migration rates. While the maximum effective size of modern humans is estimated at ~10,000, individual populations vary substantially in size, with African populations tending to be larger (2,300–9,000) than non-African populations (300–3,300). We estimate mean rates of bidirectional gene flow at 4.8 × 10^-4/generation. Bidirectional migration rates are ~5-fold higher among non-African populations (1.5 × 10^-3) than among African populations (2.7 × 10^-4). Interestingly, because effective sizes and migration rates are inversely related in African and non-African populations, population migration rates are similar within Africa and Eurasia (e.g., global mean Nm = 2.4). Conclusion: We conclude that gene flow has played an important role in structuring global human populations and that migration rates should be incorporated as critical parameters in models of human demography.
Affiliation(s)
- Murray P Cox
- ARL Division of Biotechnology, University of Arizona, AZ 85721, USA.
|
15
|
Stacey A, Sheffield NC, Crandall KA. Calculating expected DNA remnants from ancient founding events in human population genetics. BMC Genet 2008; 9:66. [PMID: 18928554 PMCID: PMC2588638 DOI: 10.1186/1471-2156-9-66] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2008] [Accepted: 10/17/2008] [Indexed: 11/10/2022] Open
Abstract
Background: Recent advancements in sequencing and computational technologies have led to rapid generation and analysis of high-quality genetic data. Such genetic data have achieved wide acceptance in studies of historic human population origins and admixture. However, in studies relating to small, recent admixture events, genetic factors such as historic population sizes, genetic drift, and mutation can have pronounced effects on data reliability and utility. To address these issues we conducted genetic simulations targeting influential genetic parameters in admixed populations. Results: We performed a series of simulations, adjusting variable values to assess the effect of these genetic parameters on current human population studies and what these studies infer about past population structure. Final mean allele frequencies varied from 0.0005 to over 0.50, depending on the parameters. Conclusion: The results of the simulations illustrate that, while genetic data may be sensitive and powerful in large genetic studies, caution must be used when applying genetic information to small, recent admixture events. For some parameter sets, genetic data will not be adequate to detect historic admixture. In such cases, studies should consider anthropologic, archeological, and linguistic data where possible.
Affiliation(s)
- Andrew Stacey
- Department of Statistics, Brigham Young University, Provo, UT 84602, USA.
|
16
|
Xing J, Witherspoon DJ, Ray DA, Batzer MA, Jorde LB. Mobile DNA elements in primate and human evolution. Am J Phys Anthropol 2008; Suppl 45:2-19. [PMID: 18046749 DOI: 10.1002/ajpa.20722] [Citation(s) in RCA: 106] [Impact Index Per Article: 6.2] [Indexed: 01/24/2023]
Abstract
Roughly 50% of the primate genome consists of mobile, repetitive DNA sequences such as Alu and LINE1 elements. The causes and evolutionary consequences of mobile element insertion, which have received considerable attention during the past decade, are reviewed in this article. Because of their unique mutational mechanisms, these elements are highly useful for answering phylogenetic questions. We demonstrate how they have been used to help resolve a number of questions in primate phylogeny, including the human-chimpanzee-gorilla trichotomy and New World primate phylogeny. Alu and LINE1 element insertion polymorphisms have also been analyzed in human populations to test hypotheses about human evolution and population affinities and to address forensic issues. Finally, these elements have had impacts on the genome itself. We review how they have influenced fundamental ongoing processes like nonhomologous recombination, genomic deletion, and X chromosome inactivation.
Affiliation(s)
- Jinchuan Xing
- Department of Human Genetics, University of Utah Health Sciences Center, Salt Lake City, UT 84112, USA
|
17
|
Shedlock AM, Takahashi K, Okada N. SINEs of speciation: tracking lineages with retroposons. Trends Ecol Evol 2007; 19:545-53. [PMID: 16701320 DOI: 10.1016/j.tree.2004.08.002] [Citation(s) in RCA: 104] [Impact Index Per Article: 5.8] [Indexed: 12/27/2022]
Abstract
The value of short interspersed elements (SINEs) for diagnosing common ancestry is being expanded to examine the differential sorting of lineages through the course of speciation events. Because most SINEs are neutral markers of identical descent, are not precisely excised from the genome and have a known ancestral condition, they are advantageous for reconciling gene trees and species trees with minimal phylogenetic error. A population perspective on SINE evolution combined with coalescence theory provides a context for investigating the phenomenon of ancestral polymorphism and its role in producing incongruent SINE insertion patterns among multiple loci. Studies of human Alu repeats demonstrate the value of young polymorphic SINEs for assessing human genomic diversity and tracking ancient demographics of human populations, whereas incongruent insertion patterns revealed by older fixed SINE loci, such as those in African cichlid fishes, contain information that might help identify ancient radiations that are otherwise obscured by accumulated mutations in sequence data. Here, we review the utility of retroposons for inferring common ancestry, discuss limits to the method, and clarify confusion by providing examples from the literature that illustrate how discordant multi-locus insertion patterns of retroelements can indicate lineage-sorting events that should not be misinterpreted as phylogenetic noise.
Affiliation(s)
- Andrew M Shedlock
- Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
|
18
|
Tenesa A, Navarro P, Hayes BJ, Duffy DL, Clarke GM, Goddard ME, Visscher PM. Recent human effective population size estimated from linkage disequilibrium. Genome Res 2007; 17:520-6. [PMID: 17351134 PMCID: PMC1832099 DOI: 10.1101/gr.6023607] [Citation(s) in RCA: 304] [Impact Index Per Article: 16.9] [Indexed: 11/24/2022]
Abstract
Effective population size (Nₑ) determines the amount of genetic variation, genetic drift, and linkage disequilibrium (LD) in populations. Here, we present the first genome-wide estimates of human effective population size from LD data. Chromosome-specific effective population size was estimated for all autosomes and the X chromosome from estimated LD between SNP pairs <100 kb apart. We account for variation in recombination rate by using coalescent-based estimates of fine-scale recombination rate from one sample and correlating these with LD in an independent sample. Phase I of the HapMap project produced between 18 and 22 million SNP pairs in samples from four populations: Yoruba from Ibadan (YRI), Nigeria; Japanese from Tokyo (JPT); Han Chinese from Beijing (HCB); and residents from Utah with ancestry from northern and western Europe (CEU). For CEU, JPT, and HCB, the estimate of effective population size, adjusted for SNP ascertainment bias, was approximately 3100, whereas the estimate for the YRI was approximately 7500, consistent with the out-of-Africa theory of ancestral human population expansion and concurrent bottlenecks. We show that the decay in LD over distance between SNPs is consistent with recent population growth. The estimates of Nₑ are lower than previously published estimates based on heterozygosity, possibly because they represent one or more bottlenecks in human population size that occurred approximately 10,000 to 200,000 years ago.
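To illustrate how LD can carry information about effective size, the sketch below uses the classical approximation E[r²] ≈ 1/(1 + 4Nc), where N is the effective size and c is the recombination distance in Morgans. This model is an assumption chosen for illustration: the abstract does not state the exact model, the ascertainment-bias adjustment, or any sample-size correction used in the study, and the function names and input values here are hypothetical.

```python
def expected_r2(ne: float, c_morgans: float) -> float:
    """Expected LD (r^2) between two loci under the approximation 1 / (1 + 4*Ne*c)."""
    return 1.0 / (1.0 + 4.0 * ne * c_morgans)


def estimate_ne(mean_r2: float, c_morgans: float) -> float:
    """Invert the same approximation to recover Ne from a mean r^2 at distance c."""
    if not 0.0 < mean_r2 < 1.0:
        raise ValueError("mean_r2 must lie strictly between 0 and 1")
    if c_morgans <= 0.0:
        raise ValueError("c_morgans must be positive")
    return (1.0 / mean_r2 - 1.0) / (4.0 * c_morgans)


if __name__ == "__main__":
    c = 0.001  # hypothetical recombination distance: 0.1 cM
    for ne in (3100, 7500):  # effective sizes quoted in the abstract
        r2 = expected_r2(ne, c)
        print(ne, round(r2, 4), round(estimate_ne(r2, c)))
```

Under this approximation a larger effective size implies lower expected LD at a given recombination distance, which is the qualitative direction of the YRI versus CEU, JPT, and HCB contrast described above.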
Affiliation(s)
- Albert Tenesa
- Colon Cancer Genetics Group, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, United Kingdom
- MRC Human Genetics Unit, Western General Hospital, Edinburgh EH4 2XU, United Kingdom
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom
- Pau Navarro
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom
- Ben J. Hayes
- Victorian Institute of Animal Science, DPI, Attwood 3049, Australia
- David L. Duffy
- Queensland Institute of Medical Research, Royal Brisbane Hospital, Brisbane 4006, Australia
- Geraldine M. Clarke
- The Wellcome Trust Centre for Human Genetics, The University of Oxford, Oxford OX3 7BN, United Kingdom
- Mike E. Goddard
- Victorian Institute of Animal Science, DPI, Attwood 3049, Australia
- Institute of Land and Food Resources, University of Melbourne, Parkville 3010, Australia
- Peter M. Visscher
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom
- Queensland Institute of Medical Research, Royal Brisbane Hospital, Brisbane 4006, Australia
- Corresponding author. Fax +61-7-3362-0101
|
19
|
Abstract
Mobile elements represent a unique and under-utilized set of tools for molecular ecologists. They are essentially homoplasy-free characters with the ability to be genotyped in a simple and efficient manner. Interpretation of the data generated using mobile elements can be simple compared to other genetic markers. They exist in a wide variety of taxa and are useful over a wide selection of temporal ranges within those taxa. Furthermore, their mode of evolution instills them with another advantage over other types of multilocus genotype data: the ability to determine loci applicable to a range of time spans in the history of a taxon. In this review, I discuss the application of mobile element markers, especially short interspersed elements (SINEs), to phylogenetic and population data, with an emphasis on potential applications to molecular ecology.
Affiliation(s)
- David A Ray
- Department of Biology, West Virginia University, 53 Campus Dr, Morgantown, WV 26506, USA.
|
20
|
Witherspoon DJ, Marchani EE, Watkins WS, Ostler CT, Wooding SP, Anders BA, Fowlkes JD, Boissinot S, Furano AV, Ray DA, Rogers AR, Batzer MA, Jorde LB. Human population genetic structure and diversity inferred from polymorphic L1(LINE-1) and Alu insertions. Hum Hered 2006; 62:30-46. [PMID: 17003565 DOI: 10.1159/000095851] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.5] [Received: 04/18/2006] [Accepted: 07/25/2006] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND/AIMS: The L1 retrotransposable element family is the most successful self-replicating genomic parasite of the human genome. L1 elements drive replication of Alu elements, and both have had far-reaching impacts on the human genome. We use L1 and Alu insertion polymorphisms to analyze human population structure. METHODS: We genotyped 75 recent, polymorphic L1 insertions in 317 individuals from 21 populations in sub-Saharan Africa, East Asia, Europe and the Indian subcontinent. This is the first sample of L1 loci large enough to support detailed population genetic inference. We analyzed these data in parallel with a set of 100 polymorphic Alu insertion loci previously genotyped in the same individuals. RESULTS AND CONCLUSION: The data sets yield congruent results that support the recent African origin model of human ancestry. A genetic clustering algorithm detects clusters of individuals corresponding to continental regions. The number of loci sampled is critical: with fewer than 50 typical loci, structure cannot be reliably discerned in these populations. The inclusion of geographically intermediate populations (from India) reduces the distinctness of clustering. Our results indicate that human genetic variation is neither perfectly correlated with geographic distance (purely clinal) nor independent of distance (purely clustered), but a combination of both: stepped clinal.
Affiliation(s)
- D J Witherspoon
- Department of Human Genetics, University of Utah Health Sciences Center, Salt Lake City, UT 84112-5330, USA.
|
21
|
Bataillon T, Mailund T, Thorlacius S, Steingrimsson E, Rafnar T, Halldorsson MM, Calian V, Schierup MH. The effective size of the Icelandic population and the prospects for LD mapping: inference from unphased microsatellite markers. Eur J Hum Genet 2006; 14:1044-53. [PMID: 16736029 DOI: 10.1038/sj.ejhg.5201669] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/08/2022] Open
Abstract
Characterizing the extent of linkage disequilibrium (LD) in the genome is a prerequisite for association mapping studies. Patterns of LD also contain information about the past demography of populations. In this study, we focus on the Icelandic population, where LD was investigated in 12 regions of approximately 15 cM using regularly spaced microsatellite loci displaying high heterozygosity. A total of 1753 individuals were genotyped for 179 markers. LD was estimated using a composite disequilibrium measure based on unphased data. LD decreases with distance in all 12 regions, and more LD than expected by chance can be detected over approximately 4 cM in our sample. Differences in the patterns of decrease of LD with distance among genomic regions were mostly due to two regions exhibiting, respectively, higher and lower proportions of pairs in LD than average within the first 4 cM. We pooled data from all regions except these two and summarized patterns of LD by computing the proportion of pairs of loci exhibiting significant LD (at the 5% level) as a function of distance. We compared observed patterns of LD with simulated data sets obtained under scenarios with varying demography and intensity of recombination. We show that unphased data allow inferences to be made on scaled recombination rates from patterns of LD. Patterns of LD in Iceland suggest a genome-wide scaled recombination rate of rho* = 200 (130-330) per cM (or an effective size of roughly 5000), in the low range of estimates recently reported in three populations from the HapMap project.
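The correspondence between the scaled recombination rate and the effective size quoted above can be reproduced with a short calculation. This sketch assumes the standard diploid definition rho = 4Nr, with N the effective size, and that 1 cM corresponds to r = 0.01 crossovers per generation. The function name is hypothetical, and the abstract does not say whether the authors converted their interval in exactly this way.

```python
CROSSOVERS_PER_CM = 0.01  # assumption: 1 cM = 0.01 expected crossovers per generation


def effective_size_from_rho(rho_per_cm: float) -> float:
    """Solve rho = 4 * Ne * r for Ne, with r taken as the rate per cM."""
    return rho_per_cm / (4.0 * CROSSOVERS_PER_CM)


# Point estimate and range of rho* per cM quoted in the abstract.
point = effective_size_from_rho(200)
interval = (effective_size_from_rho(130), effective_size_from_rho(330))

if __name__ == "__main__":
    print("Ne from point estimate:", round(point))
    print("Ne from reported range:", tuple(round(x) for x in interval))
```

With rho* = 200 per cM this gives an effective size of about 5000, matching the abstract; applying the same conversion to the reported range of 130 to 330 gives roughly 3250 to 8250.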
Affiliation(s)
- Thomas Bataillon
- Bioinformatics Research Center, University of Aarhus, Høegh-Guldbergs Gade 10, DK-8000 Aarhus C, Denmark.
|
22
|
Hedges DJ, Cordaux R, Xing J, Witherspoon DJ, Rogers AR, Jorde LB, Batzer MA. Modeling the amplification dynamics of human Alu retrotransposons. PLoS Comput Biol 2005; 1:e44. [PMID: 16201008 PMCID: PMC1239904 DOI: 10.1371/journal.pcbi.0010044] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 04/15/2005] [Accepted: 08/24/2005] [Indexed: 11/19/2022] Open
Abstract
Retrotransposons have had a considerable impact on the overall architecture of the human genome. Currently, there are three lineages of retrotransposons (Alu, L1, and SVA) that are believed to be actively replicating in humans. While estimates of their copy number, sequence diversity, and levels of insertion polymorphism can readily be obtained from existing genomic sequence data and population sampling, a detailed understanding of the temporal pattern of retrotransposon amplification remains elusive. Here we pose the question of whether, using genomic sequence and population frequency data from extant taxa, one can adequately reconstruct historical amplification patterns. To this end, we developed a computer simulation that incorporates several known aspects of primate Alu retrotransposon biology and accommodates sampling effects resulting from the methods by which mobile elements are typically discovered and characterized. By modeling a number of amplification scenarios and comparing simulation-generated expectations to empirical data gathered from existing Alu subfamilies, we were able to statistically reject a number of amplification scenarios for individual subfamilies, including that of a rapid expansion or explosion of Alu amplification at the time of human–chimpanzee divergence.
Nearly 50% of the human genome is composed of mobile elements. While much of this sequence consists of inactive “fossil” elements that are no longer actively moving or generating new copies, three families are currently proliferating in human genomes. Among these, the Alu lineage has reached a copy number of over 1 million and alone accounts for approximately 10% of the genome. While considerable evidence has been gathered concerning the underlying biological mechanisms of Alu mobilization and proliferation, a detailed understanding of Alu amplification history is currently lacking. Researchers are aware, for example, that several thousand Alu elements have inserted within the human genome since the divergence of humans and chimpanzees, but how those insertions were distributed over this ~6-million-year time period is currently unknown. In this work, the authors introduce a simulation framework that seeks to incorporate both sequence diversity and empirically gathered population data from human Alu elements, in order to provide a better understanding of the last several million years of human Alu evolution. The results suggest that a rapid explosion of Alu amplification at the time of the human–chimpanzee divergence is unlikely. Therefore, it is improbable that an increase in Alu retrotransposition activity was involved in the speciation of humans and chimpanzees.
Affiliation(s)
- Dale J Hedges
- Department of Biological Sciences, Biological Computation and Visualization Center, Center for Bio-Modular Microsystems, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Richard Cordaux
- Department of Biological Sciences, Biological Computation and Visualization Center, Center for Bio-Modular Microsystems, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Jinchuan Xing
- Department of Biological Sciences, Biological Computation and Visualization Center, Center for Bio-Modular Microsystems, Louisiana State University, Baton Rouge, Louisiana, United States of America
- David J Witherspoon
- Department of Human Genetics, University of Utah Health Sciences Center, Salt Lake City, Utah, United States of America
- Alan R Rogers
- Department of Anthropology, University of Utah, Salt Lake City, Utah, United States of America
- Lynn B Jorde
- Department of Human Genetics, University of Utah Health Sciences Center, Salt Lake City, Utah, United States of America
- Mark A Batzer
- Department of Biological Sciences, Biological Computation and Visualization Center, Center for Bio-Modular Microsystems, Louisiana State University, Baton Rouge, Louisiana, United States of America
- * To whom correspondence should be addressed.
|
23
|
Kulski JK, Dunn DS. Polymorphic Alu insertions within the Major Histocompatibility Complex class I genomic region: a brief review. Cytogenet Genome Res 2005; 110:193-202. [PMID: 16093672 DOI: 10.1159/000084952] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.4] [Received: 09/01/2003] [Accepted: 10/21/2003] [Indexed: 11/19/2022] Open
Abstract
Most polymorphic Alu insertions (POALINs) belong to a subgroup of the Alu multicopy retrotransposon family of short interspersed nucleotide elements (SINEs) that are categorized as AluYb8 and AluYa5. The number of AluYb8/AluYa5 members (approximately 4,492 copies) is significantly less than the approximately one million fixed Alu copies per human genome. We have studied the presence of POALINs within the Major Histocompatibility Complex (MHC) class I region on the short arm of chromosome 6 (6p21.3) because this region has a high gene density, many genes with immune system functions, large sequence variations and diversity, duplications and redundancy, and a strong association with more than 100 different diseases. Since little is known about POALINs within the MHC genomic region, we undertook to identify some of the members of the AluYb8/AluYa5 subfamily and to study their frequency of distribution and genetic characteristics in different populations. As a result of our comparative genomic analyses, we identified the insertion sites for five POALINs distributed within the MHC class I region. This brief review outlines the locations of the insertions and sequence features of the five MHC POALINs, their single site and haplotype frequencies in different geographic populations, and their association with different HLA class I genes and disease. We show that the MHC POALINs have a potential value as lineage and linkage markers for the study of human population genetics, disease associations, genomic diversity and evolution.
Affiliation(s)
- J K Kulski
- Centre for Bioinformatics and Biological Computing, School of Information Technology, Murdoch University, Murdoch, Western Australia.
|
24
|
Abstract
The large single nucleotide polymorphism (SNP) typing projects have provided an invaluable data resource for human population geneticists. Almost all of the available SNP loci, however, have been identified through a SNP discovery protocol that will influence the allelic distributions in the sampled loci. Standard methods for population genetic analysis based on the available SNP data will, therefore, be biased. This paper discusses the effect of this ascertainment bias on allelic distributions and on methods for quantifying linkage disequilibrium and estimating demographic parameters. Several recently developed methods for correcting for the ascertainment bias will also be discussed.
Affiliation(s)
- Rasmus Nielsen
- Department of Biological Statistics and Computational Biology, Cornell University, 439 Warren Hall, Ithaca, NY 14853-7801, USA.
|
25
|
Bouillé M, Bousquet J. Trans-species shared polymorphisms at orthologous nuclear gene loci among distant species in the conifer Picea (Pinaceae): implications for the long-term maintenance of genetic diversity in trees. Am J Bot 2005; 92:63-73. [PMID: 21652385 DOI: 10.3732/ajb.92.1.63] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.8] [Indexed: 05/30/2023]
Abstract
For each of three nuclear gene loci, intraspecific as well as trans-specific shared polymorphisms were detected in DNA among three distantly related species in the genus Picea. Few fixed interspecific polymorphisms were observed. Allele genealogies did not match species phylogenies, and species lineages were not reciprocally monophyletic. Based on molecular clocks and morphological evidence from the fossil record, the divergence time between species was estimated at 13-20 million years (my), and a mutation rate of 2.23 × 10⁻¹⁰ to 3.42 × 10⁻¹⁰ per site per year was estimated. Large historical population sizes in excess of 100 000 were inferred, which would have delayed the fixation of polymorphisms. These numbers translated into allele coalescence times on the order of 10 to 18 my, which implies the sharing of polymorphisms since common ancestry. These results suggest that trans-species shared polymorphisms might be frequent at plant nuclear gene loci, leading to high allelic diversity. Such a trend is more likely in trees and plants characterized by ecological and life-history determinants favoring large population sizes, such as an outcrossing mating system, wind pollination, and a dominant position in the ecosystem. These polymorphisms also call for caution in estimating congeneric species phylogenies from nuclear gene sequences in such plant groups.
Affiliation(s)
- Marie Bouillé
- Chaire de recherche du Canada en génomique forestière et environnementale and Centre de recherche en biologie forestière, Université Laval, Sainte-Foy, Québec, Canada G1K 7P4
|
26
|
Marth GT, Czabarka E, Murvai J, Sherry ST. The allele frequency spectrum in genome-wide human variation data reveals signals of differential demographic history in three large world populations. Genetics 2004; 166:351-72. [PMID: 15020430 PMCID: PMC1470693 DOI: 10.1534/genetics.166.1.351] [Citation(s) in RCA: 236] [Impact Index Per Article: 11.2] [Indexed: 11/18/2022] Open
Abstract
We have studied a genome-wide set of single-nucleotide polymorphism (SNP) allele frequency measures for African-American, East Asian, and European-American samples. For this analysis we derived a simple, closed mathematical formulation for the spectrum of expected allele frequencies when the sampled populations have experienced nonstationary demographic histories. The direct calculation generates the spectrum orders of magnitude faster than coalescent simulations do and allows us to generate spectra for a large number of alternative histories on a multidimensional parameter grid. Model-fitting experiments using this grid reveal significant population-specific differences among the demographic histories that best describe the observed allele frequency spectra. European and Asian spectra show a bottleneck-shaped history: a reduction of effective population size in the past followed by a recent phase of size recovery. In contrast, the African-American spectrum shows a history of moderate but uninterrupted population expansion. These differences are expected to have profound consequences for the design of medical association studies. The analytical methods developed for this study, i.e., a closed mathematical formulation for the allele frequency spectrum, correcting the ascertainment bias introduced by shallow SNP sampling, and dealing with variable sample sizes provide a general framework for the analysis of public variation data.
Affiliation(s)
- Gabor T Marth
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
|
27
|
Neafsey DE, Blumenstiel JP, Hartl DL. Different regulatory mechanisms underlie similar transposable element profiles in pufferfish and fruitflies. Mol Biol Evol 2004; 21:2310-8. [PMID: 15342795 DOI: 10.1093/molbev/msh243] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.3] [Indexed: 01/15/2023] Open
Abstract
Comparative analysis of recently sequenced eukaryotic genomes has uncovered extensive variation in transposable element (TE) abundance, diversity, and distribution. The TE profile in the sequenced pufferfish genomes is more similar to that of Drosophila melanogaster than to human or mouse, in that pufferfish TEs exhibit low overall abundance, high family diversity, and localization in the heterochromatin. It has been suggested that selection against the deleterious effects of ectopic recombination between TEs has structured the TE profile in Drosophila and pufferfish but not in humans. We test this hypothesis by measuring the sample frequency of 48 euchromatic TE insertions in the genome of the green spotted pufferfish (Tetraodon nigroviridis). We estimate the strength of selection acting on recent insertions by analyzing the site frequency spectrum using a maximum-likelihood approach. We show that in contrast to Drosophila, euchromatic TE insertions in Tetraodon are selectively neutral and that the low copy number and compartmentalized distribution of TEs in the Tetraodon genome must be caused by regulation by means other than purifying selection acting on recent insertions. Inference of regulatory processes governing TE profiles should take into account factors such as effective population size, incidence of inbreeding/outcrossing, and other species-specific traits.
Affiliation(s)
- Daniel E Neafsey
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA.
|
28
|
Otieno AC, Carter AB, Hedges DJ, Walker JA, Ray DA, Garber RK, Anders BA, Stoilova N, Laborde ME, Fowlkes JD, Huang CH, Perodeau B, Batzer MA. Analysis of the Human Alu Ya-lineage. J Mol Biol 2004; 342:109-18. [PMID: 15313610 DOI: 10.1016/j.jmb.2004.07.016] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.9] [Received: 06/10/2004] [Revised: 07/08/2004] [Accepted: 07/12/2004] [Indexed: 11/22/2022]
Abstract
The Alu Ya-lineage is a group of related, short interspersed elements (SINEs) found in primates. This lineage includes subfamilies Ya1-Ya5, Ya5a2 and others. Some of these subfamilies are still actively mobilizing in the human genome. We have analyzed 2482 elements that reside in the human genome draft sequence and focused our analyses on the 2318 human autosomal Ya Alu elements. A total of 1470 autosomal loci were subjected to polymerase chain reaction (PCR)-based assays that allow analysis of individual Ya-lineage Alu elements. About 22% (313/1452) of the Ya-lineage Alu elements were polymorphic for the insertion presence on human autosomes. About 0.3% (5/1452) of the Ya-lineage loci analyzed displayed insertions in orthologous loci in non-human primate genomes. DNA sequence analysis of the orthologous inserts showed that the orthologous loci contained older pre-existing Y, Sc or Sq Alu subfamily elements that were the result of parallel forward insertions or involved in gene conversion events in the human lineage. This study is the largest analysis of a group of "young", evolutionarily related human subfamilies. The size, evolutionary age and variable allele insertion frequencies of several of these subfamilies make members of the Ya-lineage useful tools for human population studies and primate phylogenetics.
Affiliation(s)
- Anthony C Otieno
- Department of Biological Sciences, Biological Computation and Visualization Center, Center for Bio-Modular Microsystems, Louisiana State University, 202 Life Sciences Building, Baton Rouge 70803, USA
|
29
|
Cotrim NH, Auricchio MTBM, Vicente JP, Otto PA, Mingroni-Netto RC. Polymorphic Alu insertions in six Brazilian African-derived populations. Am J Hum Biol 2004; 16:264-77. [PMID: 15101052 DOI: 10.1002/ajhb.20024] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 11/11/2022] Open
Abstract
At least 25 African-derived populations (quilombo remnants) are believed to exist in the Ribeira River Valley, located in the southern part of São Paulo State, Brazil. We studied four Alu polymorphic loci (APO, ACE, TPA25, and FXIIIB) in individuals belonging to six quilombo remnants in addition to individuals sampled from the city of São Paulo. The allelic frequencies observed in the quilombo remnants were similar to those previously observed in African-derived populations from Central and North America. Genetic variability indexes (Fst and Gst values) in our quilombos were higher than the reported values for the majority of other populations analyzed for the same kind of markers, but lower than the variability usually observed in Amerindian groups. The observed high degree of genetic differentiation may be due to genetic drift, especially the founder effect. Our results suggest that these populations behave genetically as semi-isolates. The degree of genetic variability within populations was larger than among them, a finding described in other studies. In the neighbor-joining tree, some of the Brazilian quilombos clustered with the African and African-derived populations (São Pedro and Galvão), others with the Europeans (Pilões, Maria Rosa, and Abobral). Pedro Cubas was placed in an isolated branch. Principal component analysis was also performed and confirmed the trends observed in the neighbor-joining tree. Overall, the quilombos showed a higher degree of gene flow than average when compared to other worldwide populations, but similar to other African-derived populations.
Affiliation(s)
- Nelson Henderson Cotrim
- Centro de Estudos do Genoma Humano, Departamento de Biologia, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
|
30
|
Vishwanathan H, Deepa E, Cordaux R, Stoneking M, Usha Rani MV, Majumder PP. Genetic structure and affinities among tribal populations of southern India: a study of 24 autosomal DNA markers. Ann Hum Genet 2004; 68:128-38. [PMID: 15008792 DOI: 10.1046/j.1529-8817.2003.00083.x] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.4] [Indexed: 01/08/2023]
Abstract
We describe the genetic structure and affinities of five Dravidian-speaking tribal populations inhabiting the Nilgiri hills of Tamil Nadu, in south India, using 24 autosomal DNA markers. Our goals were: (i) to examine what evolutionary forces have most significantly impacted south Indian tribal genetic variation, and (ii) to test whether the phenotypic similarities of some south Indian tribal groups to Africans represent a signature of close relationship to Africans or are due to convergence. All loci were polymorphic and average heterozygosities were substantial (range: 0.347-0.423). Genetic differentiation was high (Gst = 6.7%) and genetic distances were not significantly correlated with geographic distances. Genetic drift therefore probably played a significant role in shaping the patterns of genetic variation observed in southern Indian tribal populations. Otherwise, analyses of population relationships showed that Indian populations are closely related to one another, regardless of phenotypic characteristics, and do not show particular affinities to Africans. We conclude that the phenotypic similarities of some Indian groups to Africans do not reflect a close relationship between these groups, but are better explained by convergence.
Affiliation(s)
- H Vishwanathan
- Department of Environmental Sciences, Bharathiar University, Coimbatore - 641 046, India.
|
31
|
Tishkoff SA, Verrelli BC. Patterns of human genetic diversity: implications for human evolutionary history and disease. Annu Rev Genomics Hum Genet 2003; 4:293-340. [PMID: 14527305 DOI: 10.1146/annurev.genom.4.070802.110226] [Citation(s) in RCA: 239] [Impact Index Per Article: 10.9] [Indexed: 11/09/2022]
Abstract
Since the completion of the human genome sequencing project, the discovery and characterization of human genetic variation is a principal focus for future research. Comparative studies across ethnically diverse human populations and across human and nonhuman primate species are important for reconstructing human evolutionary history and for understanding the genetic basis of human disease. In this review, we summarize data on patterns of human genetic diversity and the evolutionary forces (mutation, genetic drift, migration, and selection) that have shaped these patterns of variation across both human populations and the genome. African population samples typically have higher levels of genetic diversity, a complex population substructure, and low levels of linkage disequilibrium (LD) relative to non-African populations. We discuss these differences and their implications for mapping disease genes and for understanding how population and genomic diversity have been important in the evolution, differentiation, and adaptation of humans.
Affiliation(s)
- Sarah A Tishkoff
- Department of Biology, University of Maryland, College Park, Maryland 20742, USA.
|
32
|
Polanski A, Kimmel M. New Explicit Expressions for Relative Frequencies of Single-Nucleotide Polymorphisms With Application to Statistical Inference on Population Growth. Genetics 2003; 165:427-36. [PMID: 14504247 PMCID: PMC1462751 DOI: 10.1093/genetics/165.1.427] [Citation(s) in RCA: 83] [Impact Index Per Article: 3.8] [Indexed: 11/12/2022] Open
Abstract
We present new methodology for calculating sampling distributions of single-nucleotide polymorphism (SNP) frequencies in populations with time-varying size. Our approach is based on deriving analytical expressions for frequencies of SNPs. Analytical expressions allow for computations that are faster and more accurate than Monte Carlo simulations. In contrast to other articles showing analytical formulas for frequencies of SNPs, we derive expressions that contain coefficients that do not explode when the genealogy size increases. We also provide analytical formulas to describe the way in which the ascertainment procedure modifies SNP distributions. Using our methods, we study the power to test the hypothesis of exponential population expansion vs. the hypothesis of evolution with constant population size. We also analyze some of the available SNP data and we compare our results of demographic parameters estimation to those obtained in previous studies in population genetics. The analyzed data seem consistent with the hypothesis of past population growth of modern humans. The analysis of the data also shows a very strong sensitivity of estimated demographic parameters to changes of the model of the ascertainment procedure.
Affiliation(s)
- A Polanski
- Department of Statistics, Rice University, Houston, Texas 77005, USA
|
33
|
Marth G, Schuler G, Yeh R, Davenport R, Agarwala R, Church D, Wheelan S, Baker J, Ward M, Kholodov M, Phan L, Czabarka E, Murvai J, Cutler D, Wooding S, Rogers A, Chakravarti A, Harpending HC, Kwok PY, Sherry ST. Sequence variations in the public human genome data reflect a bottlenecked population history. Proc Natl Acad Sci U S A 2003; 100:376-81. [PMID: 12502794 PMCID: PMC140982 DOI: 10.1073/pnas.222673099] [Citation(s) in RCA: 92] [Impact Index Per Article: 4.2] [Indexed: 11/18/2022] Open
Abstract
Single-nucleotide polymorphisms (SNPs) constitute the great majority of variations in the human genome, and as heritable variable landmarks they are useful markers for disease mapping and resolving population structure. Redundant coverage in overlaps of large-insert genomic clones, sequenced as part of the Human Genome Project, comprises a quarter of the genome, and it is representative in terms of base compositional and functional sequence features. We mined these regions to produce 500,000 high-confidence SNP candidates as a uniform resource for describing nucleotide diversity and its regional variation within the genome. Distributions of marker density observed at different overlap length scales under a model of recombination and population size change show that the history of the population represented by the public genome sequence is one of collapse followed by a recent phase of mild size recovery. The inferred times of collapse and recovery are Upper Paleolithic, in agreement with archaeological evidence of the initial modern human colonization of Europe.
Affiliation(s)
- Gabor Marth
- National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD 20894, USA.
|
34
|
Wooding SP, Watkins WS, Bamshad MJ, Dunn DM, Weiss RB, Jorde LB. DNA sequence variation in a 3.7-kb noncoding sequence 5' of the CYP1A2 gene: implications for human population history and natural selection. Am J Hum Genet 2002; 71:528-42. [PMID: 12181774 PMCID: PMC379190 DOI: 10.1086/342260] [Citation(s) in RCA: 58] [Impact Index Per Article: 2.5] [Received: 06/04/2002] [Accepted: 06/10/2002] [Indexed: 11/04/2022] Open
Abstract
CYP1A2 is a cytochrome P450 gene that is involved in human physiological responses to a variety of drugs and toxins. To investigate the role of population history and natural selection in shaping genetic diversity in CYP1A2, we sequenced a 3.7-kb region 5' from CYP1A2 in a diverse collection of 113 individuals from three major continental regions of the Old World (Africa, Asia, and Europe). We also examined sequences in the 90-member National Institutes of Health DNA Polymorphism Discovery Resource (PDR). Eighteen single-nucleotide polymorphisms (SNPs) were found. Most of the high-frequency SNPs found in the Old World sample were also found in the PDR sample. However, six SNPs were detected in the Old World sample but not in the PDR sample, and two SNPs found in the PDR sample were not found in the Old World sample. Most pairs of SNPs were in complete linkage disequilibrium with one another, and there was no indication of a decline of disequilibrium with physical distance in this region. The average +/- SD nucleotide diversity in the Old World sample was 0.00043+/-0.00026. The African population had the highest level of nucleotide diversity and the lowest level of linkage disequilibrium. Two distinct haplotype clusters with broadly overlapping geographical distributions were present. Of the 17 haplotypes found in the Old World sample, 12 were found in the African sample, 8 were found in Indians, 5 were found in non-Indian Asians, and 5 were found in Europeans. Haplotypes found outside Africa were mostly a subset of those found within Africa. These patterns are all consistent with an African origin of modern humans. Seven SNPs were singletons, and the site-frequency spectrum showed a significant departure from neutral expectations, suggesting population expansion and/or natural selection. 
Comparison with outgroup species showed that four derived SNPs have achieved high (>0.90) frequencies in human populations, a trend consistent with the action of positive natural selection. These patterns have a number of implications for disease-association studies in CYP1A2 and other genes.
Affiliation(s)
- S. P. Wooding
- Departments of Human Genetics and Pediatrics, University of Utah, Salt Lake City
- W. S. Watkins
- Departments of Human Genetics and Pediatrics, University of Utah, Salt Lake City
- M. J. Bamshad
- Departments of Human Genetics and Pediatrics, University of Utah, Salt Lake City
- D. M. Dunn
- Departments of Human Genetics and Pediatrics, University of Utah, Salt Lake City
- R. B. Weiss
- Departments of Human Genetics and Pediatrics, University of Utah, Salt Lake City
- L. B. Jorde
- Departments of Human Genetics and Pediatrics, University of Utah, Salt Lake City
|
35
|
Wooding S, Rogers A. The matrix coalescent and an application to human single-nucleotide polymorphisms. Genetics 2002; 161:1641-50. [PMID: 12196407 PMCID: PMC1462217 DOI: 10.1093/genetics/161.4.1641] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.5] [Indexed: 11/13/2022] Open
Abstract
The "matrix coalescent" is a reformulation of the familiar coalescent process of population genetics. It ignores the topology of the gene tree and treats the coalescent as a Markov process describing the decay in the number of ancestors of a sample of genes as one proceeds backward in time. The matrix formulation of this process is convenient when the population changes in size, because such changes affect only the eigenvalues of the transition matrix, not the eigenvectors. The model is used here to calculate the expectation of the site frequency spectrum under various assumptions about population history. To illustrate how this method can be used with data, we then use it in conjunction with a set of SNPs to test hypotheses about the history of human population size.
Affiliation(s)
- Stephen Wooding
- Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah 84112-5330, USA.
|
36
|
Tishkoff SA, Williams SM. Genetic analysis of African populations: human evolution and complex disease. Nat Rev Genet 2002; 3:611-21. [PMID: 12154384 DOI: 10.1038/nrg865] [Citation(s) in RCA: 229] [Impact Index Per Article: 10.0] [Indexed: 01/22/2023]
Affiliation(s)
- Sarah A Tishkoff
- Department of Biology, University of Maryland, College Park, Maryland 20742, USA.
|
37
|
Abstract
For over 30 years, a debate has raged among anthropologists about the origins of anatomically modern humans. At first the debate centered on fossil evidence, but in the past 10-15 years population geneticists have entered the fray. One model, the multiregional evolution model, posits a gradual transition from Homo erectus to anatomically modern humans throughout the Old World. In contrast, the recent African origin model hypothesizes that anatomically modern humans arose from a small, isolated population in Africa, spread out of Africa, and replaced indigenous H. erectus populations in Eurasia and Australasia. A primary objection, from a population genetics perspective, to the multiregional model is that the genetic data suggest a small effective population size for humans. This effective size is on the order of 10,000. Effective population size has a complex relationship with census size, but it has been argued that, assuming that effective size is roughly equal to the number of breeding individuals in human populations under standard demographic conditions, 10,000 breeding individuals could not have occupied much of the Old World throughout the Pleistocene and remained a cohesive species via gene flow. However, this argument is not valid if one considers population extinction and recolonization, which might have played an important role in human history during the Pleistocene. With population extinction and recolonization, the inbreeding effective population size can be small and the census size extremely large. In this paper, I will show that under conditions of population extinction and recolonization, an effective population size of 10,000 suggested by genetic data is compatible with a large census size consistent with the multiregional model.
Affiliation(s)
- Elise Eller
- Human Genetics Center, University of Texas School of Public Health, P.O. Box 20334, Houston, TX 77225, USA.
|
38
|
Abstract
During the past 65 million years, Alu elements have propagated to more than one million copies in primate genomes, which has resulted in the generation of a series of Alu subfamilies of different ages. Alu elements affect the genome in several ways, causing insertion mutations, recombination between elements, gene conversion and alterations in gene expression. Alu-insertion polymorphisms are a boon for the study of human population genetics and primate comparative genomics because they are neutral genetic markers of identical descent with known ancestral states.
Affiliation(s)
- Mark A Batzer
- Department of Biological Sciences, Biological Computation and Visualization Center, Louisiana State University, 202 Life Sciences Building, Baton Rouge, Louisiana 70803, USA.
|
39
|
Belle EMS, Eyre-Walker A. A test of whether selection maintains isochores using sites polymorphic for Alu and L1 element insertions. Genetics 2002; 160:815-7. [PMID: 11898794 PMCID: PMC1461991 DOI: 10.1093/genetics/160.2.815] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 11/14/2022] Open
|
40
|
Wakeley J, Nielsen R, Liu-Cordero SN, Ardlie K. The discovery of single-nucleotide polymorphisms--and inferences about human demographic history. Am J Hum Genet 2001; 69:1332-47. [PMID: 11704929 PMCID: PMC1235544 DOI: 10.1086/324521] [Citation(s) in RCA: 123] [Impact Index Per Article: 5.1] [Received: 08/16/2001] [Accepted: 09/24/2001] [Indexed: 11/03/2022] Open
Abstract
A method of historical inference that accounts for ascertainment bias is developed and applied to single-nucleotide polymorphism (SNP) data in humans. The data consist of 84 short fragments of the genome that were selected, from three recent SNP surveys, to contain at least two polymorphisms in their respective ascertainment samples and that were then fully resequenced in 47 globally distributed individuals. Ascertainment bias is the deviation, from what would be observed in a random sample, caused either by discovery of polymorphisms in small samples or by locus selection based on levels or patterns of polymorphism. The three SNP surveys from which the present data were derived differ both in their protocols for ascertainment and in the size of the samples used for discovery. We implemented a Monte Carlo maximum-likelihood method to fit a subdivided-population model that includes a possible change in effective size at some time in the past. Incorrectly assuming that ascertainment bias does not exist causes errors in inference, affecting both estimates of migration rates and historical changes in size. Migration rates are overestimated when ascertainment bias is ignored. However, the direction of error in inferences about changes in effective population size (whether the population is inferred to be shrinking or growing) depends on whether either the numbers of SNPs per fragment or the SNP-allele frequencies are analyzed. We use the abbreviation "SDL," for "SNP-discovered locus," in recognition of the genomic-discovery context of SNPs. When ascertainment bias is modeled fully, both the number of SNPs per SDL and their allele frequencies support a scenario of growth in effective size in the context of a subdivided population. If subdivision is ignored, however, the hypothesis of constant effective population size cannot be rejected. 
An important conclusion of this work is that, in demographic or other studies, SNP data are useful only to the extent that their ascertainment can be modeled.
Affiliation(s)
- J Wakeley
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA.
|
41
|
Mundt CA, Nicholson IC, Zou X, Popov AV, Ayling C, Brüggemann M. Novel control motif cluster in the IgH delta-gamma 3 interval exhibits B cell-specific enhancer function in early development. J Immunol 2001; 166:3315-23. [PMID: 11207287 DOI: 10.4049/jimmunol.166.5.3315] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.7] [Indexed: 11/19/2022]
Abstract
The majority of the human Ig heavy chain (IgH) constant (C) region locus has been cloned and mapped. An exception is the region between C delta and C gamma 3, which is unstable and may be a recombination hot spot. We isolated a pBAC clone (pHuIgH3'delta-gamma 3) that established a 52-kb distance between C delta and C gamma 3. Sequence analysis identified a high number of repeat elements, explaining the instability of the region, and an unusually large accumulation of transcription factor-binding motifs, for both lymphocyte-specific and ubiquitous transcription activators (IKAROS, E47, Oct-1, USF, Myc/Max), and for factors that may repress transcription (Delta EF1, Gfi-1, E4BP4, C/EBP beta). Functional analysis in reporter gene assays revealed the importance of the C delta-C gamma 3 interval in lymphocyte differentiation and identified independent regions capable of either enhancement or silencing of reporter gene expression and interaction with the IgH intron enhancer E mu. In transgenic mice, carrying a construct that links the beta-globin reporter to the novel delta-gamma 3 intron enhancer (E delta-gamma 3), transgene transcription is exclusively found in bone marrow B cells from the early stage when IgH rearrangement is initiated up to the successful completion of H and L locus recombination, resulting in Ab expression. These findings suggest that the C delta-C gamma 3 interval exerts regulatory control on Ig gene activation and expression during early lymphoid development.
Affiliation(s)
- C A Mundt
- Laboratory of Developmental Immunology, The Babraham Institute, Babraham, Cambridge, United Kingdom
|
42
|
Watkins WS, Ricker CE, Bamshad MJ, Carroll ML, Nguyen SV, Batzer MA, Harpending HC, Rogers AR, Jorde LB. Patterns of ancestral human diversity: an analysis of Alu-insertion and restriction-site polymorphisms. Am J Hum Genet 2001; 68:738-52. [PMID: 11179020 PMCID: PMC1274485 DOI: 10.1086/318793] [Citation(s) in RCA: 112] [Impact Index Per Article: 4.7] [Received: 11/09/2000] [Accepted: 01/17/2001] [Indexed: 11/04/2022] Open
Abstract
We have analyzed 35 widely distributed, polymorphic Alu loci in 715 individuals from 31 world populations. The average frequency of Alu insertions (the derived state) is lowest in Africa (.42) but is higher and similar in India (.55), Europe (.56), and Asia (.57). A comparison with 30 restriction-site polymorphisms (RSPs) for which the ancestral state has been determined shows that the frequency of derived RSP alleles is also lower in Africa (.35) than it is in Asia (.45) and in Europe (.46). Neighbor-joining networks based on Alu insertions or RSPs are rooted in Africa and show African populations as separate from other populations, with high statistical support. Correlations between genetic distances based on Alu and nuclear RSPs, short tandem-repeat polymorphisms, and mtDNA, in the same individuals, are high and significant. For the 35 loci, Alu gene diversity and the diversity attributable to population subdivision is highest in Africa but is lower and similar in Europe and Asia. The distribution of ancestral alleles is consistent with an origin of early modern human populations in sub-Saharan Africa, the isolation and preservation of ancestral alleles within Africa, and an expansion out of Africa into Eurasia. This expansion is characterized by increasing frequencies of Alu inserts and by derived RSP alleles with reduced genetic diversity in non-African populations.
Affiliation(s)
- W S Watkins
- Department of Human Genetics, University of Utah Health Sciences Center, Salt Lake City, UT, 84112, USA
|
43
|
Tishkoff SA, Pakstis AJ, Stoneking M, Kidd JR, Destro-Bisol G, Sanjantila A, Lu RB, Deinard AS, Sirugo G, Jenkins T, Kidd KK, Clark AG. Short tandem-repeat polymorphism/Alu haplotype variation at the PLAT locus: implications for modern human origins. Am J Hum Genet 2000; 67:901-25. [PMID: 10986042 PMCID: PMC1287905 DOI: 10.1086/303068] [Citation(s) in RCA: 61] [Impact Index Per Article: 2.4] [Received: 03/13/2000] [Accepted: 07/18/2000] [Indexed: 01/10/2023] Open
Abstract
Two dinucleotide short tandem-repeat polymorphisms (STRPs) and a polymorphic Alu element spanning a 22-kb region of the PLAT locus on chromosome 8p12-q11.2 were typed in 1,287-1,420 individuals originating from 30 geographically diverse human populations, as well as in 29 great apes. These data were analyzed as haplotypes consisting of each of the dinucleotide repeats and the flanking Alu insertion/deletion polymorphism. The global pattern of STRP/Alu haplotype variation and linkage disequilibrium (LD) is informative for the reconstruction of human evolutionary history. Sub-Saharan African populations have high levels of haplotype diversity within and between populations, relative to non-Africans, and have highly divergent patterns of LD. Non-African populations have both a subset of the haplotype diversity present in Africa and a distinct pattern of LD. The pattern of haplotype variation and LD observed at the PLAT locus suggests a recent common ancestry of non-African populations, from a small population originating in eastern Africa. These data indicate that, throughout much of modern human history, sub-Saharan Africa has maintained both a large effective population size and a high level of population substructure. Additionally, Papua New Guinean and Micronesian populations have rare haplotypes observed otherwise only in African populations, suggesting ancient gene flow from Africa into Papua New Guinea, as well as gene flow between Melanesian and Micronesian populations.
Affiliation(s)
- S A Tishkoff
- Department of Biology, University of Maryland, College Park, MD 20742, USA.
|
44
|
Abstract
We review the anatomical and archaeological evidence for an early population bottleneck in humans and bracket the time when it could have occurred. We outline the subsequent demographic changes that the archaeological evidence of range expansions and contractions address, and we examine how inbreeding effective population size provides an alternative view of past population size change. This addresses the question of other, more recent, population size bottlenecks, and we review nonrecombining and recombining genetic systems that may reflect them. We examine how these genetic data constrain the possibility of significant population size bottlenecks (i.e., of sufficiently small size and/or long duration to minimize genetic variation in autosomal and haploid systems) at several different critical times in human history. Different constraints appear in nonrecombining and recombining systems, and among the autosomal loci most are incompatible with any Pleistocene population size expansions. Microsatellite data seem to show Pleistocene population size expansions, but in aggregate they are difficult to interpret because different microsatellite studies do not show the same expansion. The archaeological data are only compatible with a few of these analyses, most prominently with data from Alu elements, and we use these facts to question whether the view of the past from analysis of inbreeding effective population size is valid. Finally, we examine the issue of whether inbreeding effective population size provides any reasonable measure of the actual past size of the human species. We contend that if the evidence of a population size bottleneck early in the evolution of our lineage is accepted, most genetic data either lack the resolution to address subsequent changes in the human population or do not meet the assumptions required to do so validly. 
It is our conclusion that, at the moment, genetic data cannot disprove a simple model of exponential population growth following a bottleneck 2 MYA at the origin of our lineage and extending through the Pleistocene. Archaeological and paleontological data indicate that this model is too oversimplified to be an accurate reflection of detailed population history, and therefore we find that genetic data lack the resolution to validly reflect many details of Pleistocene human population change. However, there is one detail that these data are sufficient to address. Both genetic and anthropological data are incompatible with the hypothesis of a recent population size bottleneck. Such an event would be expected to leave a significant mark across numerous genetic loci and observable anatomical traits, but while some subsets of data are compatible with a recent population size bottleneck, there is no consistently expressed effect that can be found across the range where it should appear, and this absence disproves the hypothesis.
Affiliation(s)
- J Hawks
- Department of Anthropology, University of Utah, USA
|
45
|
Abstract
While high quality information regarding variation in genes is currently available in locus-specific or specialized mutation databases, the need remains for a general catalog of genome variation to address the large-scale sampling designs required by association studies, gene mapping, and evolutionary biology. In response to this need, the National Center for Biotechnology Information (NCBI) has established the dbSNP database (http://ncbi.nlm.nih.gov/SNP/) to serve as a generalized, central variation database. Submissions to dbSNP will be integrated with other sources of information at NCBI such as GenBank, PubMed, LocusLink, and the Human Genome Project data, and the complete contents of dbSNP are available to the public via anonymous FTP. Hum Mutat 15:68-75, 2000. Published 2000 Wiley-Liss, Inc.
Affiliation(s)
- S T Sherry
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland.
|
46
|
Sherry ST, Ward M, Sirotkin K. dbSNP-Database for Single Nucleotide Polymorphisms and Other Classes of Minor Genetic Variation. Genome Res 1999. [DOI: 10.1101/gr.9.8.677] [Citation(s) in RCA: 273] [Impact Index Per Article: 10.5] [Indexed: 11/24/2022]
|
47
|
Harris EE, Hey J. X chromosome evidence for ancient human histories. Proc Natl Acad Sci U S A 1999; 96:3320-4. [PMID: 10077682 PMCID: PMC15940 DOI: 10.1073/pnas.96.6.3320] [Citation(s) in RCA: 148] [Impact Index Per Article: 5.7] [Received: 07/20/1998] [Accepted: 12/29/1998] [Indexed: 11/18/2022] Open
Abstract
Diverse African and non-African samples of the X-linked PDHA1 (pyruvate dehydrogenase E1 alpha subunit) locus revealed a fixed DNA sequence difference between the two sample groups. The age of onset of population subdivision appears to be about 200 thousand years ago. This predates the earliest modern human fossils, suggesting the transformation to modern humans occurred in a subdivided population. The base of the PDHA1 gene tree is relatively ancient, with an estimated age of 1.86 million years, a late Pliocene time associated with early species of Homo. PDHA1 revealed very low variation among non-Africans, but in other respects the data are consistent with reports from other X-linked and autosomal haplotype data sets. Like these other genes, but in conflict with microsatellite and mitochondrial data, PDHA1 does not show evidence of human population expansion.
Affiliation(s)
- E E Harris
- Department of Genetics, Rutgers University, Nelson Biological Labs, 604 Allison Road, Piscataway, NJ 08854-8082, USA
|
48
|
Lahr MM, Foley RA. Towards a theory of modern human origins: geography, demography, and diversity in recent human evolution. Am J Phys Anthropol 1999; Suppl 27:137-76. [PMID: 9881525 DOI: 10.1002/(sici)1096-8644(1998)107:27+<137::aid-ajpa6>3.0.co;2-q] [Citation(s) in RCA: 182] [Impact Index Per Article: 7.0] [Indexed: 11/06/2022]
Abstract
The origins of modern humans have been the central debate in palaeoanthropology during the last decade. We examine the problem in the context of the history of anthropology, the accumulating evidence for a recent African origin, and evolutionary mechanisms. Using a historical perspective, we show that the current controversy is a continuation of older conflicts and as such relates to questions of both origins and diversity. However, a better fossil sample, improved dates, and genetic data have introduced new perspectives, and we argue that evolutionary geography, which uses spatial distributions of populations as the basis for integrating contingent, adaptive, and demographic aspects of microevolutionary change, provides an appropriate theoretical framework. Evolutionary geography is used to explore two events: the evolution of the Neanderthal lineage and the relationship between an ancestral bottleneck with the evolution of anatomically modern humans and their diversity. We argue that the Neanderthal and modern lineages share a common ancestor in an African population between 350,000 and 250,000 years ago rather than in the earlier Middle Pleistocene; this ancestral population, which developed mode 3 technology (Levallois/Middle Stone Age), dispersed across Africa and western Eurasia in a warmer period prior to independent evolution towards Neanderthals and modern humans in stage 6. Both lineages would thus share a common large-brained ancestry, a technology, and a history of dispersal. They differ in the conditions under which they subsequently evolved and their ultimate evolutionary fate. Both lineages illustrate the repeated interactions of the glacial cycles, the role of cold-arid periods in producing fragmentation of populations, bottlenecks, and isolation, and the role of warmer periods in producing trans-African dispersals.
Affiliation(s)
- M M Lahr
- Departamento de Biologia, Instituto de Biociências, Universidade de São Paulo, Brasil
|
49
|
Affiliation(s)
- John H. Relethford
- Department of Anthropology, State University of New York, College at Oneonta, Oneonta, New York 13820; e-mail:
|
50
|
Sherry ST, Batzer MA, Harpending HC. Modeling the genetic architecture of modern populations. Annual Review of Anthropology 1998. [DOI: 10.1146/annurev.anthro.27.1.153] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.1] [Indexed: 11/09/2022]
Abstract
This article summarizes recent genetic evidence about the population history of our species. There is a congruence of evidence from different systems showing that the genetic effective size of humans is about 10,000 reproducing adults. We discuss how the magnitude and fluctuation of this number over time are important for evaluating competing hypotheses about the nature of human evolution during the Pleistocene. The differences in estimates of effective size derived from high mutation rate and low mutation rate genetic systems allow us to trace broad-scale changes in population size. The ultimate goal is to produce a comprehensive history of our own gene pool and its spread and differentiation over the world. The genetic evidence should also complement archaeological evidence of our past by revealing aspects of our history that are not readily visible from the archaeological record, such as whether hominid populations in the Pleistocene were different species.
Affiliation(s)
- S. T. Sherry
- Departments of Pathology and Biometry and Genetics, Stanley S. Scott Cancer Center, Neuroscience Center of Excellence, Louisiana State University Medical Center, New Orleans, 70112, Louisiana e-mail: ssherr@lsumc and
- Department of Anthropology, University of Utah, Salt Lake City, Utah 84112
- M. A. Batzer
- Departments of Pathology and Biometry and Genetics, Stanley S. Scott Cancer Center, Neuroscience Center of Excellence, Louisiana State University Medical Center, New Orleans, 70112, Louisiana e-mail: ssherr@lsumc and
- Department of Anthropology, University of Utah, Salt Lake City, Utah 84112
- H. C. Harpending
- Departments of Pathology and Biometry and Genetics, Stanley S. Scott Cancer Center, Neuroscience Center of Excellence, Louisiana State University Medical Center, New Orleans, 70112, Louisiana e-mail: ssherr@lsumc and
- Department of Anthropology, University of Utah, Salt Lake City, Utah 84112
|