1
Further insight into the global variability of the OCA2-HERC2 locus for human pigmentation from multiallelic markers. Sci Rep 2021; 11:22530. [PMID: 34795370 PMCID: PMC8602267 DOI: 10.1038/s41598-021-01940-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/15/2021] [Accepted: 11/02/2021] [Indexed: 11/20/2022] Open
Abstract
The OCA2-HERC2 locus is responsible for the greatest proportion of eye color variation in humans. Numerous studies have extensively described both functional SNPs and associated patterns of variation over this region. The goal of our study is to examine how these haplotype structures and allelic associations vary when highly variable markers such as microsatellites are used. Eleven microsatellites spanning 357 Kb of the OCA2-HERC2 genes are analyzed in 3029 individuals from worldwide populations. We found that several markers display large differences in allele frequency (10% to 35% difference) among Europeans, East Asians and Africans. In Europe, the alleles showing increased frequency can also discriminate individuals with (IrisPlex) predicted blue and brown eyes. Distinct haplotypes are identified around the variants C and T of the functional SNP rs12913832 (associated with blue eyes), with linkage disequilibrium r² values significant up to 237 Kb. The haplotype carrying the allele rs12913832 C has high frequency (76%) in blue eye predicted individuals (30% in brown eye predicted individuals), while the haplotype associated with the allele rs12913832 T is restricted to brown eye predicted individuals. Finally, homozygosity values reach levels of 91% near rs12913832. Odds ratios show values of 4.2, 7.4 and 10.4 for four markers around rs12913832 and 7.1 for their core haplotype. Hence, this study provides an example of the informativeness of multiallelic markers that, despite their current limited potential contribution to forensic eye color prediction, supports the use of microsatellites for identifying causal variants showing similar genetic features and history.
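As a rough illustration of how an odds ratio relates to the haplotype frequencies quoted in this abstract, the sketch below computes the ratio of odds from two group frequencies. The function name `odds_ratio` is hypothetical, and the abstract does not state how the authors derived their reported values, so this is a generic textbook calculation rather than a reproduction of their analysis.

```python
def odds_ratio(freq_group_a: float, freq_group_b: float) -> float:
    """Odds of carrying a haplotype in group A divided by the odds in group B.

    Both arguments are carrier frequencies strictly between 0 and 1.
    """
    for freq in (freq_group_a, freq_group_b):
        if not 0.0 < freq < 1.0:
            raise ValueError("frequencies must lie strictly between 0 and 1")
    odds_a = freq_group_a / (1.0 - freq_group_a)
    odds_b = freq_group_b / (1.0 - freq_group_b)
    return odds_a / odds_b


# Frequencies quoted in the abstract: 76% in blue eye predicted individuals,
# 30% in brown eye predicted individuals. Illustration only.
example = odds_ratio(0.76, 0.30)
print(f"odds ratio from the quoted frequencies: {example:.2f}")
```

An odds ratio of 1 would mean the haplotype is equally common in both groups; values well above 1 indicate enrichment in the first group.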
2
Miller JM, Quinzin MC, Edwards DL, Eaton DAR, Jensen EL, Russello MA, Gibbs JP, Tapia W, Rueda D, Caccone A. Genome-Wide Assessment of Diversity and Divergence Among Extant Galapagos Giant Tortoise Species. J Hered 2019; 109:611-619. [PMID: 29986032 DOI: 10.1093/jhered/esy031] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 03/31/2018] [Accepted: 07/04/2018] [Indexed: 12/19/2022] Open
Abstract
Genome-wide assessments allow for fuller characterization of genetic diversity, finer-scale population delineation, and better detection of demographically significant units to guide conservation compared with those based on "traditional" markers. Galapagos giant tortoises (Chelonoidis spp.) have long provided a case study for how evolutionary genetics may be applied to advance species conservation. Ongoing efforts to bolster tortoise populations, which have declined by 90%, have been informed by analyses of mitochondrial DNA sequence and microsatellite genotypic data, but could benefit from genome-wide markers. Taking this next step, we used double-digest restriction-site associated DNA sequencing to collect genotypic data at >26000 single nucleotide polymorphisms (SNPs) for 117 individuals representing all recognized extant Galapagos giant tortoise species. We then quantified genetic diversity, population structure, and compared results to estimates from mitochondrial DNA and microsatellite loci. Our analyses detected 12 genetic lineages concordant with the 11 named species as well as previously described structure within one species, C. becki. Furthermore, the SNPs provided increased resolution, detecting admixture in 4 individuals. SNP-based estimates of diversity and differentiation were significantly correlated with those derived from nuclear microsatellite loci and mitochondrial DNA sequences. The SNP toolkit presented here will serve as a resource for advancing efforts to understand tortoise evolution, species radiations, and aid conservation of the Galapagos tortoise species complex.
Affiliation(s)
- Joshua M Miller
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT
- Maud C Quinzin
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT
- Danielle L Edwards
  - Life and Environmental Sciences, University of California, Merced, Merced, CA
- Deren A R Eaton
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT
  - Department of Ecology, Evolution, and Environmental Biology, Columbia University, New York, NY
- Evelyn L Jensen
  - Department of Biology, University of British Columbia, Okanagan Campus, Kelowna, BC, Canada
- Michael A Russello
  - Department of Biology, University of British Columbia, Okanagan Campus, Kelowna, BC, Canada
- James P Gibbs
  - College of Environmental Science & Forestry, State University of New York, Syracuse, NY
- Washington Tapia
  - Galapagos Conservancy, Fairfax, VA
  - Galápagos National Park Directorate, Puerto Ayora, Galápagos, Ecuador
- Danny Rueda
  - Galápagos National Park Directorate, Puerto Ayora, Galápagos, Ecuador
- Adalgisa Caccone
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT
3
Saini S, Mitra I, Mousavi N, Fotsing SF, Gymrek M. A reference haplotype panel for genome-wide imputation of short tandem repeats. Nat Commun 2018; 9:4397. [PMID: 30353011 PMCID: PMC6199332 DOI: 10.1038/s41467-018-06694-0] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 03/09/2018] [Accepted: 09/18/2018] [Indexed: 12/14/2022] Open
Abstract
Short tandem repeats (STRs) are involved in dozens of Mendelian disorders and have been implicated in complex traits. However, genotyping arrays used in genome-wide association studies focus on single nucleotide polymorphisms (SNPs) and do not readily allow identification of STR associations. We leverage next-generation sequencing (NGS) from 479 families to create a SNP + STR reference haplotype panel. Our panel enables imputing STR genotypes into SNP array data when NGS is not available for directly genotyping STRs. Imputed genotypes achieve mean concordance of 97% with observed genotypes in an external dataset compared to 71% expected under a naive model. Performance varies widely across STRs, with near perfect concordance at bi-allelic STRs vs. 70% at highly polymorphic repeats. Imputation increases power over individual SNPs to detect STR associations with gene expression. Imputing STRs into existing SNP datasets will enable the first large-scale STR association studies across a range of complex traits.
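The concordance figures in this abstract compare imputed genotypes with directly observed ones. A minimal sketch of one way to compute such a rate follows, assuming each genotype is encoded as an unordered pair of repeat counts and that a call counts as concordant only when both alleles match. The encoding and the function name `genotype_concordance` are assumptions for illustration; the paper may define concordance differently.

```python
from typing import Sequence, Tuple

# Hypothetical encoding: a diploid STR genotype as the repeat counts of its two alleles.
Genotype = Tuple[int, int]


def genotype_concordance(imputed: Sequence[Genotype],
                         observed: Sequence[Genotype]) -> float:
    """Fraction of samples whose imputed genotype equals the observed one.

    Allele order within a genotype is ignored, so (10, 12) matches (12, 10).
    """
    if len(imputed) != len(observed):
        raise ValueError("imputed and observed must have the same length")
    if not imputed:
        raise ValueError("at least one genotype pair is required")
    matches = sum(sorted(i) == sorted(o) for i, o in zip(imputed, observed))
    return matches / len(imputed)


# Toy data: the second sample is imputed incorrectly.
imputed_calls = [(10, 12), (11, 11), (9, 10)]
observed_calls = [(12, 10), (11, 12), (9, 10)]
print(genotype_concordance(imputed_calls, observed_calls))
```

With an all-or-nothing match per sample, highly polymorphic repeats are penalized more heavily than they would be under a per-allele score, which is one reason the choice of definition matters when comparing numbers across studies.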
Affiliation(s)
- Shubham Saini
  - Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
- Ileena Mitra
  - Bioinformatics and Systems Biology Program, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
- Nima Mousavi
  - Department of Electrical and Computer Engineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
- Stephanie Feupe Fotsing
  - Bioinformatics and Systems Biology Program, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
  - Department of Biomedical Informatics, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
- Melissa Gymrek
  - Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
  - Department of Medicine, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
4
Gaughran SJ, Quinzin MC, Miller JM, Garrick RC, Edwards DL, Russello MA, Poulakakis N, Ciofi C, Beheregaray LB, Caccone A. Theory, practice, and conservation in the age of genomics: The Galápagos giant tortoise as a case study. Evol Appl 2018; 11:1084-1093. [PMID: 30026799 PMCID: PMC6050186 DOI: 10.1111/eva.12551] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 05/10/2017] [Accepted: 08/31/2017] [Indexed: 12/25/2022] Open
Abstract
High-throughput DNA sequencing allows efficient discovery of thousands of single nucleotide polymorphisms (SNPs) in nonmodel species. Population genetic theory predicts that this large number of independent markers should provide detailed insights into population structure, even when only a few individuals are sampled. Still, sampling design can have a strong impact on such inferences. Here, we use simulations and empirical SNP data to investigate the impacts of sampling design on estimating genetic differentiation among populations that represent three species of Galápagos giant tortoises (Chelonoidis spp.). Though microsatellite and mitochondrial DNA analyses have supported the distinctiveness of these species, a recent study called into question how well these markers matched with data from genomic SNPs, thereby questioning decades of studies in nonmodel organisms. Using >20,000 genomewide SNPs from 30 individuals from three Galápagos giant tortoise species, we find distinct structure that matches the relationships described by the traditional genetic markers. Furthermore, we confirm that accurate estimates of genetic differentiation in highly structured natural populations can be obtained using thousands of SNPs and 2-5 individuals, or hundreds of SNPs and 10 individuals, but only if the units of analysis are delineated in a way that is consistent with evolutionary history. We show that the lack of structure in the recent SNP-based study was likely due to unnatural grouping of individuals and erroneous genotype filtering. Our study demonstrates that genomic data enable patterns of genetic differentiation among populations to be elucidated even with few samples per population, and underscores the importance of sampling design. These results have specific implications for studies of population structure in endangered species and subsequent management decisions.
Affiliation(s)
- Maud C. Quinzin
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
- Joshua M. Miller
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
- Michael A. Russello
  - Department of Biology, University of British Columbia, Okanagan Campus, Kelowna, BC, Canada
- Nikos Poulakakis
  - Department of Biology, School of Sciences and Engineering, University of Crete, Heraklion, Crete, Greece
  - Natural History Museum of Crete, School of Sciences and Engineering, University of Crete, Heraklion, Crete, Greece
- Claudio Ciofi
  - Department of Biology, University of Florence, Sesto Fiorentino (FI), Italy
- Luciano B. Beheregaray
  - Molecular Ecology Lab, School of Biological Sciences, Flinders University, Adelaide, SA, Australia
- Adalgisa Caccone
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
5
Zhang Z, Zheng Y, Zhang X, Liu C, Joyce BT, Kibbe WA, Hou L, Zhang W. Linking short tandem repeat polymorphisms with cytosine modifications in human lymphoblastoid cell lines. Hum Genet 2016; 135:223-32. [PMID: 26714498 PMCID: PMC4715638 DOI: 10.1007/s00439-015-1628-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/25/2015] [Accepted: 12/17/2015] [Indexed: 01/26/2023]
Abstract
Inter-individual variation in cytosine modifications has been linked to complex traits in humans. Cytosine modification variation is partially controlled by single nucleotide polymorphisms (SNPs), known as modified cytosine quantitative trait loci (mQTL). However, little is known about the role of short tandem repeat polymorphisms (STRPs), a class of structural genetic variants, in regulating cytosine modifications. Utilizing the published data on the International HapMap Project lymphoblastoid cell lines (LCLs), we assessed the relationships between 721 STRPs and the modification levels of 283,540 autosomal CpG sites. Our findings suggest that, in contrast to the predominant cis-acting mode for SNP-based mQTL, STRPs are associated with cytosine modification levels in both cis-acting (local) and trans-acting (distant) modes. In local scans within the ±1 Mb windows of target CpGs, 21, 9, and 21 cis-acting STRP-based mQTL were detected in CEU (Caucasian residents from Utah, USA), YRI (Yoruba people from Ibadan, Nigeria), and the combined samples, respectively. In contrast, 139,420, 76,817, and 121,866 trans-acting STRP-based mQTL were identified in CEU, YRI, and the combined samples, respectively. A substantial proportion of CpG sites detected with local STRP-based mQTL were not associated with SNP-based mQTL, suggesting that STRPs represent an independent class of mQTL. Functionally, genetic variants neighboring CpG-associated STRPs are enriched with genome-wide association study (GWAS) loci for a variety of complex traits and diseases, including cancers, based on the National Human Genome Research Institute (NHGRI) GWAS Catalog. Therefore, elucidating these STRP-based mQTL in addition to SNP-based mQTL can provide novel insights into the genetic architectures of complex traits.
Affiliation(s)
- Zhou Zhang
  - Driskill Graduate Program in Life Sciences, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, 680 N. Lake Shore Dr., Suite 1400, Chicago, IL, 60611, USA
- Yinan Zheng
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, 680 N. Lake Shore Dr., Suite 1400, Chicago, IL, 60611, USA
  - Institute for Public Health and Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Xu Zhang
  - Section of Hematology/Oncology, Department of Medicine, University of Illinois at Chicago, Chicago, IL, 60612, USA
- Cong Liu
  - Department of Bioengineering, University of Illinois at Chicago, Chicago, IL, 60612, USA
- Brian Thomas Joyce
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, 680 N. Lake Shore Dr., Suite 1400, Chicago, IL, 60611, USA
  - Division of Epidemiology and Biostatistics, School of Public Health, University of Illinois at Chicago, Chicago, IL, 60612, USA
- Warren A Kibbe
  - Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, 20850, USA
- Lifang Hou
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, 680 N. Lake Shore Dr., Suite 1400, Chicago, IL, 60611, USA
  - The Robert H. Lurie Comprehensive Cancer Center, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Wei Zhang
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, 680 N. Lake Shore Dr., Suite 1400, Chicago, IL, 60611, USA
  - The Robert H. Lurie Comprehensive Cancer Center, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
  - Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
6
Demers JE, Jiménez-Gasco MDM. Evolution of Nine Microsatellite Loci in the Fungus Fusarium oxysporum. J Mol Evol 2015; 82:27-37. [PMID: 26661928 DOI: 10.1007/s00239-015-9725-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/23/2014] [Accepted: 11/19/2015] [Indexed: 12/11/2022]
Abstract
The evolution of nine microsatellites and one minisatellite was investigated in the fungus Fusarium oxysporum and sister taxa Fusarium redolens and Fusarium verticillioides. Compared to other organisms, fungi have been reported to contain fewer and less polymorphic microsatellites. Mutational patterns over evolutionary time were studied for these ten loci by mapping changes in core repeat numbers onto a phylogeny based on the sequence of the conserved translation elongation factor 1-α gene. The patterns of microsatellite formation, expansion, and interruption by base substitutions were followed across the phylogeny, showing that these loci are evolving in a manner similar to that of microsatellites in other eukaryotes. Most mutations could be fit to a stepwise mutation model, but a few appear to have involved multiple repeat units. No evidence of gene conversion was seen at the minisatellite locus, which may also be mutating by replication slippage. Some homoplastic numbers of repeat units were observed for these loci, and polymorphisms in the regions flanking the microsatellites may provide better genetic markers for population genetics studies of these species.
Affiliation(s)
- Jill E Demers
  - Department of Plant Pathology & Environmental Microbiology, The Pennsylvania State University, University Park, PA, USA
  - USDA-ARS Systematic Mycology and Microbiology Laboratory, 10300 Baltimore Ave., Beltsville, MD, 20705, USA
- María del Mar Jiménez-Gasco
  - Department of Plant Pathology & Environmental Microbiology, The Pennsylvania State University, University Park, PA, USA
7
Kwong M, Pemberton TJ. Sequence differences at orthologous microsatellites inflate estimates of human-chimpanzee differentiation. BMC Genomics 2014; 15:990. [PMID: 25407736 PMCID: PMC4253012 DOI: 10.1186/1471-2164-15-990] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 06/02/2014] [Accepted: 10/30/2014] [Indexed: 02/06/2023] Open
Abstract
Background Microsatellites (contiguous arrays of 2–6 base-pair motifs) have formed the cornerstone of population-genetic studies for over two decades. Their genotype data typically take the form of PCR fragment lengths obtained using locus-specific primer pairs to amplify the genomic region encompassing the microsatellite. Recently, we reported a dataset of 5,795 human and 84 chimpanzee individuals with genotypes at 246 human-derived autosomal microsatellites as a resource to facilitate interspecies comparisons. A major assumption underlying this dataset is that PCR amplicons at orthologous microsatellites are commensurable between species. Results We find this assumption to be frequently incorrect owing to discordance in microsatellite organization and variability, as well as nontrivial length imbalances caused by small species-specific indels in microsatellite flanking sequences. Converting PCR fragment lengths into the repeat numbers they represent at 138 microsatellites whose organization and variability were found to be highly similar in both species, we show that interspecies incommensurability among PCR amplicons can inflate FST and DPS estimates by up to 10.6%. Separate investigations of determinants of microsatellite variability in humans and chimpanzees uncover similar patterns with mean and maximum numbers of repeats, as well as numbers and ranges of distinct alleles, all important factors in predicting heterozygosity. In contrast, across microsatellites, numbers of repeats were significantly smaller in chimpanzees than in humans, while numbers and ranges of distinct alleles were instead larger. Conclusions Our findings have fundamental implications for interspecies comparisons using microsatellites and offer new opportunities for more accurate comparisons of patterns of human and chimpanzee genetic variation in numerous areas of application.
Electronic supplementary material The online version of this article (doi:10.1186/1471-2164-15-990) contains supplementary material, which is available to authorized users.
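The conversion from PCR fragment length to repeat number described in this abstract can be sketched as a simple subtraction and division, assuming the combined length of the flanking sequence (including primers) is known for each species. The function name `repeats_from_fragment` and all numbers below are hypothetical; they only illustrate why a small species-specific indel in the flank makes raw fragment lengths incomparable.

```python
def repeats_from_fragment(fragment_bp: int, flank_bp: int, motif_bp: int) -> float:
    """Number of repeat units implied by a PCR fragment length.

    fragment_bp: total amplicon length in base pairs
    flank_bp:    combined length of the non-repeat flanking sequence in the amplicon
    motif_bp:    length of the repeat motif (for example 4 for a tetranucleotide)
    """
    if motif_bp <= 0:
        raise ValueError("motif_bp must be positive")
    repeat_tract_bp = fragment_bp - flank_bp
    if repeat_tract_bp < 0:
        raise ValueError("fragment is shorter than its flanking sequence")
    return repeat_tract_bp / motif_bp


# Hypothetical tetranucleotide locus: the chimpanzee flank carries a 4 bp insertion,
# so two amplicons of identical length imply different repeat numbers.
human_repeats = repeats_from_fragment(fragment_bp=148, flank_bp=100, motif_bp=4)
chimp_repeats = repeats_from_fragment(fragment_bp=148, flank_bp=104, motif_bp=4)
print(human_repeats, chimp_repeats)
```

A non-integer result would suggest an interrupted or compound repeat, or an unaccounted indel in the flank, which is the kind of discordance the abstract reports.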
Affiliation(s)
- Trevor J Pemberton
  - Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Manitoba, Canada
8
Putman AI, Carbone I. Challenges in analysis and interpretation of microsatellite data for population genetic studies. Ecol Evol 2014; 4:4399-428. [PMID: 25540699 PMCID: PMC4267876 DOI: 10.1002/ece3.1305] [Citation(s) in RCA: 207] [Impact Index Per Article: 18.8] [Received: 05/02/2014] [Revised: 10/02/2014] [Accepted: 10/03/2014] [Indexed: 12/14/2022] Open
Abstract
Advancing technologies have facilitated the ever-widening application of genetic markers such as microsatellites into new systems and research questions in biology. In light of the data and experience accumulated from several years of using microsatellites, we present here a literature review that synthesizes the limitations of microsatellites in population genetic studies. With a focus on population structure, we review the widely used fixation (FST) statistics and Bayesian clustering algorithms and find that the former can be confusing and problematic for microsatellites and that the latter may be confounded by complex population models and lack power in certain cases. Clustering, multivariate analyses, and diversity-based statistics are increasingly being applied to infer population structure, but in some instances these methods lack formalization with microsatellites. Migration-specific methods perform well only under narrow constraints. We also examine the use of microsatellites for inferring effective population size, changes in population size, and deeper demographic history, and find that these methods are untested and/or highly context-dependent. Overall, each method possesses important weaknesses for use with microsatellites, and there are significant constraints on inferences commonly made using microsatellite markers in the areas of population structure, admixture, and effective population size. To ameliorate and better understand these constraints, researchers are encouraged to analyze simulated datasets both prior to and following data collection and analysis, the latter of which is formalized within the approximate Bayesian computation framework. We also examine trends in the literature and show that microsatellites continue to be widely used, especially in non-human subject areas.
This review assists with study design and molecular marker selection, facilitates sound interpretation of microsatellite data while fostering respect for their practical limitations, and identifies lessons that could be applied toward emerging markers and high-throughput technologies in population genetics.
Affiliation(s)
- Alexander I Putman
  - Department of Plant Pathology, North Carolina State University, Raleigh, North Carolina, 27695-7616
- Ignazio Carbone
  - Department of Plant Pathology, North Carolina State University, Raleigh, North Carolina, 27695-7616
9
Tsai MY. Variable selection in Bayesian generalized linear-mixed models: an illustration using candidate gene case-control association studies. Biom J 2014; 57:234-53. [PMID: 25267186 DOI: 10.1002/bimj.201300259] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/05/2013] [Revised: 04/25/2014] [Accepted: 06/21/2014] [Indexed: 11/07/2022]
Abstract
The problem of variable selection in the generalized linear-mixed models (GLMMs) is pervasive in statistical practice. For the purpose of variable selection, many methodologies for determining the best subset of explanatory variables currently exist according to the model complexity and differences between applications. In this paper, we develop a "higher posterior probability model with bootstrap" (HPMB) approach to select explanatory variables without fitting all possible GLMMs involving a small or moderate number of explanatory variables. Furthermore, to save computational load, we propose an efficient approximation approach with Laplace's method and Taylor's expansion to approximate intractable integrals in GLMMs. Simulation studies and an application of HapMap data provide evidence that this selection approach is computationally feasible and reliable for exploring true candidate genes and gene-gene associations, after adjusting for complex structures among clusters.
Affiliation(s)
- Miao-Yu Tsai
  - Institute of Statistics and Information Science, National Changhua University of Education, Changhua, 500, Taiwan
10
Abstract
A canon of population genetics concerns the properties of FST, a descriptor of spatial genetic structure. Interest for FST arose from Wright's early insights linking FST to dispersal parameters as well as to his concept of effective population size (e.g., Wright 1938, 1951). Although there is continued interest in this topic, FST also serves in other applications, such as detecting selected markers in natural populations (Beaumont and Nichols 1996) and more often in routine descriptive works. Remarkably, it is the latter use that seems to attract most discussion. Alternative descriptors have been proposed. Conversely, attempts have been made to draw biological inferences from FST properties that do not depend on biological processes. A reconsideration of its properties under biological scenarios underlines the weaknesses of such approaches.
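For readers unfamiliar with the statistic discussed in this abstract, the sketch below computes FST for one biallelic locus in two equally weighted populations, using the heterozygosity-based form FST = (HT - HS) / HT. This is one common textbook formulation chosen for illustration; the function name `fst_two_populations` is hypothetical, and the abstract does not specify which estimator the author has in mind.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """FST for a biallelic locus from allele frequencies in two populations.

    Uses FST = (HT - HS) / HT, with HS the mean within-population expected
    heterozygosity and HT the expected heterozygosity of the pooled population
    (equal population weights assumed).
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)
    if ht == 0:
        raise ValueError("locus is monomorphic in the pooled sample; FST is undefined")
    return (ht - hs) / ht


# Identical frequencies give no structure; fixed differences give complete structure.
print(fst_two_populations(0.5, 0.5))
print(fst_two_populations(0.0, 1.0))
print(fst_two_populations(0.2, 0.8))
```

Because HS can never exceed the heterozygosity achievable at a locus, the maximum attainable value of this ratio depends on within-population diversity, which is part of why the behavior of FST at highly variable markers attracts the discussion the abstract mentions.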
11
Galeano CH, Cortés AJ, Fernández AC, Soler Á, Franco-Herrera N, Makunde G, Vanderleyden J, Blair MW. Gene-based single nucleotide polymorphism markers for genetic and association mapping in common bean. BMC Genet 2012; 13:48. [PMID: 22734675 PMCID: PMC3464600 DOI: 10.1186/1471-2156-13-48] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.9] [Received: 03/06/2012] [Accepted: 06/21/2012] [Indexed: 12/19/2022] Open
Abstract
Background In common bean, expressed sequence tags (ESTs) are an underestimated source of gene-based markers such as insertion-deletions (Indels) or single-nucleotide polymorphisms (SNPs). However, due to the nature of these conserved sequences, detection of markers is difficult and portrays low levels of polymorphism. Therefore, development of intron-spanning EST-SNP markers can be a valuable resource for genetic experiments such as genetic mapping and association studies. Results In this study, a total of 313 new gene-based markers were developed at target genes. Intronic variation was deeply explored in order to capture more polymorphism. Introns were putatively identified after comparing the common bean ESTs with the soybean genome, and the primers were designed over intron-flanking regions. The intronic regions were evaluated for parental polymorphisms using the single strand conformational polymorphism (SSCP) technique and Sequenom MassARRAY system. A total of 53 new marker loci were placed on an integrated molecular map in the DOR364 × G19833 recombinant inbred line (RIL) population. The new linkage map was used to build a consensus map, merging the linkage maps of the BAT93 × JALO EEP558 and DOR364 × BAT477 populations. A total of 1,060 markers were mapped, with a total map length of 2,041 cM across 11 linkage groups. As a second application of the generated resource, a diversity panel with 93 genotypes was evaluated with 173 SNP markers using the MassARRAY-platform and KASPar technology. These results were coupled with previous SSR evaluations and drought tolerance assays carried out on the same individuals. This agglomerative dataset was examined, in order to discover marker-trait associations, using general linear model (GLM) and mixed linear model (MLM). Some significant associations with yield components were identified, and were consistent with previous findings. 
Conclusions In short, this study illustrates the power of intron-based markers for linkage and association mapping in common bean. The utility of these markers is discussed in relation to the usefulness of microsatellites, the molecular markers par excellence in this crop.
Affiliation(s)
- Carlos H Galeano
  - Centre of Microbial and Plant Genetics, Kasteelpark Arenberg 20, 3001, Heverlee, Belgium
12
Alves I, Coelho M, Gignoux C, Damasceno A, Prista A, Rocha J. Genetic homogeneity across Bantu-speaking groups from Mozambique and Angola challenges early split scenarios between East and West Bantu populations. Hum Biol 2011; 83:13-38. [PMID: 21453002 DOI: 10.3378/027.083.0102] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Indexed: 11/05/2022]
Abstract
The large scale spread of Bantu-speaking populations remains one of the most debated questions in African population history. In this work we studied the genetic structure of 19 Bantu-speaking groups from Mozambique and Angola using a multilocus approach based on 14 newly developed compound haplotype systems (UEPSTRs), each consisting of a rapidly evolving short tandem repeat (STR) closely linked to a unique event polymorphism (UEP). We compared the ability of UEPs, STRs and UEPSTRs to document genetic variation at the intercontinental level and among the African Bantu populations, and found that UEPSTR systems clearly provided more resolution than UEPs or STRs alone. The observed patterns of genetic variation revealed high levels of genetic homogeneity between major populations from Angola and Mozambique, with two main outliers: the Kuvale from Angola and the Chopi from Mozambique. Within Mozambique, two Kaskazi-speaking populations from the far north (Yao and Mwani) and two Nyasa-speaking groups from the Zambezi River basin (Nyungwe and Sena) could be differentiated from the remaining groups, but no further population structure was observed across the country. The close genetic relationship between most sampled Bantu populations is consistent with high degrees of interaction between peoples living in savanna areas located to the south of the rainforest. Our results highlight the role of gene flow during the Bantu expansions and show that the genetic evidence accumulated so far is becoming increasingly difficult to reconcile with widely accepted models postulating an early split between eastern and western Bantu populations.
Affiliation(s)
- Isabel Alves
  - IPATIMUP, Instituto de Patologia e Imunologia Molecular da Universidade do Porto, Portugal
13
Sorenson MD, DaCosta JM. Genotyping HapSTR loci: phase determination from direct sequencing of PCR products. Mol Ecol Resour 2011; 11:1068-75. [PMID: 21692999 DOI: 10.1111/j.1755-0998.2011.03036.x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 01/19/2023]
Abstract
HapSTRs combine information from a microsatellite (or simple tandem repeat, STR) with one or more single nucleotide polymorphisms in the DNA sequence immediately flanking the STR. These loci may offer increased power for the estimation of demographic parameters, but also present some challenges for data collection and analysis. We describe a process for inferring HapSTR alleles, including the flanking haplotypes, STR alleles and their phase relative to each other, directly from DNA sequence electropherograms of PCR products from heterozygous individuals. Our approach eliminates the need for more costly and time-consuming processes, such as cloning or acrylamide gel electrophoresis to separate alleles prior to sequencing.
Affiliation(s)
- Michael D Sorenson
- Department of Biology, Boston University, 5 Cummington St., Boston, MA 02215, USA.
|
14
|
Limborska SA, Khrunin AV, Flegontova OV, Tasitz VA, Verbenko DA. Specificity of genetic diversity in D1S80 revealed by SNP-VNTR haplotyping. Ann Hum Biol 2011; 38:564-9. [PMID: 21834750 DOI: 10.3109/03014460.2011.568003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/22/2023]
Abstract
BACKGROUND: The allele frequency patterns of the D1S80 variable number tandem repeat (VNTR) locus have been shown to be multimodal in many different human populations. AIM: To explore the complex allele distribution of the D1S80 polymorphic locus in different populations comparing the derived single nucleotide polymorphism (SNP) rs16824398-D1S80 haplotype frequencies in samples of European (Russians), Asian (Yakuts) and sub-Saharan African origin. SUBJECTS AND METHODS: The D1S80 locus together with its 5'-flanking region including SNP rs16824398 was amplified using allele-specific polymerase chain reaction (PCR). RESULTS: Haplotype phase determination sub-divided the total D1S80 allele spectrum into two allele sets marked by the corresponding SNP rs16824398 alleles. In non-African samples, the most frequent D1S80 alleles had 24 and 18 repeats that were associated with different SNP backgrounds (T and G alleles, respectively). Both combinations also occurred in Africans, but these samples exhibited an expanded spectrum of VNTR alleles on both SNP backgrounds. CONCLUSIONS: The sub-division of the D1S80 allele spectrum shape on the linked SNP background is indicative of populations of the main human groups. The reported differences in D1S80 allele spectra between populations of different ethnic origins can be explained by the ratios of chromosomes with T and G alleles.
Affiliation(s)
- Svetlana A Limborska
- Institute of Molecular Genetics, Russian Academy of Sciences, Kurchatov sq., 2 Moscow 123182, Russia
|
15
|
Hao C, Wang L, Ge H, Dong Y, Zhang X. Genetic diversity and linkage disequilibrium in Chinese bread wheat (Triticum aestivum L.) revealed by SSR markers. PLoS One 2011; 6:e17279. [PMID: 21365016 PMCID: PMC3041829 DOI: 10.1371/journal.pone.0017279] [Citation(s) in RCA: 123] [Impact Index Per Article: 8.8] [Received: 08/28/2010] [Accepted: 01/28/2011] [Indexed: 01/01/2023] Open
Abstract
Two hundred and fifty bread wheat lines, mainly Chinese mini core accessions, were assayed for polymorphism and linkage disequilibrium (LD) based on 512 whole-genome microsatellite loci representing a mean marker density of 5.1 cM. A total of 6,724 alleles ranging from 1 to 49 per locus were identified in all collections. The mean PIC value was 0.650, ranging from 0 to 0.965. Population structure and principal coordinate analysis revealed that landraces and modern varieties were two relatively independent genetic sub-groups. Landraces had a higher allelic diversity than modern varieties with respect to both genomes and chromosomes in terms of total number of alleles and allelic richness. 3,833 (57.0%) and 2,788 (41.5%) rare alleles with frequencies of <5% were found in the landrace and modern variety gene pools, respectively, indicating greater numbers of rare variants, or likely new alleles, in landraces. Analysis of molecular variance (AMOVA) showed that the A genome had the largest genetic differentiation and the D genome the lowest. In contrast to genetic diversity, modern varieties displayed a wider average LD decay across the whole genome for locus pairs with r² > 0.05 (P < 0.001) than the landraces. Mean LD decay distance for the landraces at the whole genome level was <5 cM, while a higher LD decay distance of 5–10 cM was found in modern varieties. LD decay distances were also somewhat different for each of the 21 chromosomes, being higher for most of the chromosomes in modern varieties (<5∼25 cM) compared to landraces (<5∼15 cM), presumably indicating the influences of domestication and breeding. This study facilitates predicting the marker density required to effectively associate genotypes with traits in Chinese wheat genetic resources.
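
The PIC (polymorphism information content) values quoted above summarize how informative a marker is, based on its allele frequencies. The abstract does not say which formula was used; the sketch below assumes the widely used Botstein et al. (1980) form, PIC = 1 − Σ pᵢ² − Σ_{i<j} 2pᵢ²pⱼ². The function name `pic` and the example frequencies are illustrative only and are not taken from the study.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for one locus.

    Assumes the Botstein et al. (1980) formulation; `freqs` are the
    allele frequencies at the locus and must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Sum of squared frequencies (expected homozygosity).
    homozygosity = sum(p * p for p in freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


# Hypothetical frequencies, for illustration only.
monomorphic = pic([1.0])        # a locus with a single allele
biallelic = pic([0.5, 0.5])     # two equally frequent alleles
```

Under this formula a monomorphic locus scores 0, consistent with the lower bound of the range reported in the abstract, and the score approaches 1 as the number of equally frequent alleles grows.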
Affiliation(s)
- Chenyang Hao
- Key Laboratory of Crop Germplasm Resources and Utilization, Ministry of Agriculture, The National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Lanfen Wang
- Key Laboratory of Crop Germplasm Resources and Utilization, Ministry of Agriculture, The National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Hongmei Ge
- Key Laboratory of Crop Germplasm Resources and Utilization, Ministry of Agriculture, The National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Yuchen Dong
- Key Laboratory of Crop Germplasm Resources and Utilization, Ministry of Agriculture, The National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Xueyong Zhang
- Key Laboratory of Crop Germplasm Resources and Utilization, Ministry of Agriculture, The National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, China
|
16
|
Li YH, Li W, Zhang C, Yang L, Chang RZ, Gaut BS, Qiu LJ. Genetic diversity in domesticated soybean (Glycine max) and its wild progenitor (Glycine soja) for simple sequence repeat and single-nucleotide polymorphism loci. THE NEW PHYTOLOGIST 2010; 188:242-53. [PMID: 20618914 DOI: 10.1111/j.1469-8137.2010.03344.x] [Citation(s) in RCA: 104] [Impact Index Per Article: 6.9] [Indexed: 05/07/2023]
Abstract
• The study of genetic diversity between a crop and its wild relatives may yield fundamental insights into evolutionary history and the process of domestication. • In this study, we genotyped a sample of 303 accessions of domesticated soybean (Glycine max) and its wild progenitor Glycine soja with 99 microsatellite markers and 554 single-nucleotide polymorphism (SNP) markers. • The simple sequence repeat (SSR) loci averaged 21.5 alleles per locus and an overall Nei's gene diversity of 0.77. The SNPs had substantially lower genetic diversity (0.35) than SSRs. SSR analyses indicated that G. soja exhibited higher diversity than G. max, but SNPs provided a slightly different snapshot of diversity between the two taxa. For both marker types, the primary division of genetic diversity was between the wild and domesticated accessions. Within taxa, G. max consisted of four geographic regions in China. G. soja formed six subgroups. Genealogical analyses indicated that cultivated soybean tended to form a monophyletic clade with respect to G. soja. • G. soja and G. max represent distinct germplasm pools. Limited evidence of admixture was discovered between these two species. Overall, our analyses are consistent with the origin of G. max from regions along the Yellow River of China.
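
Nei's gene diversity, the statistic behind the 0.77 (SSR) and 0.35 (SNP) figures above, is the probability that two alleles drawn at random from a sample differ: H = 1 − Σ pᵢ². A minimal sketch follows, with an optional n/(n − 1) small-sample correction. Whether the study applied that correction is not stated in the abstract, and the function name and the sample alleles are hypothetical.

```python
from collections import Counter


def gene_diversity(alleles, unbiased=True):
    """Nei's gene diversity for one locus from a list of sampled alleles.

    H = 1 - sum(p_i^2); with unbiased=True the result is multiplied by
    n / (n - 1), where n is the number of sampled allele copies.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two sampled alleles")
    counts = Counter(alleles)
    h = 1.0 - sum((c / n) ** 2 for c in counts.values())
    return h * n / (n - 1) if unbiased else h


# Hypothetical SSR repeat counts observed at one locus (illustration only).
sample = [12, 12, 14, 14]
h_raw = gene_diversity(sample, unbiased=False)
h_unbiased = gene_diversity(sample)
```

Because a biallelic SNP can never exceed H = 0.5 under the uncorrected formula, while a multiallelic SSR can approach 1, some of the gap between the two marker types reported above is expected from the number of alleles alone.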
Affiliation(s)
- Ying-Hui Li
- The National Key Facility for Crop Gene Resources and Genetic Improvement (NFCRI)/Key Lab of Germplasm Utilization (MOA), Institute of Crop Science, Chinese Academy of Agricultural Sciences, 100081 Beijing, China
|
17
|
Michikawa Y, Suga T, Ishikawa A, Hayashi H, Oka A, Inoko H, Iwakawa M, Imai T. Genome wide screen identifies microsatellite markers associated with acute adverse effects following radiotherapy in cancer patients. BMC MEDICAL GENETICS 2010; 11:123. [PMID: 20701746 PMCID: PMC2928773 DOI: 10.1186/1471-2350-11-123] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 12/06/2009] [Accepted: 08/11/2010] [Indexed: 01/24/2023]
Abstract
Background: The response of normal tissues in cancer patients undergoing radiotherapy varies, possibly due to genetic differences underlying variation in radiosensitivity. Methods: Cancer patients (n = 360) were selected retrospectively from the RadGenomics project. Adverse effects within 3 months of radiotherapy completion were graded using the National Cancer Institute Common Toxicity Criteria; the high grade group was grade 3 or more (n = 180) and the low grade group was grade 1 or less (n = 180). Pooled genomic DNA (gDNA) (n = 90 from each group) was screened using 23,244 microsatellites. Markers with different inter-group frequencies (Fisher exact test P < 0.05) were analyzed using the remaining pooled gDNA. Silencing RNA treatment was performed in cultured normal human skin fibroblasts. Results: Forty-seven markers had positive association values, including one in the SEMA3A promoter region (P = 1.24 × 10⁻⁵). SEMA3A knockdown enhanced radiation resistance. Conclusions: This study identified 47 putative radiosensitivity markers, and suggested a role for SEMA3A in radiosensitivity.
Affiliation(s)
- Yuichi Michikawa
- RadGenomics Project, Research Center for Charged Particle Therapy, National Institute of Radiological Sciences, Chiba, Japan
|
18
|
Payseur BA, Jing P, Haasl RJ. A genomic portrait of human microsatellite variation. Mol Biol Evol 2010; 28:303-12. [PMID: 20675409 DOI: 10.1093/molbev/msq198] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.4] [Indexed: 12/14/2022] Open
Abstract
Rapid advances in DNA sequencing and genotyping technologies are beginning to reveal the scope and pattern of human genomic variation. Although single nucleotide polymorphisms (SNPs) have been intensively studied, the extent and form of variation at other types of molecular variants remain poorly understood. Polymorphism at the most variable loci in the human genome, microsatellites, has rarely been examined on a genomic scale without the ascertainment biases that attend typical genotyping studies. We conducted a genomic survey of variation at microsatellites with at least three perfect repeats by comparing two complete genome sequences, the Human Genome Reference sequence and the sequence of J. Craig Venter. The genomic proportion of polymorphic loci was 2.7%, much higher than the rate of SNP variation, with marked heterogeneity among classes of loci. The proportion of variable loci increased substantially with repeat number. Repeat lengths differed in levels of variation, with longer repeat lengths generally showing higher polymorphism at the same repeat number. Microsatellite variation was weakly correlated with regional SNP number, indicating modest effects of shared genealogical history. Reductions in variation were detected at microsatellites located in introns, in untranslated regions, in coding exons, and just upstream of transcription start sites, suggesting the presence of selective constraints. Our results provide new insights into microsatellite mutational processes and yield a preview of patterns of variation that will be obtained in genomic surveys of larger numbers of individuals.
|