1
Richard GF. The Startling Role of Mismatch Repair in Trinucleotide Repeat Expansions. Cells 2021; 10:cells10051019. [PMID: 33925919 PMCID: PMC8145212 DOI: 10.3390/cells10051019] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/30/2021] [Revised: 04/20/2021] [Accepted: 04/21/2021] [Indexed: 12/26/2022]
Abstract
Trinucleotide repeats are a peculiar class of microsatellites whose expansions are responsible for approximately 30 human neurological or developmental disorders. The molecular mechanisms responsible for these expansions in humans are not totally understood, but experiments in model systems such as yeast, transgenic mice, and human cells have brought evidence that the mismatch repair machinery is involved in generating these expansions. The present review summarizes, in the first part, the role of mismatch repair in detecting and fixing the DNA strand slippage occurring during microsatellite replication. In the second part, key molecular differences between normal microsatellites and those that show a bias toward expansions are extensively presented. The effect of mismatch repair mutants on microsatellite expansions is detailed in model systems, and in vitro experiments on mismatched DNA substrates are described. Finally, a model presenting the possible roles of the mismatch repair machinery in microsatellite expansions is proposed.
Affiliation(s)
- Guy-Franck Richard
- Institut Pasteur, CNRS UMR3525, 25 rue du Docteur Roux, 75015 Paris, France
2
Fan W, Xu L, Cheng H, Li M, Liu H, Jiang Y, Guo Y, Zhou Z, Hou S. Characterization of Duck (Anas platyrhynchos) Short Tandem Repeat Variation by Population-Scale Genome Resequencing. Front Genet 2018; 9:520. [PMID: 30425731 PMCID: PMC6218588 DOI: 10.3389/fgene.2018.00520] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/04/2018] [Accepted: 10/15/2018] [Indexed: 12/30/2022]
Abstract
Short tandem repeats (STRs) are usually associated with genetic diseases and gene regulatory functions, and are also important genetic markers for analysis of evolutionary, genetic diversity and forensic. However, for the majority of STRs in the duck genome, their population genetic properties and functional impacts remain poorly defined. Recent advent of next generation sequencing (NGS) has offered an opportunity for profiling large numbers of polymorphic STRs. Here, we reported a population-scale analysis of STR variation using genome resequencing in mallard and Pekin duck. Our analysis provided the first genome-wide duck STR reference including 198,022 STR loci with motif size of 2–6 base pairs. We observed a relatively uneven distribution of STRs in different genomic regions, which indicates that the occurrence of STRs in duck genome is not random, but undergoes a directional selection pressure. Using genome resequencing data of 23 mallard and 26 Pekin ducks, we successfully identified 89,891 polymorphic STR loci. Intensive analysis of this dataset suggested that shorter repeat motif, longer reference tract length, higher purity, and residing outside of a coding region are all associated with an increase in STR variability. STR genotypes were utilized for population genetic analysis, and the results showed that population structure and divergence patterns among population groups can be efficiently captured. In addition, comparison between Pekin duck and mallard identified 3,122 STRs with extremely divergent allele frequency, which overlapped with a set of genes related to nervous system, energy metabolism and behavior. The evolutionary analysis revealed that the genes containing divergent STRs may play important roles in phenotypic changes during duck domestication. The variation analysis of STRs in population scale provides valuable resource for future study of genetic diversity and genome evolution in duck.
Affiliation(s)
- Wenlei Fan
- Key Laboratory of Animal (Poultry) Genetics Breeding and Reproduction, Ministry of Agriculture and Rural Affairs, State Key Laboratory of Animal Nutrition, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China; State Key Laboratory of Animal Nutrition, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Lingyang Xu
- Key Laboratory of Animal (Poultry) Genetics Breeding and Reproduction, Ministry of Agriculture and Rural Affairs, State Key Laboratory of Animal Nutrition, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Hong Cheng
- College of Animal Science and Technology, Northwest A&F University, Yangling, China
- Ming Li
- College of Animal Science and Technology, Northwest A&F University, Yangling, China
- Hehe Liu
- Key Laboratory of Animal (Poultry) Genetics Breeding and Reproduction, Ministry of Agriculture and Rural Affairs, State Key Laboratory of Animal Nutrition, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Yong Jiang
- Key Laboratory of Animal (Poultry) Genetics Breeding and Reproduction, Ministry of Agriculture and Rural Affairs, State Key Laboratory of Animal Nutrition, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Yuming Guo
- State Key Laboratory of Animal Nutrition, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Zhengkui Zhou
- Key Laboratory of Animal (Poultry) Genetics Breeding and Reproduction, Ministry of Agriculture and Rural Affairs, State Key Laboratory of Animal Nutrition, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Shuisheng Hou
- Key Laboratory of Animal (Poultry) Genetics Breeding and Reproduction, Ministry of Agriculture and Rural Affairs, State Key Laboratory of Animal Nutrition, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
3
Willems T, Gymrek M, Poznik G, Tyler-Smith C, Erlich Y. Population-Scale Sequencing Data Enable Precise Estimates of Y-STR Mutation Rates. Am J Hum Genet 2016; 98:919-933. [PMID: 27126583 DOI: 10.1016/j.ajhg.2016.04.001] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 02/13/2016] [Accepted: 04/01/2016] [Indexed: 01/23/2023]
Abstract
Short tandem repeats (STRs) are mutation-prone loci that span nearly 1% of the human genome. Previous studies have estimated the mutation rates of highly polymorphic STRs by using capillary electrophoresis and pedigree-based designs. Although this work has provided insights into the mutational dynamics of highly mutable STRs, the mutation rates of most others remain unknown. Here, we harnessed whole-genome sequencing data to estimate the mutation rates of Y chromosome STRs (Y-STRs) with 2-6 bp repeat units that are accessible to Illumina sequencing. We genotyped 4,500 Y-STRs by using data from the 1000 Genomes Project and the Simons Genome Diversity Project. Next, we developed MUTEA, an algorithm that infers STR mutation rates from population-scale data by using a high-resolution SNP-based phylogeny. After extensive intrinsic and extrinsic validations, we harnessed MUTEA to derive mutation-rate estimates for 702 polymorphic STRs by tracing each locus over 222,000 meioses, resulting in the largest collection of Y-STR mutation rates to date. Using our estimates, we identified determinants of STR mutation rates and built a model to predict rates for STRs across the genome. These predictions indicate that the load of de novo STR mutations is at least 75 mutations per generation, rivaling the load of all other known variant types. Finally, we identified Y-STRs with potential applications in forensics and genetic genealogy, assessed the ability to differentiate between the Y chromosomes of father-son pairs, and imputed Y-STR genotypes.
Affiliation(s)
- Yaniv Erlich
- New York Genome Center, New York, NY 10013, USA; Whitehead Institute for Biomedical Research, 9 Cambridge Center, Cambridge, MA 02139, USA; Department of Computer Science, Fu Foundation School of Engineering, Columbia University, New York, NY 10027, USA; Center for Computational Biology and Bioinformatics, Columbia University, New York, NY 10032, USA.
4
Absence of MutSβ leads to the formation of slipped-DNA for CTG/CAG contractions at primate replication forks. DNA Repair (Amst) 2016; 42:107-18. [PMID: 27155933 DOI: 10.1016/j.dnarep.2016.04.002] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 12/01/2015] [Revised: 03/22/2016] [Accepted: 04/05/2016] [Indexed: 11/22/2022]
Abstract
Typically disease-causing CAG/CTG repeats expand, but rare affected families can display high levels of contraction of the expanded repeat amongst offspring. Understanding instability is important since arresting expansions or enhancing contractions could be clinically beneficial. The MutSβ mismatch repair complex is required for CAG/CTG expansions in mice and patients. Oddly, by unknown mechanisms MutSβ-deficient mice incur contractions instead of expansions. Replication using CTG or CAG as the lagging strand template is known to cause contractions or expansions respectively; however, the interplay between replication and repair leading to this instability remains unclear. Towards understanding how repeat contractions may arise, we performed in vitro SV40-mediated replication of repeat-containing plasmids in the presence or absence of mismatch repair. Specifically, we separated repair from replication: Replication mediated by MutSβ- and MutSα-deficient human cells or cell extracts produced slipped-DNA heteroduplexes in the contraction- but not expansion-biased replication direction. Replication in the presence of MutSβ disfavoured the retention of replication products harbouring slipped-DNA heteroduplexes. Post-replication repair of slipped-DNAs by MutSβ-proficient extracts eliminated slipped-DNAs. Thus, a MutSβ-deficiency likely enhances repeat contractions because MutSβ protects against contractions by repairing template strand slip-outs. Replication deficient in LigaseI or PCNA-interaction mutant LigaseI revealed slipped-DNA formation at lagging strands. Our results reveal that distinct mechanisms lead to expansions or contractions and support inhibition of MutSβ as a therapeutic strategy to enhance the contraction of expanded repeats.
5
Merritt BJ, Culley TM, Avanesyan A, Stokes R, Brzyski J. An empirical review: Characteristics of plant microsatellite markers that confer higher levels of genetic variation. Applications in Plant Sciences 2015; 3:apps1500025. [PMID: 26312192 PMCID: PMC4542939 DOI: 10.3732/apps.1500025] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 03/13/2015] [Accepted: 07/08/2015] [Indexed: 05/14/2023]
Abstract
During microsatellite marker development, researchers must choose from a pool of possible primer pairs to further test in their species of interest. In many cases, the goal is maximizing detectable levels of genetic variation. To guide researchers and determine which markers are associated with higher levels of genetic variation, we conducted a literature review based on 6782 genomic microsatellite markers published from 1997-2012. We examined relationships between heterozygosity (H_e or H_o) or allele number (A) with the following marker characteristics: repeat type, motif length, motif region, repeat frequency, and microsatellite size. Variation across taxonomic groups was also analyzed. There were significant differences between imperfect and perfect repeat types in A and H_e. Dinucleotide motifs exhibited significantly higher A, H_e, and H_o than most other motifs. Repeat frequency and motif region were positively correlated with A, H_e, and H_o, but correlations with microsatellite size were minimal. Higher taxonomic groups were disproportionately represented in the literature and showed little consistency. In conclusion, researchers should carefully consider marker characteristics so they can be tailored to the desired application. If researchers aim to target high genetic variation, dinucleotide motif lengths with large repeat frequencies may be best.
Affiliation(s)
- Benjamin J. Merritt
- Department of Biological Science, University of Cincinnati, 614 Rieveschl Hall, Cincinnati, Ohio 45221-0006 USA
- Theresa M. Culley
- Department of Biological Science, University of Cincinnati, 614 Rieveschl Hall, Cincinnati, Ohio 45221-0006 USA
- Alina Avanesyan
- Iowa State University, 1317 Illinois Avenue, Ames, Iowa 50014 USA
- Richard Stokes
- University of Illinois at Springfield, One University Plaza, MS HSB 224, Springfield, Illinois 62703-5407 USA
- Jessica Brzyski
- Department of Biology, Seton Hill University, 1 Seton Hill Drive, Greensburg, Pennsylvania 15601 USA
6
Merritt BJ, Culley TM, Avanesyan A, Stokes R, Brzyski J. An empirical review: Characteristics of plant microsatellite markers that confer higher levels of genetic variation. Applications in Plant Sciences 2015; 3:apps1500025. [PMID: 26312192 DOI: 10.5061/dryad.7gr39] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 03/13/2015] [Accepted: 07/08/2015] [Indexed: 05/27/2023]
Abstract
During microsatellite marker development, researchers must choose from a pool of possible primer pairs to further test in their species of interest. In many cases, the goal is maximizing detectable levels of genetic variation. To guide researchers and determine which markers are associated with higher levels of genetic variation, we conducted a literature review based on 6782 genomic microsatellite markers published from 1997-2012. We examined relationships between heterozygosity (H_e or H_o) or allele number (A) with the following marker characteristics: repeat type, motif length, motif region, repeat frequency, and microsatellite size. Variation across taxonomic groups was also analyzed. There were significant differences between imperfect and perfect repeat types in A and H_e. Dinucleotide motifs exhibited significantly higher A, H_e, and H_o than most other motifs. Repeat frequency and motif region were positively correlated with A, H_e, and H_o, but correlations with microsatellite size were minimal. Higher taxonomic groups were disproportionately represented in the literature and showed little consistency. In conclusion, researchers should carefully consider marker characteristics so they can be tailored to the desired application. If researchers aim to target high genetic variation, dinucleotide motif lengths with large repeat frequencies may be best.
Affiliation(s)
- Benjamin J Merritt
- Department of Biological Science, University of Cincinnati, 614 Rieveschl Hall, Cincinnati, Ohio 45221-0006 USA
- Theresa M Culley
- Department of Biological Science, University of Cincinnati, 614 Rieveschl Hall, Cincinnati, Ohio 45221-0006 USA
- Alina Avanesyan
- Iowa State University, 1317 Illinois Avenue, Ames, Iowa 50014 USA
- Richard Stokes
- University of Illinois at Springfield, One University Plaza, MS HSB 224, Springfield, Illinois 62703-5407 USA
- Jessica Brzyski
- Department of Biology, Seton Hill University, 1 Seton Hill Drive, Greensburg, Pennsylvania 15601 USA
7
Kwong M, Pemberton TJ. Sequence differences at orthologous microsatellites inflate estimates of human-chimpanzee differentiation. BMC Genomics 2014; 15:990. [PMID: 25407736 PMCID: PMC4253012 DOI: 10.1186/1471-2164-15-990] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 06/02/2014] [Accepted: 10/30/2014] [Indexed: 02/06/2023]
Abstract
Background: Microsatellites, contiguous arrays of 2–6 base-pair motifs, have formed the cornerstone of population-genetic studies for over two decades. Their genotype data typically takes the form of PCR fragment lengths obtained using locus-specific primer pairs to amplify the genomic region encompassing the microsatellite. Recently, we reported a dataset of 5,795 human and 84 chimpanzee individuals with genotypes at 246 human-derived autosomal microsatellites as a resource to facilitate interspecies comparisons. A major assumption underlying this dataset is that PCR amplicons at orthologous microsatellites are commensurable between species. Results: We find this assumption to be frequently incorrect owing to discordance in microsatellite organization and variability, as well as nontrivial length imbalances caused by small species-specific indels in microsatellite flanking sequences. Converting PCR fragment lengths into the repeat numbers they represent at 138 microsatellites whose organization and variability was found to be highly similar in both species, we show that interspecies incommensurability among PCR amplicons can inflate F_ST and D_PS estimates by up to 10.6%. Separate investigations of determinants of microsatellite variability in humans and chimpanzees uncover similar patterns with mean and maximum numbers of repeats, as well as numbers and ranges of distinct alleles, all important factors in predicting heterozygosity. In contrast, across microsatellites, numbers of repeats were significantly smaller in chimpanzees than in humans, while numbers and ranges of distinct alleles were instead larger. Conclusions: Our findings have fundamental implications for interspecies comparisons using microsatellites and offer new opportunities for more accurate comparisons of patterns of human and chimpanzee genetic variation in numerous areas of application.
Affiliation(s)
- Trevor J Pemberton
- Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Manitoba, Canada.
8
Ananda G, Hile SE, Breski A, Wang Y, Kelkar Y, Makova KD, Eckert KA. Microsatellite interruptions stabilize primate genomes and exist as population-specific single nucleotide polymorphisms within individual human genomes. PLoS Genet 2014; 10:e1004498. [PMID: 25033203 PMCID: PMC4102424 DOI: 10.1371/journal.pgen.1004498] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 11/22/2013] [Accepted: 05/28/2014] [Indexed: 01/01/2023]
Abstract
Interruptions of microsatellite sequences impact genome evolution and can alter disease manifestation. However, human polymorphism levels at interrupted microsatellites (iMSs) are not known at a genome-wide scale, and the pathways for gaining interruptions are poorly understood. Using the 1000 Genomes Phase-1 variant call set, we interrogated mono-, di-, tri-, and tetranucleotide repeats up to 10 units in length. We detected ∼26,000–40,000 iMSs within each of four human population groups (African, European, East Asian, and American). We identified population-specific iMSs within exonic regions, and discovered that known disease-associated iMSs contain alleles present at differing frequencies among the populations. By analyzing longer microsatellites in primate genomes, we demonstrate that single interruptions result in a genome-wide average two- to six-fold reduction in microsatellite mutability, as compared with perfect microsatellites. Centrally located interruptions lowered mutability dramatically, by two to three orders of magnitude. Using a biochemical approach, we tested directly whether the mutability of a specific iMS is lower because of decreased DNA polymerase strand slippage errors. Modeling the adenomatous polyposis coli tumor suppressor gene sequence, we observed that a single base substitution interruption reduced strand slippage error rates five- to 50-fold, relative to a perfect repeat, during synthesis by DNA polymerases α, β, or η. Computationally, we demonstrate that iMSs arise primarily by base substitution mutations within individual human genomes. Our biochemical survey of human DNA polymerase α, β, δ, κ, and η error rates within certain microsatellites suggests that interruptions are created most frequently by low fidelity polymerases. Our combined computational and biochemical results demonstrate that iMSs are abundant in human genomes and are sources of population-specific genetic variation that may affect genome stability. 
The genome-wide identification of iMSs in human populations presented here has important implications for current models describing the impact of microsatellite polymorphisms on gene expression. Microsatellites are short tandem repeat DNA sequences located throughout the human genome that display a high degree of inter-individual variation. This characteristic makes microsatellites an attractive tool for population genetics and forensics research. Some microsatellites affect gene expression, and mutations within such microsatellites can cause disease. Interruption mutations disrupt the perfect repeated array and are frequently associated with altered disease risk, but they have not been thoroughly studied in human genomes. We identified interrupted mono-, di-, tri- and tetranucleotide MSs (iMS) within individual genomes from African, European, Asian and American population groups. We show that many iMSs, including some within disease-associated genes, are unique to a single population group. By measuring the conservation of microsatellites between human and chimpanzee genomes, we demonstrate that interruptions decrease the probability of microsatellite mutations throughout the genome. We demonstrate that iMSs arise in the human genome by single base changes within the DNA, and provide biochemical data suggesting that these stabilizing changes may be created by error-prone DNA polymerases. Our genome-wide study supports the model in which iMSs act to stabilize individual genomes, and suggests that population-specific differences in microsatellite architecture may be an avenue by which genetic ancestry impacts individual disease risk.
Affiliation(s)
- Guruprasad Ananda
- Department of Biology, Penn State University, University Park, Pennsylvania, United States of America
- Suzanne E. Hile
- Department of Pathology, Gittlen Cancer Research Foundation, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, United States of America
- Amanda Breski
- Department of Pathology, Gittlen Cancer Research Foundation, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, United States of America
- Yanli Wang
- Department of Biology, Penn State University, University Park, Pennsylvania, United States of America
- Yogeshwar Kelkar
- Department of Biology, Penn State University, University Park, Pennsylvania, United States of America
- Kateryna D. Makova
- Department of Biology, Penn State University, University Park, Pennsylvania, United States of America
- Center for Medical Genomics, Penn State University, University Park, Pennsylvania, United States of America
- Kristin A. Eckert
- Department of Pathology, Gittlen Cancer Research Foundation, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, United States of America
- Center for Medical Genomics, Penn State University, University Park, Pennsylvania, United States of America
9
Grinberg A, Biggs P, Dukkipati V, George T. Extensive intra-host genetic diversity uncovered in Cryptosporidium parvum using Next Generation Sequencing. Infection, Genetics and Evolution 2013; 15:18-24. [DOI: 10.1016/j.meegid.2012.08.017] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 05/23/2012] [Revised: 08/28/2012] [Accepted: 08/28/2012] [Indexed: 11/28/2022]
10
Gemayel R, Cho J, Boeynaems S, Verstrepen KJ. Beyond junk-variable tandem repeats as facilitators of rapid evolution of regulatory and coding sequences. Genes (Basel) 2012; 3:461-80. [PMID: 24704980 PMCID: PMC3899988 DOI: 10.3390/genes3030461] [Citation(s) in RCA: 79] [Impact Index Per Article: 6.6] [Received: 06/25/2012] [Revised: 07/19/2012] [Accepted: 07/21/2012] [Indexed: 01/19/2023]
Abstract
Copy Number Variations (CNVs) and Single Nucleotide Polymorphisms (SNPs) have been the major focus of most large-scale comparative genomics studies to date. Here, we discuss a third, largely ignored, type of genetic variation, namely changes in tandem repeat number. Historically, tandem repeats have been designated as non functional “junk” DNA, mostly as a result of their highly unstable nature. With the exception of tandem repeats involved in human neurodegenerative diseases, repeat variation was often believed to be neutral with no phenotypic consequences. Recent studies, however, have shown that as many as 10% to 20% of coding and regulatory sequences in eukaryotes contain an unstable repeat tract. Contrary to initial suggestions, tandem repeat variation can have useful phenotypic consequences. Examples include rapid variation in microbial cell surface, tuning of internal molecular clocks in flies and the dynamic morphological plasticity in mammals. As such, tandem repeats can be useful functional elements that facilitate evolvability and rapid adaptation.
Affiliation(s)
- Rita Gemayel
- Laboratory for Systems Biology, VIB, Gaston Geenslaan 1, B-3001 Heverlee, Belgium.
- Janice Cho
- Laboratory for Systems Biology, VIB, Gaston Geenslaan 1, B-3001 Heverlee, Belgium.
- Steven Boeynaems
- Laboratory for Systems Biology, VIB, Gaston Geenslaan 1, B-3001 Heverlee, Belgium.
- Kevin J Verstrepen
- Laboratory for Systems Biology, VIB, Gaston Geenslaan 1, B-3001 Heverlee, Belgium.
11
Kelkar YD, Eckert KA, Chiaromonte F, Makova KD. A matter of life or death: how microsatellites emerge in and vanish from the human genome. Genome Res 2011; 21:2038-48. [PMID: 21994250 DOI: 10.1101/gr.122937.111] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.5] [Indexed: 11/24/2022]
Abstract
Microsatellites--tandem repeats of short DNA motifs--are abundant in the human genome and have high mutation rates. While microsatellite instability is implicated in numerous genetic diseases, the molecular processes involved in their emergence and disappearance are still not well understood. Microsatellites are hypothesized to follow a life cycle, wherein they are born and expand into adulthood, until their degradation and death. Here we identified microsatellite births/deaths in human, chimpanzee, and orangutan genomes, using macaque and marmoset as outgroups. We inferred mutations causing births/deaths based on parsimony, and investigated local genomic environments affecting them. We also studied birth/death patterns within transposable elements (Alus and L1s), coding regions, and disease-associated loci. We observed that substitutions were the predominant cause for births of short microsatellites, while insertions and deletions were important for births of longer microsatellites. Substitutions were the cause for deaths of microsatellites of virtually all lengths. AT-rich L1 sequences exhibited elevated frequency of births/deaths over their entire length, while GC-rich Alus only in their 3' poly(A) tails and middle A-stretches, with differences depending on transposable element integration timing. Births/deaths were strongly selected against in coding regions. Births/deaths occurred in genomic regions with high substitution rates, protomicrosatellite content, and L1 density, but low GC content and Alu density. The majority of the 17 disease-associated microsatellites examined are evolutionarily ancient (were acquired by the common ancestor of simians). Our genome-wide investigation of microsatellite life cycle has fundamental applications for predicting the susceptibility of birth/death of microsatellites, including many disease-causing loci.
Affiliation(s)
- Yogeshwar D Kelkar
- Department of Biology, Penn State University, University Park, Pennsylvania 16802, USA
12
The in vitro fidelity of yeast DNA polymerase δ and polymerase ε holoenzymes during dinucleotide microsatellite DNA synthesis. DNA Repair (Amst) 2011; 10:497-505. [PMID: 21429821 DOI: 10.1016/j.dnarep.2011.02.003] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 09/01/2010] [Revised: 02/11/2011] [Accepted: 02/18/2011] [Indexed: 11/20/2022]
Abstract
Elucidating the sources of genetic variation within microsatellite alleles has important implications for understanding the etiology of human diseases. Mismatch repair is a well described pathway for the suppression of microsatellite instability. However, the cellular polymerases responsible for generating microsatellite errors have not been fully described. We address this gap in knowledge by measuring the fidelity of recombinant yeast polymerase δ (Pol δ) and ε (Pol ε) holoenzymes during synthesis of a [GT/CA] microsatellite. The in vitro HSV-tk forward assay was used to measure DNA polymerase errors generated during gap-filling of complementary GT(10) and CA(10)-containing substrates and ∼90 nucleotides of HSV-tk coding sequence surrounding the microsatellites. The observed mutant frequencies within the microsatellites were 4 to 30-fold higher than the observed mutant frequencies within the coding sequence. More specifically, the rate of Pol δ and Pol ε misalignment-based insertion/deletion errors within the microsatellites was ∼1000-fold higher than the rate of insertion/deletion errors within the HSV-tk gene. Although the most common microsatellite error was the deletion of a single repeat unit, ∼20% of errors were deletions of two or more units for both polymerases. The differences in fidelity for wild type enzymes and their exonuclease-deficient derivatives were ∼2-fold for unit-based microsatellite insertion/deletion errors. Interestingly, the exonucleases preferentially removed potentially stabilizing interruption errors within the microsatellites. Since Pol δ and Pol ε perform not only the bulk of DNA replication in eukaryotic cells but also are implicated in performing DNA synthesis associated with repair and recombination, these results indicate that microsatellite errors may be introduced into the genome during multiple DNA metabolic pathways.
13
Rorick MM, Wagner GP. The origin of conserved protein domains and amino acid repeats via adaptive competition for control over amino acid residues. J Mol Evol 2010; 70:29-43. [PMID: 20024539 PMCID: PMC3368225 DOI: 10.1007/s00239-009-9305-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 07/15/2009] [Accepted: 11/18/2009] [Indexed: 10/20/2022]
Abstract
Some proteins, such as homeodomain transcription factors, contain highly conserved regions of sequence. It has recently been suggested that multiple functional domains overlap in the homeodomain, together explaining this high conservation. However, the question remains why so many functional domains cluster together in one relatively small and constrained region of the protein. Here we have modeled an evolutionary mechanism that can produce this kind of clustering: conserved functional domains are displaced from the parts of the molecule that are undergoing adaptive evolution because novel functions generally out-compete conserved functions for control over the identity of amino acid residues. We call this model COAA, for Competition Over Amino Acids. We also studied the evolution of amino acid repeats (a.k.a. homopeptides), which are especially prevalent in transcription factors. Repeats that are encoded by non-homogenous mixtures of synonymous codons cannot be explained by replication slippage alone. Our model provides two explanations for their origin, maintenance, and over-representation in highly conserved proteins. We demonstrate that either competition between multiple functional domains for space within a sequence, or reuse of a sequence for many functions over time, can cause the evolution of amino acid repeats. Both of these processes are characteristic of multifunctional proteins such as homeodomain transcription factors. We conclude that the COAA model can explain two widely recognized features of transcription factor proteins: conserved domains and a tendency to accumulate homopeptides.
Affiliation(s)
- Mary M Rorick
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520-8106, USA.
|
14
|
Pemberton TJ, Sandefur CI, Jakobsson M, Rosenberg NA. Sequence determinants of human microsatellite variability. BMC Genomics 2009; 10:612. [PMID: 20015383 PMCID: PMC2806349 DOI: 10.1186/1471-2164-10-612] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Received: 05/07/2009] [Accepted: 12/16/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Microsatellite loci are frequently used in genomic studies of DNA sequence repeats and in population studies of genetic variability. To investigate the effect of sequence properties of microsatellites on their level of variability we have analyzed genotypes at 627 microsatellite loci in 1,048 worldwide individuals from the HGDP-CEPH cell line panel together with the DNA sequences of these microsatellites in the human RefSeq database. RESULTS Calibrating PCR fragment lengths in individual genotypes by using the RefSeq sequence enabled us to infer repeat number in the HGDP-CEPH dataset and to calculate the mean number of repeats (as opposed to the mean PCR fragment length), under the assumption that differences in PCR fragment length reflect differences in the numbers of repeats in the embedded repeat sequences. We find the mean and maximum numbers of repeats across individuals to be positively correlated with heterozygosity. The size and composition of the repeat unit of a microsatellite are also important factors in predicting heterozygosity, with tetra-nucleotide repeat units high in G/C content leading to higher heterozygosity. Finally, we find that microsatellites containing more separate sets of repeated motifs generally have higher heterozygosity. CONCLUSIONS These results suggest that sequence properties of microsatellites have a significant impact in determining the features of human microsatellite variability.
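The calibration step described in the abstract above (converting PCR fragment lengths into repeat numbers using a reference sequence) can be illustrated with a small sketch. This is only a reading of the stated assumption that fragment-length differences reflect differences in repeat number; the function names and the example values are hypothetical and are not taken from the paper.

```python
def infer_repeat_number(fragment_len_bp, ref_fragment_len_bp,
                        ref_repeat_count, unit_size_bp):
    """Infer a repeat count from a PCR fragment length.

    Assumes (as the abstract does) that any difference from the reference
    fragment length is due only to a change in the number of repeat units.
    """
    length_difference = fragment_len_bp - ref_fragment_len_bp
    return ref_repeat_count + length_difference / unit_size_bp


def mean_repeat_number(fragment_lens_bp, ref_fragment_len_bp,
                       ref_repeat_count, unit_size_bp):
    """Mean inferred repeat count across a list of genotyped fragment lengths."""
    counts = [
        infer_repeat_number(length, ref_fragment_len_bp,
                            ref_repeat_count, unit_size_bp)
        for length in fragment_lens_bp
    ]
    return sum(counts) / len(counts)


# Hypothetical tetranucleotide locus: the reference amplicon is 200 bp
# and contains 12 repeat units.
print(infer_repeat_number(208, 200, 12, 4))
print(mean_repeat_number([200, 204, 208], 200, 12, 4))
```

A fractional result would signal an indel outside the repeat array or a sizing error, which is exactly where the stated assumption breaks down.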
Affiliation(s)
- Trevor J Pemberton
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA.
|
15
|
Increased number of glutamine repeats in the C-terminal of Candida albicans Rlm1p enhances the resistance to stress agents. Antonie Van Leeuwenhoek 2009; 96:395-404. [PMID: 19484503 DOI: 10.1007/s10482-009-9352-5] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 02/13/2009] [Accepted: 05/14/2009] [Indexed: 10/20/2022]
Abstract
The highly polymorphic microsatellite CAI described for Candida albicans genotyping was found to be located within the RLM1 gene which codes for a transcription factor from the MADS box family that, in Saccharomyces cerevisiae, is known to regulate the expression of genes involved in the cell wall integrity pathway. The aim of this work was to study CAI genetic variability in a wide group of C. albicans isolates and determine the response of genetic variants to cell wall damaging stress agents. One hundred twenty-three C. albicans isolates were genotyped with CAI microsatellite (CAA/G)(n), and 35 alleles were found with repeat units varying from 11 to 49. Alleles with less than 29 repetitions were the most frequent, while the longer ones were underrepresented and had a more complex internal structure. Combinations of RLM1 alleles generated 66 different genotypes. Significant differences (P < 0.05) in the susceptibility patterns to menadione, hydrogen peroxide, SDS, acetic acid, and CFW, stress agents affecting cell integrity, were found between strains harbouring alleles ranging from 17 to 28 repetitions and strains with longer alleles, suggesting that an increased number of repetitive units in the C. albicans RLM1 gene could be related to stress response.
|
16
|
Treangen TJ, Abraham AL, Touchon M, Rocha EPC. Genesis, effects and fates of repeats in prokaryotic genomes. FEMS Microbiol Rev 2009; 33:539-71. [PMID: 19396957 DOI: 10.1111/j.1574-6976.2009.00169.x] [Citation(s) in RCA: 110] [Impact Index Per Article: 7.3] [Indexed: 02/06/2023] Open
Abstract
DNA repeats are causes and consequences of genome plasticity. Repeats are created by intrachromosomal recombination or horizontal transfer. They are targeted by recombination processes leading to amplifications, deletions and rearrangements of genetic material. The identification and analysis of repeats in nearly 700 genomes of bacteria and archaea is facilitated by the existence of sequence data and adequate bioinformatic tools. These have revealed the immense diversity of repeats in genomes, from those created by selfish elements to the ones used for protection against selfish elements, from those arising from transient gene amplifications to the ones leading to stable duplications. Experimental works have shown that some repeats do not carry any adaptive value, while others allow functional diversification and increased expression. All repeats carry some potential to disorganize and destabilize genomes. Because recombination and selection for repeats vary between genomes, the number and types of repeats are also quite diverse and in line with ecological variables, such as host-dependent associations or population sizes, and with genetic variables, such as the recombination machinery. From an evolutionary point of view, repeats represent both opportunities and problems. We describe how repeats are created and how they can be found in genomes. We then focus on the functional and genomic consequences of repeats that dictate their fate.
|
17
|
Eckert KA, Hile SE. Every microsatellite is different: Intrinsic DNA features dictate mutagenesis of common microsatellites present in the human genome. Mol Carcinog 2009; 48:379-88. [PMID: 19306292 DOI: 10.1002/mc.20499] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.4] [Indexed: 01/06/2023]
Abstract
Microsatellite sequences are ubiquitous in the human genome and are important regulators of genome function. Here, we examine the mutational mechanisms governing the stability of highly abundant mono-, di-, and tetranucleotide microsatellites. Microsatellite mutation rate estimates from pedigree analyses and experimental models range from a low of approximately 10(-6) to a high of approximately 10(-2) mutations per locus per generation. The vast majority of observed mutational variation can be attributed to features intrinsic to the allele itself, including motif size, length, and sequence composition. A greater than linear relationship between motif length and mutagenesis has been observed in several model systems. Motif sequence differences contribute up to 10-fold to the variation observed in human cell mutation rates. The major mechanism of microsatellite mutagenesis is strand slippage during DNA synthesis. DNA polymerases produce errors within microsatellites at a frequency that is 10- to 100-fold higher than the frequency of frameshifts in coding sequences. Motif sequence significantly affects both polymerase error rate and specificity, resulting in strand biases within complementary microsatellites. Importantly, polymerase errors within microsatellites include base substitutions, deletions, and complex mutations, all of which produced interrupted alleles from pure microsatellites. Postreplication mismatch repair efficiency is affected by microsatellite motif size and sequence, also contributing to the observed variation in microsatellite mutagenesis. Inhibition of DNA synthesis within common microsatellites is highly sequence-dependent, and is positively correlated with the production of errors. DNA secondary structure within common microsatellites can account for some DNA polymerase pause sites, and may be an important factor influencing mutational specificity.
Affiliation(s)
- Kristin A Eckert
- Department of Pathology, The Jake Gittlen Cancer Research Foundation, The Pennsylvania State University College of Medicine, 500 University Drive, PA, USA
|
18
|
Eggert LS, Beadell JS, McClung A, McIntosh CE, Fleischer RC. Evolution of microsatellite loci in the adaptive radiation of Hawaiian honeycreepers. J Hered 2009; 100:137-47. [PMID: 19153085 DOI: 10.1093/jhered/esn111] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022] Open
Abstract
Previous studies have examined germ-line mutations to infer the processes that generate and maintain variability in microsatellite loci. Few studies, however, have examined patterns to infer processes that act on microsatellite loci over evolutionary time. Here, we examine changes in 8 dinucleotide loci across the adaptive radiation of Hawaiian honeycreepers. The loci were found to be highly variable across the radiation, and we did not detect ascertainment bias with respect to allelic diversity or allele size ranges. In examining patterns at the sequence level, we found that changes in flanking regions, repeat motifs, or repeat interruptions were often shared between closely related species and may be phylogenetically informative. Genetic distance measures based on microsatellites were strongly correlated with those based on mitochondrial DNA (mtDNA) sequences as well as with divergence time up to 3 My. Phylogenetic inferences based on microsatellite genetic distances consistently recovered 2 of the 4 honeycreeper clades observed in a tree based on mtDNA sequences but differed from the mtDNA tree in the relationships among clades. Our results confirm that microsatellite loci may be conserved over evolutionary time, making them useful in population-level studies of species that diverged from the species in which they were characterized as long as 5 Ma. Despite this, we found that their use in phylogenetic inference was limited to closely related honeycreeper species.
Affiliation(s)
- Lori S Eggert
- Center for Conservation and Evolutionary Genetics, National Zoological Park and National Museum of Natural History, Smithsonian Institution, 3001 Connecticut Avenue NW, Washington, DC 20008, USA.
|
19
|
Richard GF, Kerrest A, Dujon B. Comparative genomics and molecular dynamics of DNA repeats in eukaryotes. Microbiol Mol Biol Rev 2008; 72:686-727. [PMID: 19052325 PMCID: PMC2593564 DOI: 10.1128/mmbr.00011-08] [Citation(s) in RCA: 323] [Impact Index Per Article: 20.2] [Indexed: 12/20/2022] Open
Abstract
Repeated elements can be widely abundant in eukaryotic genomes, composing more than 50% of the human genome, for example. It is possible to classify repeated sequences into two large families, "tandem repeats" and "dispersed repeats." Each of these two families can be itself divided into subfamilies. Dispersed repeats contain transposons, tRNA genes, and gene paralogues, whereas tandem repeats contain gene tandems, ribosomal DNA repeat arrays, and satellite DNA, itself subdivided into satellites, minisatellites, and microsatellites. Remarkably, the molecular mechanisms that create and propagate dispersed and tandem repeats are specific to each class and usually do not overlap. In the present review, we have chosen in the first section to describe the nature and distribution of dispersed and tandem repeats in eukaryotic genomes in the light of complete (or nearly complete) available genome sequences. In the second part, we focus on the molecular mechanisms responsible for the fast evolution of two specific classes of tandem repeats: minisatellites and microsatellites. Given that a growing number of human neurological disorders involve the expansion of a particular class of microsatellites, called trinucleotide repeats, a large part of the recent experimental work on microsatellites has focused on these particular repeats, and thus we also review the current knowledge in this area. Finally, we propose a unified definition for mini- and microsatellites that takes into account their biological properties and try to point out new directions that should be explored in a near future on our road to understanding the genetics of repeated sequences.
Affiliation(s)
- Guy-Franck Richard
- Institut Pasteur, Unité de Génétique Moléculaire des Levures, CNRS, URA2171, Université Pierre et Marie Curie, UFR927, 25 rue du Dr. Roux, F-75015, Paris, France.
|
20
|
Väli Ü, Einarsson A, Waits L, Ellegren H. To what extent do microsatellite markers reflect genome-wide genetic diversity in natural populations? Mol Ecol 2008; 17:3808-17. [DOI: 10.1111/j.1365-294x.2008.03876.x] [Citation(s) in RCA: 196] [Impact Index Per Article: 12.3] [Indexed: 12/15/2022]
|
21
|
Karaiskou N, Buggiotti L, Leder E, Primmer CR. High degree of transferability of 86 newly developed zebra finch EST-linked microsatellite markers in 8 bird species. J Hered 2008; 99:688-93. [PMID: 18583388 DOI: 10.1093/jhered/esn052] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Indexed: 11/14/2022]
Abstract
High-resolution analysis for population genetic and functional studies requires the use of large numbers of polymorphic markers. The recent increase of available genetic tools is facilitated by the use of publicly available expressed sequence tag (EST) sequence databases that are a valuable resource for identifying gene-linked markers. In the present study, we applied bioinformatics analyses to identify microsatellite markers present in EST sequences from a zebra finch (Taeniopygia guttata) EST database and we explored the success of cross-species amplification of EST-linked microsatellite markers in 7 passerine and 1 nonpasserine species. Eighty-six zebra finch EST-linked microsatellite loci were screened for polymorphism, revealing a high amplification success rate and adequate levels of polymorphism (33.3-51%) for relatively closely related species, whereas success decreased in the species most distantly related to zebra finch. EST-linked microsatellites appear to be more highly transferable between taxa than anonymous microsatellites, as they revealed higher amplification and polymorphism success between different families, indicating that they will be a useful source of gene-linked polymorphic markers in a broad range of avian species.
Affiliation(s)
- Nikoletta Karaiskou
- Department of Genetics, Development and Molecular Biology, School of Biology, Aristotle University of Thessaloniki, PO Box 54 124, Thessaloniki, Macedonia, Greece
|
22
|
Brandström M, Ellegren H. Genome-wide analysis of microsatellite polymorphism in chicken circumventing the ascertainment bias. Genome Res 2008; 18:881-7. [PMID: 18356314 DOI: 10.1101/gr.075242.107] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.9] [Indexed: 01/01/2023]
Abstract
Studies of microsatellite evolution based on marker data almost inherently suffer from an ascertainment bias because there is selection for the most mutable and polymorphic loci during marker development. To circumvent this bias we took advantage of whole-genome shotgun sequence data from three unrelated chicken individuals that, when aligned to the genome reference sequence, give sequence information on two chromosomes from about one-fourth (375,000) of all microsatellite loci containing di- through pentanucleotide repeat motifs in the chicken genome. Polymorphism is seen at loci with as few as five repeat units, and the proportion of dimorphic loci then increases to 50% for sequences with approximately 10 repeat units, to reach a maximum of 75%-80% for sequences with 15 or more repeat units. For any given repeat length, polymorphism increases with decreasing GC content of repeat motifs for dinucleotides, nonhairpin-forming trinucleotides, and tetranucleotides. For trinucleotide repeats that are likely to form hairpin structures, polymorphism increases with increasing GC content, indicating that the relative stability of hairpins affects the rate of replication slippage. For any given repeat length, polymorphism is significantly lower for imperfect compared to perfect repeats, and repeat interruptions occur in >15% of loci. However, interruptions are not randomly distributed within repeat arrays but are preferentially located toward the ends. There is a negative correlation between microsatellite abundance and single nucleotide polymorphism (SNP) density, providing large-scale genomic support for the hypothesis that equilibrium microsatellite distributions are governed by a balance between rate of replication slippage and rate of point mutation.
Affiliation(s)
- Mikael Brandström
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, SE-752 36 Uppsala, Sweden
|
23
|
Boyer JC, Hawk JD, Stefanovic L, Farber RA. Sequence-dependent effect of interruptions on microsatellite mutation rate in mismatch repair-deficient human cells. Mutat Res 2007; 640:89-96. [PMID: 18242644 DOI: 10.1016/j.mrfmmm.2007.12.005] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 10/10/2007] [Revised: 11/21/2007] [Accepted: 12/11/2007] [Indexed: 11/18/2022]
Abstract
Although microsatellite mutation rates generally increase with increasing length of the repeat tract, interruptions in a microsatellite may stabilize it. We have performed a direct analysis of the effect of microsatellite interruptions on mutation rate and spectrum in cultured mammalian cells. Two mononucleotide sequences (G(17) and A(17)) and a dinucleotide [(CA)(17)] were compared with interrupted repeats of the same size and with sequences of 8 repeat units. MMR-deficient (MMR(-)) cells were used for these studies to eliminate effects of this repair process. Mutation rates were determined by fluctuation analysis on cells containing a microsatellite sequence at the 5' end of an antibiotic-resistance gene; the vector carrying this sequence was integrated in the genome of the cells. In general, interrupted sequences had lower mutation rates than perfect ones of the same size, but the magnitude of the difference was dependent upon the sequence of the interrupting base(s). Some interrupted repeats had mutation rates that were lower than those of perfect sequences of the same length but similar to those of half the length. This suggests that interrupting bases effectively divide microsatellites into smaller repeat runs with mutational characteristics different from those of the corresponding full-length microsatellite. We conclude that interruptions decrease microsatellite mutation rate and influence the spectrum of frameshift mutations. The sequence of the interrupting base(s) determines the magnitude of the effect on mutation rate.
Affiliation(s)
- Jayne C Boyer
- Department of Pathology and Laboratory Medicine, University of North Carolina at Chapel Hill, CB #7525, Chapel Hill, NC 27599, United States.
|
24
|
Hile SE, Eckert KA. DNA polymerase kappa produces interrupted mutations and displays polar pausing within mononucleotide microsatellite sequences. Nucleic Acids Res 2007; 36:688-96. [PMID: 18079151 PMCID: PMC2241860 DOI: 10.1093/nar/gkm1089] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.4] [Indexed: 01/23/2023] Open
Abstract
Microsatellites are ubiquitously present in eukaryotic genomes and are implicated as positive factors in evolution. At the nucleotide level, microsatellites undergo slippage events that alter allele length and base changes that interrupt the repetitive tract. We examined DNA polymerase errors within a [T]11 microsatellite using an in vitro assay that preferentially detects mutations other than unit changes. We observed that human DNA polymerase kappa (Pol κ) inserts dGMP and dCMP within the [T]11 mononucleotide repeat, producing an interrupted 12-bp allele. Polymerase β produced such interruptions at a lower frequency. These data demonstrate that DNA polymerases are capable of directly producing base interruptions within microsatellites. At the molecular level, expanded microsatellites have been implicated in DNA replication fork stalling. Using an in vitro primer extension assay, we observed sequence-specific synthesis termination by DNA polymerases within mononucleotides. Quantitatively, intense, polar pausing was observed for both Pol κ and polymerase α-primase within a [T]11 allele. A mechanism is proposed in which pausing results from DNA bending within the duplex stem of the nascent DNA. Our data support the concept of a microsatellite life-cycle, and are consistent with models in which DNA sequence or secondary structure contributes to non-uniform rates of replication fork progression.
Affiliation(s)
- Suzanne E Hile
- Department of Pathology, Gittlen Cancer Research Foundation, The Pennsylvania State University College of Medicine, 500 University Drive, Hershey, PA 17033, USA
|
25
|
McConnell R, Middlemist S, Scala C, Strassmann JE, Queller DC. An unusually low microsatellite mutation rate in Dictyostelium discoideum, an organism with unusually abundant microsatellites. Genetics 2007; 177:1499-507. [PMID: 17947436 PMCID: PMC2147952 DOI: 10.1534/genetics.107.076067] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Received: 05/18/2007] [Accepted: 09/04/2007] [Indexed: 01/13/2023] Open
Abstract
The genome of the social amoeba Dictyostelium discoideum is known to have a very high density of microsatellite repeats, including thousands of triplet microsatellite repeats in coding regions that apparently code for long runs of single amino acids. We used a mutation accumulation study to see if unusually high microsatellite mutation rates contribute to this pattern. There was a modest bias toward mutations that increase repeat number, but because upward mutations were smaller than downward ones, this did not lead to a net average increase in size. Longer microsatellites had higher mutation rates than shorter ones, but did not show greater directional bias. The most striking finding is that the overall mutation rate is the lowest reported for microsatellites: approximately 1 x 10(-6) for 10 dinucleotide loci and 6 x 10(-6) for 52 trinucleotide loci (which were longer). High microsatellite mutation rates therefore do not explain the high incidence of microsatellites. The causal relation may in fact be reversed, with low mutation rates evolving to protect against deleterious fitness effects of mutation at the numerous microsatellites.
Affiliation(s)
- Ryan McConnell
- Department of Ecology and Evolutionary Biology, Rice University, Houston, Texas 77005, USA
|
26
|
Butland SL, Devon RS, Huang Y, Mead CL, Meynert AM, Neal SJ, Lee SS, Wilkinson A, Yang GS, Yuen MMS, Hayden MR, Holt RA, Leavitt BR, Ouellette BFF. CAG-encoded polyglutamine length polymorphism in the human genome. BMC Genomics 2007; 8:126. [PMID: 17519034 PMCID: PMC1896166 DOI: 10.1186/1471-2164-8-126] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.9] [Received: 10/23/2006] [Accepted: 05/22/2007] [Indexed: 11/10/2022] Open
Abstract
Background
Expansion of polyglutamine-encoding CAG trinucleotide repeats has been identified as the pathogenic mutation in nine different genes associated with neurodegenerative disorders. The majority of individuals clinically diagnosed with spinocerebellar ataxia do not have mutations within known disease genes, and it is likely that additional ataxias or Huntington disease-like disorders will be found to be caused by this common mutational mechanism. We set out to determine the length distributions of CAG-polyglutamine tracts for the entire human genome in a set of healthy individuals in order to characterize the nature of polyglutamine repeat length variation across the human genome, to establish the background against which pathogenic repeat expansions can be detected, and to prioritize candidate genes for repeat expansion disorders.
Results
We found that repeats, including those in known disease genes, have unique distributions of glutamine tract lengths, as measured by fragment analysis of PCR-amplified repeat regions. This emphasizes the need to characterize each distribution and avoid making generalizations between loci. The best predictors of known disease genes were occurrence of a long CAG-tract uninterrupted by CAA codons in their reference genome sequence, and high glutamine tract length variance in the normal population. We used these parameters to identify eight priority candidate genes for polyglutamine expansion disorders. Twelve CAG-polyglutamine repeats were invariant and these can likely be excluded as candidates. We outline some confusion in the literature about this type of data, difficulties in comparing such data between publications, and its application to studies of disease prevalence in different populations. Analysis of Gene Ontology-based functions of CAG-polyglutamine-containing genes provided a visual framework for interpretation of these genes' functions. All nine known disease genes were involved in DNA-dependent regulation of transcription or in neurogenesis, as were all of the well-characterized priority candidate genes.
Conclusion
This publication makes freely available the normal distributions of CAG-polyglutamine repeats in the human genome. Using these background distributions, against which pathogenic expansions can be identified, we have begun screening for mutations in individuals clinically diagnosed with novel forms of spinocerebellar ataxia or Huntington disease-like disorders who do not have identified mutations within the known disease-associated genes.
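One of the predictors named in the Results above, a long CAG tract uninterrupted by CAA codons, can be made concrete with a short sketch. It compares the longest pure CAG run with the full glutamine tract (CAG or CAA) in an in-frame coding sequence. The function names and the toy sequence are hypothetical; the sketch assumes the input starts in frame and is not the authors' pipeline.

```python
def _codons(coding_seq):
    """Split an in-frame coding sequence into upper-case codons."""
    seq = coding_seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def _longest_run(codons, allowed):
    """Length of the longest run of consecutive codons drawn from `allowed`."""
    best = run = 0
    for codon in codons:
        run = run + 1 if codon in allowed else 0
        best = max(best, run)
    return best


def longest_pure_cag_run(coding_seq):
    """Longest run of CAG codons not interrupted by CAA or any other codon."""
    return _longest_run(_codons(coding_seq), {"CAG"})


def longest_glutamine_tract(coding_seq):
    """Longest run of glutamine codons, counting both CAG and CAA."""
    return _longest_run(_codons(coding_seq), {"CAG", "CAA"})


# Toy example: 5 CAG, one CAA interruption, then 3 CAG.
toy = "ATG" + "CAG" * 5 + "CAA" + "CAG" * 3 + "CCG"
print(longest_pure_cag_run(toy), longest_glutamine_tract(toy))
```

In the toy sequence the encoded glutamine tract is nine residues long, but the single CAA splits the pure CAG run into five and three codons, which is the distinction the predictor relies on.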
|
27
|
Doxiadis GGM, de Groot N, Claas FHJ, Doxiadis IIN, van Rood JJ, Bontrop RE. A highly divergent microsatellite facilitating fast and accurate DRB haplotyping in humans and rhesus macaques. Proc Natl Acad Sci U S A 2007; 104:8907-12. [PMID: 17502594 PMCID: PMC1868589 DOI: 10.1073/pnas.0702964104] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.5] [Indexed: 11/18/2022] Open
Abstract
The DRB region of the MHC in primate species is known to display abundant region configuration polymorphism with regard to the number and content of genes present per haplotype. Furthermore, depending on the species studied, the different DRB genes themselves may display varying degrees of allelic polymorphism. Because of this combination of diversity (differential gene number) and polymorphism (allelic variation), molecular typing methods for the primate DRB region are cumbersome. All intact DRB genes present in humans and rhesus macaques appear to possess, however, a complex and highly divergent microsatellite. Microsatellite analysis of a sizeable panel of outbred rhesus macaques, covering most of the known Mamu-DRB haplotypes, resulted in the definition of unique genotyping patterns that appear to be specific for a given haplotype. Subsequent examination of a representative panel of human cells illustrated that this approach also facilitates high-resolution HLA-DRB typing in an easy, quick, and reproducible fashion. The genetic composition of this complex microsatellite is shown to be in concordance with the phylogenetic relationships of various HLA-DRB and Mamu-DRB exon 2 gene/lineage sequences. Moreover, its length variability segregates with allelic variation of the respective gene. This simple protocol may find application in a variety of research avenues such as transplantation biology, disease association studies, molecular ecology, paternity testing, and forensic medicine.
Affiliation(s)
- Gaby G. M. Doxiadis
- Department of Comparative Genetics and Refinement, Biomedical Primate Research Centre, P.O. Box 3306, 2280 GH, Rijswijk, The Netherlands
- To whom correspondence may be addressed.
- Nanine de Groot
- Department of Comparative Genetics and Refinement, Biomedical Primate Research Centre, P.O. Box 3306, 2280 GH, Rijswijk, The Netherlands
- Frans H. J. Claas
- Department of Immunohematology and Blood Transfusion, Leiden University Medical Centre, E3-Q, P.O. Box 9600, 2300 RC, Leiden, The Netherlands
- Ilias I. N. Doxiadis
- Department of Immunohematology and Blood Transfusion, Leiden University Medical Centre, E3-Q, P.O. Box 9600, 2300 RC, Leiden, The Netherlands
- Jon J. van Rood
- Department of Immunohematology and Blood Transfusion, Leiden University Medical Centre, E3-Q, P.O. Box 9600, 2300 RC, Leiden, The Netherlands
- To whom correspondence may be addressed.
- Ronald E. Bontrop
- Department of Comparative Genetics and Refinement, Biomedical Primate Research Centre, P.O. Box 3306, 2280 GH, Rijswijk, The Netherlands
|
28
|
Leclercq S, Rivals E, Jarne P. Detecting microsatellites within genomes: significant variation among algorithms. BMC Bioinformatics 2007; 8:125. [PMID: 17442102 PMCID: PMC1876248 DOI: 10.1186/1471-2105-8-125] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.9] [Received: 12/07/2006] [Accepted: 04/18/2007] [Indexed: 11/25/2022] Open
Abstract
Background Microsatellites are short, tandemly-repeated DNA sequences which are widely distributed among genomes. Their structure, role and evolution can be analyzed based on exhaustive extraction from sequenced genomes. Several dedicated algorithms have been developed for this purpose. Here, we compared the detection efficiency of five of them (TRF, Mreps, Sputnik, STAR, and RepeatMasker). Results Our analysis was first conducted on the human X chromosome, and microsatellite distributions were characterized by microsatellite number, length, and divergence from a pure motif. The algorithms work with user-defined parameters, and we demonstrate that the parameter values chosen can strongly influence microsatellite distributions. The five algorithms were then compared by fixing parameter settings, and the analysis was extended to three other genomes (Saccharomyces cerevisiae, Neurospora crassa and Drosophila melanogaster) spanning a wide range of size and structure. Significant differences for all characteristics of microsatellites were observed among algorithms, but not among genomes, for both perfect and imperfect microsatellites. Striking differences were detected for short microsatellites (below 20 bp), regardless of motif. Conclusion Since the algorithm used strongly influences empirical distributions, studies analyzing microsatellite evolution based on a comparison between empirical and theoretical size distributions should therefore be considered with caution. We also discuss why a typological definition of microsatellites limits our capacity to capture their genomic distributions.
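To illustrate why user-defined parameters matter, the following is a minimal sketch of a perfect-repeat scanner. It is not the method used by TRF, Mreps, Sputnik, STAR, or RepeatMasker; the function name `find_perfect_microsatellites` and the parameters `min_len` and `max_motif` are assumptions made for this example, and imperfect repeats are ignored entirely.

```python
import re

def find_perfect_microsatellites(seq, min_len=12, max_motif=6):
    """Return sorted (start, end, motif) tuples for perfect tandem repeats.

    Hypothetical helper for illustration only: a repeat is reported when a
    motif of 1..max_motif bp occurs at least twice in tandem and the whole
    tract spans at least min_len bp.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        # A k-mer followed by one or more exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1+" % k)
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "CACA" = "CA" x 2),
            # since the shorter motif already reports the same tract.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            if m.end() - m.start() >= min_len:
                hits.append((m.start(), m.end(), motif))
    return sorted(hits)

if __name__ == "__main__":
    # Toy sequence (invented): a (CA)10 tract and a (CAG)4 tract.
    toy = "GGG" + "CA" * 10 + "TTA" + "CAG" * 4 + "T"
    for threshold in (2, 12, 20):
        found = find_perfect_microsatellites(toy, min_len=threshold)
        print(threshold, len(found), found)
```

Changing only the length threshold changes how many loci are reported, which is the kind of parameter sensitivity the abstract describes, most visibly for short tracts.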
Affiliation(s)
- Sébastien Leclercq
- LIRMM, UMR 5506 CNRS – Université de Montpellier II, 161 rue Ada, Montpellier, France
- CEFE, UMR 5175 CNRS – Université de Montpellier II, 1919 route de Mende, Montpellier, France
- Eric Rivals
- LIRMM, UMR 5506 CNRS – Université de Montpellier II, 161 rue Ada, Montpellier, France
- Philippe Jarne
- CEFE, UMR 5175 CNRS – Université de Montpellier II, 1919 route de Mende, Montpellier, France
|
29
|
Slate J, Hale MC, Birkhead TR. Simple sequence repeats in zebra finch (Taeniopygia guttata) expressed sequence tags: a new resource for evolutionary genetic studies of passerines. BMC Genomics 2007; 8:52. [PMID: 17300727 PMCID: PMC1804275 DOI: 10.1186/1471-2164-8-52] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2006] [Accepted: 02/14/2007] [Indexed: 11/10/2022] Open
Abstract
Background Passerines (perching birds) are widely studied across many biological disciplines including ecology, population biology, neurobiology, behavioural ecology and evolutionary biology. However, understanding the molecular basis of relevant traits is hampered by the paucity of passerine genomics tools. Efforts to address this problem are underway, and the zebra finch (Taeniopygia guttata) will be the first passerine to have its genome sequenced. Here we describe a bioinformatic analysis of zebra finch expressed sequence tag (EST) GenBank entries. Results A total of 48,862 ESTs were downloaded from GenBank and assembled into contigs, representing an estimated 17,404 unique sequences. The unique sequence set contained 638 simple sequence repeats (SSRs) or microsatellites of length ≥20 bp and purity ≥90% and 144 simple sequence repeats of length ≥30 bp. A chromosomal location for the majority of SSRs was predicted by BLASTing against assembly 2.1 of the chicken genome sequence. The relative exonic location (5' untranslated region, coding region or 3' untranslated region) was predicted for 218 of the SSRs, by BLAST search against the ENSEMBL chicken peptide database. Ten loci were examined for polymorphism in two zebra finch populations and two populations of a distantly related passerine, the house sparrow Passer domesticus. Linkage was confirmed for four loci that were predicted to reside on the passerine homologue of chicken chromosome 7. Conclusion We show that SSRs are abundant within zebra finch ESTs, and that their genomic location can be predicted from sequence similarity with the assembled chicken genome sequence. We demonstrate that a useful proportion of zebra finch EST-SSRs are likely to be polymorphic, and that they can be used to build a linkage map. Finally, we show that many zebra finch EST-SSRs are likely to be useful in evolutionary genetic studies of other passerines.
Affiliation(s)
- Jon Slate
- Department of Animal & Plant Sciences, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK
- Matthew C Hale
- Department of Animal & Plant Sciences, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK
- Timothy R Birkhead
- Department of Animal & Plant Sciences, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK
|
30
|
Edelist C, Lexer C, Dillmann C, Sicard D, Rieseberg LH. Microsatellite signature of ecological selection for salt tolerance in a wild sunflower hybrid species, Helianthus paradoxus. Mol Ecol 2007; 15:4623-34. [PMID: 17107488 PMCID: PMC2442927 DOI: 10.1111/j.1365-294x.2006.03112.x] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Abstract
The hybrid sunflower species Helianthus paradoxus inhabits sporadic salt marshes in New Mexico and southwest Texas, USA, whereas its parental species, Helianthus annuus and Helianthus petiolaris, are salt sensitive. Previous studies identified three genomic regions - survivorship quantitative trait loci (QTLs) - that were under strong selection in experimental hybrids transplanted into the natural habitat of H. paradoxus. Here we ask whether these same genomic regions experienced significant selection during the origin and evolution of the natural hybrid, H. paradoxus. This was accomplished by comparing the variability of microsatellites linked to the three survivorship QTLs with those from genomic regions that were neutral in the experimental hybrids. As predicted if one or more selective sweeps had occurred in these regions, microsatellites linked to the survivorship QTLs exhibited a significant reduction in diversity in populations of the natural hybrid species. In contrast, no difference in diversity levels was observed between the two microsatellite classes in parental populations.
Affiliation(s)
- Cécile Edelist
- Department of Biology, Indiana University, Bloomington, IN 47405, USA.
|
31
|
Buschiazzo E, Gemmell NJ. The rise, fall and renaissance of microsatellites in eukaryotic genomes. Bioessays 2006; 28:1040-50. [PMID: 16998838 DOI: 10.1002/bies.20470] [Citation(s) in RCA: 190] [Impact Index Per Article: 10.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]
Abstract
Microsatellites are among the most versatile of genetic markers, being used in an impressive number of biological applications. However, the evolutionary dynamics of these markers remain a source of contention. Almost 20 years after the discovery of these ubiquitous simple sequences, new genomic data are clarifying our understanding of the structure, distribution and variability of microsatellites in genomes, especially for the eukaryotes. While these new data provide a great deal of descriptive information about the nature and abundance of microsatellite sequences within eukaryotic genomes, there have been few attempts to synthesise this information to develop a global concept of evolution. This review provides an up-to-date account of the mutational processes, biases and constraints believed to be involved in the evolution of microsatellites, particularly with respect to the creation and degeneration of microsatellites, which we assert may be broadly viewed as a life cycle. In addition, we identify areas of contention that require further research and propose some possible directions for future investigation.
Affiliation(s)
- Emmanuel Buschiazzo
- School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.
|
32
|
Pardi F, Sibly RM, Wilkinson MJ, Whittaker JC. On the structural differences between markers and genomic AC microsatellites. J Mol Evol 2005; 60:688-93. [PMID: 15983876 DOI: 10.1007/s00239-004-0274-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2004] [Accepted: 09/07/2004] [Indexed: 10/25/2022]
Abstract
AC microsatellites have proved particularly useful as genetic markers. For some purposes, such as in population biology, the inferences drawn depend on the quantitative values of their mutation rates. This, together with intrinsic biological interest, has led to widespread study of microsatellite mutational mechanisms. Now, however, inconsistencies are appearing in the results of marker-based versus non-marker-based studies of mutational mechanisms. The reasons for this have not been investigated, but one possibility, pursued here, is that the differences result from structural differences between markers and genomic microsatellites. Here we report a comparison between the CEPH AC marker microsatellites and the global population of AC microsatellites in the human genome. AC marker microsatellites are longer than the global average. Controlling for length, marker microsatellites contain on average fewer interruptions, and have longer segments, than their genomic counterparts. Related to this, marker microsatellites show a greater tendency to concentrate the majority of their repeats into one segment. These differences plausibly result from scientists selecting markers for their high polymorphism. In addition to the structural differences, there are differences in the base composition of flanking sequences, marker flanking regions being richer in C and G and poorer in A and T. Our results indicate that there are profound differences between marker and genomic microsatellites that almost certainly affect their mutation rates. There is a need for a unified model of mutational mechanisms that accounts for both marker-derived and genomic observations. A suggestion is made as to how this might be done.
|
33
|
Sainudiin R, Durrett RT, Aquadro CF, Nielsen R. Microsatellite mutation models: insights from a comparison of humans and chimpanzees. Genetics 2005; 168:383-95. [PMID: 15454551 PMCID: PMC1448085 DOI: 10.1534/genetics.103.022665] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Using genomic data from homologous microsatellite loci of pure AC repeats in humans and chimpanzees, several models of microsatellite evolution are tested and compared using likelihood-ratio tests and the Akaike information criterion. A proportional-rate, linear-biased, one-phase model emerges as the best model. A focal length toward which the mutational and/or substitutional process is linearly biased is a crucial feature of microsatellite evolution. We find that two-phase models do not lead to a significantly better fit than their one-phase counterparts. The performance of models based on the fit of their stationary distributions to the empirical distribution of microsatellite lengths in the human genome is consistent with that based on the human-chimp comparison. Microsatellites interrupted by even a single point mutation exhibit a twofold decrease in their mutation rate when compared to pure AC repeats. In general, models that allow chimps to have a larger per-repeat unit slippage rate and/or a shorter focal length compared to humans give a better fit to the human-chimp data as well as the human genomic data.
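The two comparison tools named in the abstract, the likelihood-ratio test and the Akaike information criterion, can be sketched as follows. The log-likelihood values and parameter counts below are invented for illustration and are not taken from the paper; the function names are assumptions, and the test is restricted to nested models that differ by exactly one free parameter so that the chi-square tail can be computed with the standard library alone.

```python
import math

def aic(log_lik, n_params):
    """Akaike information criterion: lower values indicate a preferred model."""
    return 2 * n_params - 2 * log_lik

def lrt_pvalue_df1(log_lik_null, log_lik_alt):
    """Likelihood-ratio test for nested models differing by one parameter.

    Returns (statistic, p_value). For one degree of freedom the chi-square
    survival function equals erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (log_lik_alt - log_lik_null), 0.0)
    return stat, math.erfc(math.sqrt(stat / 2.0))

if __name__ == "__main__":
    # Hypothetical fits: a one-phase model (3 parameters) nested in a
    # two-phase model (4 parameters). Numbers are made up for the example.
    ll_one_phase, k_one = -1520.4, 3
    ll_two_phase, k_two = -1519.9, 4
    print("AIC one-phase:", aic(ll_one_phase, k_one))
    print("AIC two-phase:", aic(ll_two_phase, k_two))
    print("LRT:", lrt_pvalue_df1(ll_one_phase, ll_two_phase))
```

With numbers like these, the extra parameter buys too little likelihood to lower the AIC or to reach significance, which is the general pattern behind a statement that a richer model does not fit significantly better than its simpler counterpart.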
Affiliation(s)
- Raazesh Sainudiin
- Department of Statistical Science, Cornell University, Ithaca, New York 14853, USA.
|
34
|
Muir G, Schlötterer C. Evidence for shared ancestral polymorphism rather than recurrent gene flow at microsatellite loci differentiating two hybridizing oaks (Quercus spp.). Mol Ecol 2004; 14:549-61. [PMID: 15660945 DOI: 10.1111/j.1365-294x.2004.02418.x] [Citation(s) in RCA: 153] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
Quercus petraea and Quercus robur are two closely related oak species, considered to hybridize. Genetic markers, however, indicate that despite sharing most alleles, the two species remain separate genetic units. Analysis of 20 microsatellite loci in multiple populations from both species suggested a genome-wide differentiation. Thus, the allele sharing between both species could be explained either by low rates of gene flow or shared ancestral variation. We performed further analyses of population differentiation in a biogeographical setting and an admixture analysis in mixed oak stands to distinguish between both hypotheses. Based on our results we propose that the low genetic differentiation among these species results from shared ancestry rather than high rates of gene flow.
Affiliation(s)
- Graham Muir
- Institut für Tierzucht und Genetik, Veterinärmedizinische Universität Wien, Josef Baumann Gasse 1, 1210, Wien, Austria
|
35
|
Sibov ST, de Souza CL, Garcia AAF, Garcia AF, Silva AR, Mangolin CA, Benchimol LL, de Souza AP. Molecular mapping in tropical maize (Zea mays L.) using microsatellite markers. 1. Map construction and localization of loci showing distorted segregation. Hereditas 2004; 139:96-106. [PMID: 15061810 DOI: 10.1111/j.1601-5223.2003.01666.x] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022] Open
Abstract
Microsatellites have become the most important class of markers for mapping procedures. Primarily based on restriction fragment length polymorphism (RFLP) markers, several molecular genetic maps of maize have been developed, mainly using temperate inbred maize lines. To characterize the level of polymorphism of microsatellite loci and construct a genetic map in tropical maize, two elite inbred lines, L-08-05F and L-14-4B, were crossed to produce 400 F2 individuals that were used as a mapping population. A survey of 859 primer pair sequences of microsatellites was used. The polymorphism screens of each microsatellite and genotype assignment were performed using high-resolution agarose gels. About 54% of the primer sets gave clearly scorable amplification products, 13% did not amplify and 33% could not be scored on agarose gels. A total of 213 polymorphic markers were identified and used to genotype the mapping population. Among the polymorphic markers, 40 showed loci deviating from expected Mendelian ratios and clusters of deviating markers were located in three chromosome regions. Non-Mendelian scoring was present in 19 markers. The final genetic map with 117 markers spanned 1634 cM in length with an average interval of 14 cM between adjacent markers.
Affiliation(s)
- Sérgio Tadeu Sibov
- Centro de Biologia Molecular e Engenharia Genética, Universidade Estadual de Campinas (CBMEG/UNICAMP), Cidade Universitária Zeferino Vaz, Campinas, SP, Brazil
|
36
|
Affiliation(s)
- Hans Ellegren
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, SE-752 36 Uppsala, Sweden.
|
37
|
Nürnberger B, Hofman S, Förg-Brey B, Praetzel G, Maclean A, Szymura JM, Abbott CM, Barton NH. A linkage map for the hybridising toads Bombina bombina and B. variegata (Anura: Discoglossidae). Heredity (Edinb) 2003; 91:136-42. [PMID: 12886280 DOI: 10.1038/sj.hdy.6800291] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022] Open
Abstract
Stable hybrid zones in which ecologically divergent taxa give rise to a range of recombinants are natural laboratories in which the genetic basis of adaptation and reproductive isolation can be unraveled. One such hybrid zone is formed by the fire-bellied toads Bombina bombina and B. variegata (Anura: Discoglossidae). Adaptations to permanent and ephemeral breeding habitats, respectively, have shaped numerous phenotypic differences between the taxa. All of these are, in principle, candidates for a genetic dissection via QTL mapping. We present here a linkage map of 28 codominant and 10 dominant markers in the Bombina genome. In an F2 cross, markers that were mainly microsatellites, SSCPs or allozymes were mapped to 20 linkage groups. Among the 40 isolated CA microsatellites, we noted a preponderance of compound and frequently interleaved CA-TA repeats as well as a striking polarity at the 5' end of the repeats.
Affiliation(s)
- B Nürnberger
- Department Biologie II, Ludwig-Maximilians-Universität, Karlstr. 23-25, 80333 München, Germany.
|
38
|
Harr B, Kauer M, Schlötterer C. Hitchhiking mapping: a population-based fine-mapping strategy for adaptive mutations in Drosophila melanogaster. Proc Natl Acad Sci U S A 2002; 99:12949-54. [PMID: 12351680 PMCID: PMC130566 DOI: 10.1073/pnas.202336899] [Citation(s) in RCA: 152] [Impact Index Per Article: 6.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2002] [Indexed: 01/02/2023] Open
Abstract
The identification of genes contributing to the adaptation of local populations is of great biological interest. In an attempt to characterize functionally important differences among African and non-African Drosophila melanogaster populations, we surveyed neutral microsatellite variation in an 850-kb genomic sequence. Three genomic regions were identified that putatively bear an adaptive mutation associated with the habitat expansion of D. melanogaster. A further inspection of two regions by sequence analysis of multiple fragments confirmed the presence of a recent beneficial mutation in the non-African populations. Our study suggests that hitchhiking mapping is a universal approach for the identification of ecologically important mutations.
Affiliation(s)
- Bettina Harr
- Institut für Tierzucht und Genetik, Veterinärmedizinische Universität, Veterinärplatz 1, 1210 Vienna, Austria
|
39
|
Yamada NA, Smith GA, Castro A, Roques CN, Boyer JC, Farber RA. Relative rates of insertion and deletion mutations in dinucleotide repeats of various lengths in mismatch repair proficient mouse and mismatch repair deficient human cells. Mutat Res 2002; 499:213-25. [PMID: 11827714 DOI: 10.1016/s0027-5107(01)00282-2] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
Microsatellites are DNA elements composed of short tandem repeats of 1-5 bp. These sequences are particularly prone to frameshift mutation by insertion-deletion loop formation during replication. The mismatch repair system is responsible for correcting these replication errors, and microsatellite mutation rates are significantly elevated in the absence of mismatch repair. We have investigated the effect of varying the number of repeats in a (CA)n microsatellite on mutation rates in cultured mammalian cells proficient or deficient in mismatch repair. We have also compared the relative rates of single-repeat insertions and deletions in these cells. Two plasmid vectors were constructed for each repeat unit number (n=8, 17, and 30), such that the microsatellites, placed upstream of a bacterial neomycin resistance gene (neo), disrupted the reading frame of the gene in the (-1) or (+1) direction. Plasmids were introduced separately into the cells, where they integrated into the cellular genome. Mutation rates were determined by selection of clones with frameshift mutations in the microsatellite that restored the reading frame of the neo gene. We found that mutation rates were significantly higher for (CA)17 and (CA)30 tracts than for (CA)8 tracts in both mismatch repair proficient (mouse) and deficient (human) cells. A mutational bias favoring insertions was generally observed. In both (CA)17 and (CA)30 tracts, single-repeat insertion rates were higher than single-repeat deletion rates with or without mismatch repair; deletions of multiple repeat units (≥8 bp) were observed in these tracts, whereas deletions this large were not found in the (CA)8 tract. Single-repeat mutations of both types were made at similar rates in (CA)8 tracts in human mismatch repair deficient (MMR-) cells, but single-repeat insertion rates were higher than single-repeat deletion rates in mouse mismatch repair proficient (MMR+) cells.
Results of these direct studies on microsatellite mutations in cultured cells should be useful for refinement of mathematical models for microsatellite evolution.
Affiliation(s)
- Nazumi A Yamada
- Department of Pathology and Laboratory Medicine, University of North Carolina at Chapel Hill, CB#7525 Brinkhous-Bullitt Building, Chapel Hill, NC 27599, USA
|
40
|
Katti MV, Ranjekar PK, Gupta VS. Differential distribution of simple sequence repeats in eukaryotic genome sequences. Mol Biol Evol 2001; 18:1161-7. [PMID: 11420357 DOI: 10.1093/oxfordjournals.molbev.a003903] [Citation(s) in RCA: 315] [Impact Index Per Article: 13.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
Complete chromosome/genome sequences available from humans, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, and Saccharomyces cerevisiae were analyzed for the occurrence of mono-, di-, tri-, and tetranucleotide repeats. In all of the genomes studied, dinucleotide repeat stretches tended to be longer than other repeats. Additionally, tetranucleotide repeats in humans and trinucleotide repeats in Drosophila also seemed to be longer. Although the trends for different repeats are similar between different chromosomes within a genome, the density of repeats may vary between different chromosomes of the same species. The abundance or rarity of various di- and trinucleotide repeats in different genomes cannot be explained by nucleotide composition of a sequence or potential of repeated motifs to form alternative DNA structures. This suggests that in addition to nucleotide composition of repeat motifs, characteristic DNA replication/repair/recombination machinery might play an important role in the genesis of repeats. Moreover, analysis of complete genome coding DNA sequences of Drosophila, C. elegans, and yeast indicated that expansions of codon repeats corresponding to small hydrophilic amino acids are tolerated more, while strong selection pressures probably eliminate codon repeats encoding hydrophobic and basic amino acids. The locations and sequences of all of the repeat loci detected in genome sequences and coding DNA sequences are available at http://www.ncl-india.org/ssr and could be useful for further studies.
Affiliation(s)
- M V Katti
- Plant Molecular Biology Unit, Division of Biochemical Sciences, National Chemical Laboratory, Pune, India
|
41
|
Suzuki A, Maruno A, Tahira T, Hayashi K. Polar alteration of short tandem repeats (STRs) in mammalian cells. Mutat Res 2001; 474:159-68. [PMID: 11239973 DOI: 10.1016/s0027-5107(01)00063-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Abstract
Instability of short tandem repeats (STRs) in DNA during replication is observed in all organisms examined, and is causatively involved in various human diseases. We explore the mechanisms involved in instability by examining length changes occurring during the replication of [(CA)20TA]n and [(CAG)20TAG]n in human cells. We show that the majority of alterations consist of an insertion or deletion of one repeat unit, and base substitutions or length changes involving many repeat units are rare. We also show that length changes of two-tract STRs are biased toward the 3'-end of the repeat tract, in reference to lagging strand synthesis. There are some differences between our observations and previous observations in microbes, e.g. the orientation effect was not observed in this study. The results of this study are discussed in terms of the molecular mechanisms leading to alterations in repeat tracts.
Affiliation(s)
- A Suzuki
- Division of Genome Analysis, Institute of Genetic Information, Kyushu University, 3-1-1 Maidashi, Higashi-ku, 812-8582, Fukuoka, Japan
|
42
|
Rolfsmeier ML, Dixon MJ, Lahue RS. Mismatch repair blocks expansions of interrupted trinucleotide repeats in yeast. Mol Cell 2000; 6:1501-7. [PMID: 11163222 DOI: 10.1016/s1097-2765(00)00146-5] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
Disease-causing expansions of trinucleotide repeats (TNRs) can occur very frequently. In contrast, expansions are rare if the TNR is interrupted (imperfect). The molecular mechanism stabilizing interrupted alleles and thereby preventing disease has been elusive. We show that mismatch repair is the major stabilizing force for interrupted TNRs in Saccharomyces cerevisiae. Interrupted alleles expand much more often when mismatch repair is blocked by mutation or by poorly corrected mispairs. These results suggest that interruptions lead to mismatched expansion precursors. In normal cells, expansions are prevented in trans by mismatch repair, which coexcises the mismatches plus the aberrant, TNR-mediated secondary structure that otherwise resists removal. This study indicates a novel role for mismatch repair in mutation avoidance and, potentially, in disease prevention.
Affiliation(s)
- M L Rolfsmeier
- Eppley Institute for Research in Cancer and Allied Diseases, University of Nebraska Medical Center, Omaha, NE 68198-6805, USA
|
43
|
Abstract
Microsatellite DNA sequences mutate at rates several orders of magnitude higher than that of the bulk of DNA. Such high rates mean that spontaneous mutations that form new-length variants can realistically be seen in pedigree analysis. Data on observed mutation events from various organisms are now accumulating, allowing inferences on DNA sequence evolution to be made through an unusually direct approach. Here I discuss and integrate microsatellite mutation data in an evolutionary context. A striking feature of the mutation process is that it seems highly heterogeneous, with distinct differences between species, repeat types, loci and alleles. Age and sex also affect the mutation rate. Within genomes at equilibrium, the microsatellite-length distribution is a delicate balance between biased mutation processes and point mutations acting towards the decay of repetitive DNA. Indeed, simple repeats do not evolve simply.
Affiliation(s)
- H Ellegren
- Dept of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, SE-752 36, Uppsala, Sweden.
|
44
|
Anderson TJ, Su XZ, Roddam A, Day KP. Complex mutations in a high proportion of microsatellite loci from the protozoan parasite Plasmodium falciparum. Mol Ecol 2000; 9:1599-608. [PMID: 11050555 DOI: 10.1046/j.1365-294x.2000.01057.x] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
Abstract
Microsatellite loci are generally assumed to evolve via a stepwise mutational process and a battery of statistical techniques has been developed in recent years based on this or related mutation models. It is therefore important to investigate the appropriateness of these models in a wide variety of taxa. We used two approaches to examine mutation patterns in the malaria parasite Plasmodium falciparum: (i) we examined sequence variation at 12 tri-nucleotide repeat loci; and (ii) we analysed patterns of repeat structure and heterozygosity at 114 loci using data from 12 laboratory parasite lines. The sequencing study revealed complex patterns of mutation in five of the 12 loci studied. Alleles at two loci contain indels of 24 bp and 57 bp in flanking regions, while in the other three loci, blocks of imperfect microsatellites appear to be duplicated or inserted; these loci essentially consist of minisatellite repeats, with each repeat unit containing four to eight microsatellites. The survey of heterozygosity revealed a positive relationship between repeat number and microsatellite variability for both di- and trinucleotides, indicating a higher mutation rate in loci with longer repeat arrays. Comparisons of levels of variation in different repeat types indicate that the mutation rate of dinucleotide-bearing loci is 1.6-2.1 times faster than trinucleotides, consistent with the lower mean number of repeats in trinucleotide-bearing loci. However, despite the evidence that microsatellite arrays themselves are evolving in a manner consistent with a stepwise mutation model in P. falciparum, the high frequency of complex mutations precludes the use of analytical tools based on this mutation model for many microsatellite-bearing loci in this protozoan.
The results call into question the generality of models based on stepwise mutation for analysing microsatellite data, but also demonstrate the ease with which loci that violate model assumptions can be detected using minimal sequencing effort.
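For readers unfamiliar with the stepwise mutation model discussed above, a minimal simulation of its strictest form is sketched here: each mutation adds or removes exactly one repeat unit with equal probability. The function name `simulate_smm`, the floor `min_repeats`, and the parameter values in the usage example are assumptions for illustration and are not estimates from the study.

```python
import random

def simulate_smm(start_repeats, generations, mu, rng, min_repeats=1):
    """Follow one lineage under a strict stepwise mutation model.

    In each generation a mutation occurs with probability mu and changes the
    repeat count by exactly +1 or -1 (equal odds). Returns the repeat count
    after every generation, including the starting value.
    """
    n = start_repeats
    history = [n]
    for _ in range(generations):
        if rng.random() < mu:
            # Never let the tract shrink below the assumed floor.
            n = max(min_repeats, n + rng.choice((-1, 1)))
        history.append(n)
    return history

if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    trace = simulate_smm(start_repeats=15, generations=10_000, mu=1e-3, rng=rng)
    print("final repeat count:", trace[-1])
    print("range visited:", min(trace), "to", max(trace))
```

Under this model, alleles can only differ by whole repeat units, so the flanking-region indels and duplicated blocks reported in the abstract are exactly the kind of change such a simulation cannot produce.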
Affiliation(s)
- T J Anderson
- Department of Genetics, Southwest Foundation for Biomedical Research, PO Box 760549, San Antonio, TX 78245-0549, USA.
|
45
|
Abstract
Analyzing mutation spectra is a very powerful method to determine the effects of various types of DNA damage and to understand the workings of various DNA repair pathways. However, compiling sequence-specific mutation spectra is laborious; even with modern sequencing technology, it is rare to obtain spectra with more than several hundred data points. Two assay systems are described for yeast, one for insertion/deletion mutations and one for base substitution mutations, that allow determination of specific mutations without the necessity of DNA sequencing. The assay for insertion/deletion mutations uses a variety of different simple repeats placed in frame with URA3 such that insertions or deletions lead to a selectable Ura(-) phenotype; essentially all such mutations are in the simple repeat sequence. The assay for base substitution mutations uses a series of six strains with different mutations in one essential codon of the CYC1 gene. Because only true reversions lead to a selectable phenotype, the bases mutated in any reversion event are known. The advantage of these assays is that they can quantitatively determine over several orders of magnitude the types of mutations that occur under a given set of conditions, without DNA sequencing.
Affiliation(s)
- G F Crouse
- Department of Biology, Emory University, Atlanta, Georgia 30322, USA
|
46
|
Abstract
Microsatellite DNA loci have recently been adopted for many biological applications. Comparative studies across a wide range of species have revealed many details of their mutational properties and evolutionary life cycles. Experience shows that a full understanding of these processes is essential to ensure the effective use of microsatellites as analytical tools. In this article, we review the controversies that have arisen as biologists have taken up this new technology and the emerging consensus that has resulted from their debates. We point to the need for comparative DNA sequencing studies to produce input data for a new generation of theoretical models of microsatellite behaviour. We conclude by presenting our own conceptual model, 'Snakes and Ladders', as an aid to theory development.
Affiliation(s)
- G K Chambers
- Institute for Molecular Systematics, School of Biological Sciences, Victoria University, Wellington, New Zealand.
|
47
|
Kruglyak S, Durrett R, Schug MD, Aquadro CF. Distribution and abundance of microsatellites in the yeast genome can be explained by a balance between slippage events and point mutations. Mol Biol Evol 2000; 17:1210-9. [PMID: 10908641 DOI: 10.1093/oxfordjournals.molbev.a026404] [Citation(s) in RCA: 66] [Impact Index Per Article: 2.8] [Indexed: 11/14/2022]
Abstract
We fit a Markov chain model of microsatellite evolution introduced by Kruglyak et al. to data on all di-, tri-, and tetranucleotide repeats in the yeast genome. Our results suggest that many features of the distribution of abundance and length of microsatellites can be explained by this simple model, which incorporates a competition between slippage events and base pair substitutions, with no need to invoke selection or constraints on the lengths. Our results provide some new information on slippage rates for individual repeat motifs, which suggest that AT-rich trinucleotide repeats have higher slippage rates. As our model predicts, we found that many repeats were adjacent to shorter repeats of the same motif. However, we also found a significant tendency of microsatellites of different motifs to cluster.
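The competition between slippage and base substitutions described in this abstract can be sketched as a toy simulation. The abstract does not give the model's equations, so everything below is an assumption made for illustration: slippage probability proportional to (length − 1), substitution probability proportional to length, and a substitution splitting the array so that only the longer piece is tracked. All parameter names and values are hypothetical.

```python
import random

def simulate_repeat_length(start_len, steps, slip_rate, sub_rate, seed=0):
    """Toy discrete-time chain for the length (in repeat units) of one array."""
    rng = random.Random(seed)
    length = start_len
    history = [length]
    for _ in range(steps):
        # Slippage: gains or loses one unit; more likely in longer arrays.
        if length > 1 and rng.random() < min(1.0, slip_rate * (length - 1)):
            length += rng.choice((-1, 1))
        # Substitution: interrupts the array at a random unit; keep the longer piece.
        if length > 1 and rng.random() < min(1.0, sub_rate * length):
            cut = rng.randrange(length)
            length = max(cut, length - cut - 1)
        length = max(length, 1)  # an array never drops below one unit here
        history.append(length)
    return history

trajectory = simulate_repeat_length(start_len=10, steps=1000,
                                    slip_rate=0.01, sub_rate=0.001)
print(min(trajectory), max(trajectory))
```

In this sketch an array that reaches one unit stays there, since nothing regenerates repeats; a fuller model would need a birth process for new arrays.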
Affiliation(s)
- S Kruglyak
- Department of Mathematics, University of Southern California, CA, USA
|
48
|
Harr B, Zangerl B, Schlötterer C. Removal of microsatellite interruptions by DNA replication slippage: phylogenetic evidence from Drosophila. Mol Biol Evol 2000; 17:1001-9. [PMID: 10889213 DOI: 10.1093/oxfordjournals.molbev.a026381] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.7] [Indexed: 11/14/2022]
Abstract
Microsatellites are tandem repetitions of short (1-6 bp) motifs. It is widely assumed that microsatellites degenerate through the accumulation of base substitutions in the repeat array. Using a phylogenetic framework, we studied the evolutionary dynamics of interruptions in three Drosophila microsatellite loci. For all three loci, we show that the interruptions in a microsatellite can be lost, resulting in a longer uninterrupted microsatellite stretch. These results indicate that mutations in the microsatellite array do not necessarily lead to decay but may represent only a transition state during the evolution of a microsatellite. Most likely, this purification of interrupted microsatellites is caused by DNA replication slippage.
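The definition given here (tandem repetitions of 1-6 bp motifs) can be turned into a simple scanner for perfect, uninterrupted repeats. This is a minimal sketch, not the method used in the study; the function name and the `min_units` threshold are assumptions, and interrupted arrays such as those the abstract discusses would be reported as separate shorter hits.

```python
import re

def find_microsatellites(seq, min_units=3, max_motif=6):
    """Return (start, motif, n_units) for perfect tandem repeats in `seq`.

    The lazy quantifier makes the regex try the shortest motif first,
    so (AT)4 is reported as AT x 4 rather than ATAT x 2.
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_units - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

print(find_microsatellites("GGATATATATCC"))  # → [(2, 'AT', 4)]
```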
Affiliation(s)
- B Harr
- Institut für Tierzucht und Genetik, Veterinärmedizinische Universität Wien, Austria
|
49
|
Sia EA, Butler CA, Dominska M, Greenwell P, Fox TD, Petes TD. Analysis of microsatellite mutations in the mitochondrial DNA of Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 2000; 97:250-5. [PMID: 10618404 PMCID: PMC26649 DOI: 10.1073/pnas.97.1.250] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.4] [Received: 03/12/1999] [Indexed: 01/28/2023]
Abstract
In the nuclear genome of Saccharomyces cerevisiae, simple, repetitive DNA sequences (microsatellites) mutate at rates much higher than nonrepetitive sequences. Most of these mutations are deletions or additions of repeat units. The yeast mitochondrial genome also contains many microsatellites. To examine the stability of these sequences, we constructed a reporter gene (arg8(m)) containing out-of-frame insertions of either poly(AT) or poly(GT) tracts within the coding sequence. Yeast strains with this reporter gene inserted within the mitochondrial genome were constructed. Using these strains, we showed that poly(GT) tracts were considerably less stable than poly(AT) tracts and that alterations usually involved deletions rather than additions of repeat units. In contrast, in the nuclear genome, poly(GT) and poly(AT) tracts had similar stabilities, and alterations usually involved additions rather than deletions. Poly(GT) tracts were more stable in the mitochondria of diploid cells than in haploids. In addition, an msh1 mutation destabilized poly(GT) tracts in the mitochondrial genome.
Affiliation(s)
- E A Sia
- Department of Biology, Curriculum in Genetics and Molecular Biology, University of North Carolina, Chapel Hill, NC 27599-3280, USA
|
50
|
Bergström TF, Engkvist H, Erlandsson R, Josefsson A, Mack SJ, Erlich HA, Gyllensten U. Tracing the origin of HLA-DRB1 alleles by microsatellite polymorphism. Am J Hum Genet 1999; 64:1709-18. [PMID: 10330359 PMCID: PMC1377915 DOI: 10.1086/302401] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.8] [Indexed: 11/03/2022]
Abstract
We analyzed the origin of allelic diversity at the class II HLA-DRB1 locus, using a complex microsatellite located in intron 2, close to the polymorphic second exon. A phylogenetic analysis of human, gorilla, and chimpanzee DRB1 sequences indicated that the structure of the microsatellite has evolved, primarily by point mutations, from a putative ancestral (GT)x(GA)y-complex-dinucleotide repeat. In all contemporary DRB1 allelic lineages, with the exception of the human *04 and the gorilla *08 lineages, the (GA)y repeat is interrupted, often by a G→C substitution. In general, the length of the 3' (GA)y repeat correlates with the allelic lineage and thus evolves more slowly than a middle (GA)z repeat, whose length correlates with specific alleles within the lineage. Comparison of the microsatellite sequence from 30 human DRB1 alleles showed the longer 5' (GT)x to be more variable than the shorter middle (GA)z and 3' (GA)y repeats. Analysis of multiple samples with the same exon sequence, derived from different continents, showed that the 5' (GT)x repeat evolves more rapidly than the middle (GA)z and the 3' (GA)y repeats, which is consistent with findings of a higher mutation rate for longer tracts. The microsatellite-repeat-length variation was used to trace the origin of new DRB1 alleles, such as the new *08 alleles found in the Cayapa people of Ecuador and the Ticuna people of Brazil.
Affiliation(s)
- T F Bergström
- Department of Genetics and Pathology, Unit of Medical Genetics, Beijer Laboratory, University of Uppsala, Uppsala, Sweden
|