1. Eslami Rasekh M, Hernández Y, Drinan SD, Fuxman Bass J, Benson G. Genome-wide characterization of human minisatellite VNTRs: population-specific alleles and gene expression differences. Nucleic Acids Res 2021; 49:4308-4324. [PMID: 33849068 PMCID: PMC8096271 DOI: 10.1093/nar/gkab224] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/03/2020] [Revised: 03/06/2021] [Accepted: 03/18/2021] [Indexed: 11/12/2022] Open
Abstract
Variable Number Tandem Repeats (VNTRs) are tandem repeat (TR) loci that vary in copy number across a population. Using our program, VNTRseek, we analyzed human whole genome sequencing datasets from 2770 individuals in order to detect minisatellite VNTRs, i.e., those with pattern sizes ≥7 bp. We detected 35 638 VNTR loci and classified 5676 as commonly polymorphic (i.e. with non-reference alleles occurring in >5% of the population). Commonly polymorphic VNTR loci were found to be enriched in genomic regions with regulatory function, i.e. transcription start sites and enhancers. Investigation of the commonly polymorphic VNTRs in the context of population ancestry revealed that 1096 loci contained population-specific alleles and that those could be used to classify individuals into super-populations with near-perfect accuracy. Search for quantitative trait loci (eQTLs), among the VNTRs proximal to genes, indicated that in 187 genes expression differences correlated with VNTR genotype. We validated our predictions in several ways, including experimentally, through the identification of predicted alleles in long reads, and by comparisons showing consistency between sequencing platforms. This study is the most comprehensive analysis of minisatellite VNTRs in the human population to date.
Affiliation(s)
- Yözen Hernández
- Graduate Program in Bioinformatics, Boston University, Boston, MA 02215, USA
- Juan I Fuxman Bass
- Graduate Program in Bioinformatics, Boston University, Boston, MA 02215, USA
- Department of Biology, Boston University, Boston, MA 02215, USA
- Gary Benson
- Graduate Program in Bioinformatics, Boston University, Boston, MA 02215, USA
- Department of Biology, Boston University, Boston, MA 02215, USA
- Department of Computer Science, Boston University, Boston, MA 02215, USA
2. Saeed AF, Wang R, Wang S. Microsatellites in Pursuit of Microbial Genome Evolution. Front Microbiol 2016; 6:1462. [PMID: 26779133 PMCID: PMC4700210 DOI: 10.3389/fmicb.2015.01462] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 04/11/2015] [Accepted: 12/07/2015] [Indexed: 12/27/2022] Open
Abstract
Microsatellites or short sequence repeats are widespread genetic markers which are hypermutable 1-6 bp long short nucleotide motifs. Significantly, their applications in genetics are extensive due to their ceaseless mutational degree, widespread length variations and hypermutability skills. These features make them useful in determining the driving forces of evolution by using powerful molecular techniques. Consequently, revealing important questions, for example, what is the significance of these abundant sequences in DNA, what are their roles in genomic evolution? The answers of these important questions are hidden in the ways these short motifs contributed in altering the microbial genomes since the origin of life. Even though their size ranges from 1 -to- 6 bases, these repeats are becoming one of the most popular genetic probes in determining their associations and phylogenetic relationships in closely related genomes. Currently, they have been widely used in molecular genetics, biotechnology and evolutionary biology. However, due to limited knowledge; there is a significant gap in research and lack of information concerning hypermutational mechanisms. These mechanisms play a key role in microsatellite loci point mutations and phase variations. This review will extend the understandings of impacts and contributions of microsatellite in genomic evolution and their universal applications in microbiology.
Affiliation(s)
- Abdullah F. Saeed
- Key Laboratory of Biopesticide and Chemical Biology of Education Ministry, School of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
3. Repeats in Transforming Acidic Coiled-Coil (TACC) Genes. Biochem Genet 2013; 51:458-73. [DOI: 10.1007/s10528-013-9577-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/16/2012] [Accepted: 12/30/2012] [Indexed: 02/04/2023]
4. Kato M, Haku T, Hibino T, Fukada H, Mishima Y, Yamashita I, Minoshima S, Nagayama K, Shimizu N. Stable minihairpin structures forming at minisatellite DNA isolated from yellow fin sea bream Acanthopagrus latus. Comp Biochem Physiol B Biochem Mol Biol 2006; 146:427-37. [PMID: 17258918 DOI: 10.1016/j.cbpb.2006.11.029] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 06/24/2006] [Revised: 11/27/2006] [Accepted: 11/28/2006] [Indexed: 12/20/2022]
Abstract
The lengths of simple repeat sequences are generally unstable or polymorphic (highly variable with respect to the numbers of tandem repeats). Previously we have isolated a family of minisatellite DNA (GenBank accession AF422186) that appears specifically and abundantly in the genome of yellow fin sea bream Acanthopagrus latus but not in closely-related red sea bream Pagrus major, and found that the numbers of tandem arrays in the homologous loci are polymorphic. This means that the minisatellite sequence has appeared and propagated in A. latus genome after speciation. In order to understand what makes the minisatellite widespread within the A. latus genome and what causes the polymorphic nature of the number of tandem repeats, the structural features of single-stranded polynucleotides were analyzed by electrophoresis, chemical modification, circular dichroism (CD), differential scanning calorimetry (DSC) and electron microscopy. The results suggest that a portion of the repeat unit forms a stable minihairpin structure, and it can cause polymerase pausing within the minisatellite DNA.
Affiliation(s)
- Mikio Kato
- Department of Biological Science, Osaka Prefecture University Graduate School of Science, 1-1 Gakuencho, Naka-ku, Sakai 599-8531, Japan.
5. Tse JYM, Liu VWS, Yeung WSB, Lau EYL, Ng EHY, Ho PC. Molecular analysis of the androgen receptor gene in Hong Kong Chinese infertile men. J Assist Reprod Genet 2003; 20:227-33. [PMID: 12877254 PMCID: PMC3455324 DOI: 10.1023/a:1024107528283] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022] Open
Abstract
PURPOSE To investigate the relationship between CAG repeat length in the androgen receptor gene and impaired spermatogenesis in Hong Kong Chinese population. METHODS The CAG repeat region was amplified by polymerase chain reaction (PCR) in 85 nonobstructive azoospermic or severe oligozoospermic men, and 45 fertile males. The number of CAG repeat was analyzed by DNA sequencing. Serum FSH, LH, and testosterone levels were also determined in these men. RESULTS Among nonobstructive azoospermic males, three men (5.7%) possessed short CAG repeats (< 16), and three (5.7%) other men possessed long CAG repeats (> 30). Short CAG repeats (< 16) were also found in two severe oligozoospermic males (6.3%). The incidence of infertile men with short or long CAG repeats is significantly higher in the azoospermic group (p = 0.03) but not in the severe oligozoospermic group (p = 0.17) when compared with the fertile controls. CONCLUSION Our data suggest an association between CAG repeat lengths and impaired spermatogenesis in azoospermic males in our population.
Affiliation(s)
- J Y M Tse
- Department of Obstetrics and Gynaecology, The University of Hong Kong, Pokfulam Road, Hong Kong, People's Republic of China.
6. Sun XM, Lieschke GJ. Abnormal protein tyrosine kinases associated with human haematological malignancies. Chin J Cancer Res 2002. [DOI: 10.1007/s11670-002-0018-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 11/24/2022] Open
7. Viguera E, Canceill D, Ehrlich SD. Replication slippage involves DNA polymerase pausing and dissociation. EMBO J 2001; 20:2587-95. [PMID: 11350948 PMCID: PMC125466 DOI: 10.1093/emboj/20.10.2587] [Citation(s) in RCA: 199] [Impact Index Per Article: 8.7] [Indexed: 12/13/2022] Open
Abstract
Genome rearrangements can take place by a process known as replication slippage or copy-choice recombination. The slippage occurs between repeated sequences in both prokaryotes and eukaryotes, and is invoked to explain microsatellite instability, which is related to several human diseases. We analysed the molecular mechanism of slippage between short direct repeats, using in vitro replication of a single-stranded DNA template that mimics the lagging strand synthesis. We show that slippage involves DNA polymerase pausing, which must take place within the direct repeat, and that the pausing polymerase dissociates from the DNA. We also present evidence that, upon polymerase dissociation, only the terminal portion of the newly synthesized strand separates from the template and anneals to another direct repeat. Resumption of DNA replication then completes the slippage process.
Affiliation(s)
- E Viguera
- Laboratoire de Génétique Microbienne, Institut National de la Recherche Agronomique, Domaine de Vilvert, 78350 Jouy en Josas, France.
8. Bowater RP, Wells RD. The intrinsically unstable life of DNA triplet repeats associated with human hereditary disorders. Progress in Nucleic Acid Research and Molecular Biology 2001; 66:159-202. [PMID: 11051764 DOI: 10.1016/s0079-6603(00)66029-4] [Citation(s) in RCA: 98] [Impact Index Per Article: 4.3] [Indexed: 12/30/2022]
Abstract
Expansions of specific DNA triplet repeats are the cause of an increasing number of hereditary neurological disorders in humans. In some diseases, such as Huntington's and several spinocerebellar ataxias, the repetitive DNA sequences are translated into long tracts of the same amino acid (usually glutamine), which alters interactions with cellular constituents and leads to the development of disease. For other disorders, including common genetic disorders such as myotonic dystrophy and fragile X syndrome, the DNA repeat is located in noncoding regions of transcribed sequences and disease is probably caused by altered gene expression. In studies in lower organisms, mammalian cells, and transgenic mice, high frequencies of length changes (increases and decreases) occur in long DNA triplet repeats. These observations are similar to other types of repetitive DNA sequences, which also undergo frequent length changes at genomic loci. A variety of processes acting on DNA influence the genetic stability of DNA triplet repeats, including replication, recombination, repair, and transcription. It is not yet known how these different multienzyme systems interact to produce the genetic mutation of expanded repeats. In vitro studies have identified that DNA triplet repeats can adopt several unusual DNA structures, including hairpins, triplexes, quadruplexes, slipped structures, and highly flexible and writhed helices. The formation of stable unusual structures within the cell is likely to disturb DNA metabolism and be a critical intermediate in the molecular mechanism(s) leading to genetic instabilities of DNA repeats and, hence, to disease pathogenesis.
Affiliation(s)
- R P Bowater
- Molecular Biology Sector, School of Biological Sciences, University of East Anglia, Norwich, United Kingdom
9. Cristillo AD, Mortimer JR, Barrette IH, Lillicrap TP, Forsdyke DR. Double-stranded RNA as a not-self alarm signal: to evade, most viruses purine-load their RNAs, but some (HTLV-1, Epstein-Barr) pyrimidine-load. J Theor Biol 2001; 208:475-91. [PMID: 11222051 DOI: 10.1006/jtbi.2000.2233] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.6] [Indexed: 11/22/2022]
Abstract
For double-stranded RNA (dsRNA) to signal the presence of foreign (non-self) nucleic acid, self-RNA-self-RNA interactions should be minimized. Indeed, self-RNAs appear to have been fine-tuned over evolutionary time by the introduction of purines in clusters in the loop regions of stem-loop structures. This adaptation should militate against the "kissing" interactions which initiate formation of dsRNA. Our analyses of virus base compositions suggest that, to avoid triggering the host cell's dsRNA surveillance mechanism, most viruses purine-load their RNAs to resemble host RNAs ("stealth" strategy). However, some GC-rich latent viruses (HTLV-1, EBV) pyrimidine-load their RNAs. It is suggested that when virus production begins, these RNAs suddenly increase in concentration and impair host mRNA function by virtue of an excess of complementary "kissing" interactions ("surprise" strategy). Remarkably, the only mRNA expressed in the most fundamental form of EBV latency (the "EBNA-1 program") is purine-loaded. This apparent stealth strategy is reinforced by a simple sequence repeat which prefers purine-rich codons. During latent infection the EBNA-1 protein may evade recognition by cytotoxic T-cells, not by virtue of containing a simple sequence amino acid repeat as has been proposed, but by virtue of the encoding mRNA being purine-loaded to prevent interactions with host RNAs of either genic or non-genic origin.
Affiliation(s)
- A D Cristillo
- Department of Biochemistry, Queen's University, Kingston, Ontario, K7L3N6, Canada
10. Wren JD, Forgacs E, Fondon JW, Pertsemlidis A, Cheng SY, Gallardo T, Williams RS, Shohet RV, Minna JD, Garner HR. Repeat polymorphisms within gene regions: phenotypic and evolutionary implications. Am J Hum Genet 2000; 67:345-56. [PMID: 10889045 PMCID: PMC1287183 DOI: 10.1086/303013] [Citation(s) in RCA: 112] [Impact Index Per Article: 4.7] [Received: 04/10/2000] [Accepted: 06/02/2000] [Indexed: 11/03/2022] Open
Abstract
We have developed an algorithm that predicted 11,265 potentially polymorphic tandem repeats within transcribed sequences. We estimate that 22% (2,207/9,717) of the annotated clusters within UniGene contain at least one potentially polymorphic locus. Our predictions were tested by allelotyping a panel of approximately 30 individuals for 5% of these regions, confirming polymorphism for more than half the loci tested. Our study indicates that tandem-repeat polymorphisms in genes are more common than is generally believed. Approximately 8% of these loci are within coding sequences and, if polymorphic, would result in frameshifts. Our catalogue of putative polymorphic repeats within transcribed sequences comprises a large set of potentially phenotypic or disease-causing loci. In addition, from the anomalous character of the repetitive sequences within unannotated clusters, we also conclude that the UniGene cluster count substantially overestimates the number of genes in the human genome. We hypothesize that polymorphisms in repeated sequences occur with some baseline distribution, on the basis of repeat homogeneity, size, and sequence composition, and that deviations from that distribution are indicative of the nature of selection pressure at that locus. We find evidence of selective maintenance of the ability of some genes to respond very rapidly, perhaps even on intragenerational timescales, to fluctuating selective pressures.
Affiliation(s)
- J D Wren
- Program in Genetics, Southwestern Graduate School of Biomedical Sciences, Dallas, TX, USA