1
Chen Z, Xuan Y, Liang G, Yang X, Yu Z, Barker SC, Kelava S, Bu W, Liu J, Gao S. Precise annotation of tick mitochondrial genomes reveals multiple copy number variation of short tandem repeats and one transposon-like element. BMC Genomics 2020; 21:488. [PMID: 32680454 PMCID: PMC7367389 DOI: 10.1186/s12864-020-06906-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 12/05/2019] [Accepted: 07/10/2020] [Indexed: 02/07/2023]
Abstract
Background: In the present study, we used long-PCR amplification coupled with next-generation sequencing (NGS) to obtain complete mitochondrial (mt) genomes of individual ticks and, for the first time, performed precise annotation of these mt genomes. We aimed to: (1) develop a simple, cost-effective and accurate method for studying mt genomes with extremely high AT content within an individual animal (e.g. Dermacentor silvarum) that yields only minuscule amounts of DNA; (2) provide a high-quality, precisely annotated reference genome for D. silvarum that can also support future studies of other tick mt genomes; and (3) detect and analyze mt DNA variation within an individual tick.
Results: These annotations were confirmed by PacBio full-length transcriptome data to cover both strands of the mitochondrial genomes entirely, without any gaps or overlaps. Moreover, two new and important findings were reported for the first time, contributing fundamental knowledge to mt biology. The first was the discovery of a transposon-like element that may eventually reveal much about the mechanisms of gene rearrangement in mt genomes. The second was that copy number variation (CNV) of short tandem repeats (STRs) accounts for mitochondrial sequence diversity (heterogeneity) within an individual tick, insect, mouse or human, whereas SNPs were not detected. The CNV of STRs in protein-coding genes resulted in frameshift mutations in the proteins, which can have deleterious effects. Mitochondria containing these deleterious STR mutations accumulate in cells and can produce deleterious proteins.
Conclusions: We propose that the accumulation of CNV of STRs in mitochondria may cause aging or disease. Future tests of this hypothesis will help to reveal the genetic basis of mitochondrial DNA variation and its consequences (e.g., aging and disease) in animals. Our study will lead to a reconsideration of the importance of STRs and to a unified study of CNV of STRs with longer and shorter repeat units (particularly polynucleotides) in both nuclear and mt genomes.
Affiliation(s)
- Ze Chen
  - Hebei Key Laboratory of Animal Physiology, Biochemistry and Molecular Biology, College of Life Sciences, Hebei Normal University, Shijiazhuang, Hebei, 050024, P. R. China
- Yibo Xuan
  - Hebei Key Laboratory of Animal Physiology, Biochemistry and Molecular Biology, College of Life Sciences, Hebei Normal University, Shijiazhuang, Hebei, 050024, P. R. China; College of Life Sciences, Nankai University, Tianjin, Tianjin, 300071, P. R. China
- Guangcai Liang
  - Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin, Tianjin, 300350, P. R. China
- Xiaolong Yang
  - Hebei Key Laboratory of Animal Physiology, Biochemistry and Molecular Biology, College of Life Sciences, Hebei Normal University, Shijiazhuang, Hebei, 050024, P. R. China
- Zhijun Yu
  - Hebei Key Laboratory of Animal Physiology, Biochemistry and Molecular Biology, College of Life Sciences, Hebei Normal University, Shijiazhuang, Hebei, 050024, P. R. China
- Stephen C Barker
  - School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD, 4072, Australia
- Samuel Kelava
  - School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD, 4072, Australia
- Wenjun Bu
  - College of Life Sciences, Nankai University, Tianjin, Tianjin, 300071, P. R. China
- Jingze Liu
  - Hebei Key Laboratory of Animal Physiology, Biochemistry and Molecular Biology, College of Life Sciences, Hebei Normal University, Shijiazhuang, Hebei, 050024, P. R. China
- Shan Gao
  - College of Life Sciences, Nankai University, Tianjin, Tianjin, 300071, P. R. China; School of Statistics and Data Science, Nankai University, Tianjin, Tianjin, 300071, P. R. China
2
Maurice S, Montes MS, Nielsen BJ, Bødker L, Martin MD, Jønck CG, Kjøller R, Rosendahl S. Population genomics of an outbreak of the potato late blight pathogen, Phytophthora infestans, reveals both clonality and high genotypic diversity. Molecular Plant Pathology 2019; 20:1134-1146. [PMID: 31145530 PMCID: PMC6640178 DOI: 10.1111/mpp.12819] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 06/09/2023]
Abstract
An outbreak of the potato late blight pathogen Phytophthora infestans in Denmark was characterized in order to resolve the population structure and determine to what extent sexual reproduction was occurring. A standard set of microsatellite simple sequence repeats (SSRs) and single nucleotide polymorphism (SNP) markers generated using restriction site-associated DNA sequencing (RAD-seq) were employed in parallel. A total of 83 individuals, isolated from seven different potato fields in 2014, were analysed together with five Danish whole-genome sequenced isolates, as well as two Mexican individuals used as an outgroup. From a filtered dataset of 55 288 SNPs, population genomics analyses revealed no sign of recombination, implying clonality. In spite of this, multilocus genotypes were unique to individual potato fields, with little evidence of gene flow between fields. Ploidy analysis performed on the SNP dataset indicated that the majority of isolates were diploid. These seemingly contradictory findings of clonality together with high genotypic diversity suggest that rare sexual events may still contribute to the population. Comparison of the results generated from SSR vs SNP data indicated that large marker sets generated by RAD-seq may be advisable going forward, as they provide a higher level of genetic discrimination than SSRs.
Affiliation(s)
- Sundy Maurice
  - Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, Blindernveien 31, Oslo, 0316, Norway
- Melanie S. Montes
  - Department of Biology, University of Copenhagen, Universitetsparken 15, Copenhagen O, 2100, Denmark
- Bent J. Nielsen
  - Department of Agroecology, Aarhus University, Forsøgsvej 1, Slagelse, 4200, Denmark
- Lars Bødker
  - Danish Centre for Food and Agriculture, Aarhus University, Blichers Allé 20, Tjele, 8830, Denmark
- Michael D. Martin
  - Centre for Geogenetics, Natural History Museum of Denmark, Sølvgade 83, Copenhagen‐K, 1307, Denmark
- Carina G. Jønck
  - Department of Biology, University of Copenhagen, Universitetsparken 15, Copenhagen O, 2100, Denmark
- Rasmus Kjøller
  - Department of Biology, University of Copenhagen, Universitetsparken 15, Copenhagen O, 2100, Denmark
- Søren Rosendahl
  - Department of Biology, University of Copenhagen, Universitetsparken 15, Copenhagen O, 2100, Denmark
3
de Groot T, Meis JF. Microsatellite Stability in STR Analysis Aspergillus fumigatus Depends on Number of Repeat Units. Front Cell Infect Microbiol 2019; 9:82. [PMID: 30984630 PMCID: PMC6449440 DOI: 10.3389/fcimb.2019.00082] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 12/21/2018] [Accepted: 03/11/2019] [Indexed: 01/02/2023]
Abstract
More than a decade ago a short tandem repeat-based typing method was developed for the fungus Aspergillus fumigatus. This STRAf assay is based on the analysis of nine short tandem repeat markers. Interpretation of the assay is complicated when isolates differ in only one or two tandem repeat markers, as the stability of these markers is unknown. To determine the stability of these nine markers, a STRAf assay was performed on 73–100 successive generations of five clonally expanded A. fumigatus isolates. Over a total of 473 generations we observed five changes, each an increase of one tandem repeat unit. Three changes were found in the trinucleotide repeat marker STRAf 3A, while the other two were found in the trinucleotide repeat marker STRAf 3C. The di- and tetranucleotide repeats were not altered. The altered STRAf markers 3A and 3C had the highest number of repeat units (≥50) compared with the other markers (≤26). Altogether, we demonstrated that 7 of 9 STRAf markers remain stable for 473 generations and that the frequency of alterations in tandem repeats is positively correlated with the number of repeats. The potential low-level instability of STRAf markers 3A and 3C should be taken into account when interpreting STRAf data during an outbreak.
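The counts in this abstract can be turned into rough per-generation rates. The sketch below is illustrative only: it assumes each of the 473 generations is an independent trial for each marker, which the abstract does not state, and the function names are hypothetical. For markers with no observed change it reports an exact binomial upper bound rather than a rate of zero.

```python
# Counts quoted in the abstract: 473 generations in total,
# 3 single-unit increases in STRAf 3A and 2 in STRAf 3C.
GENERATIONS = 473
OBSERVED_CHANGES = {"STRAf 3A": 3, "STRAf 3C": 2}


def rate_per_generation(events, generations):
    """Point estimate of the change rate per marker per generation."""
    return events / generations


def zero_event_upper_bound(generations, confidence=0.95):
    """Upper bound on the per-generation rate when no change was seen.

    Assumes independent generations: the bound is the largest p for
    which (1 - p) ** generations is still >= 1 - confidence.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / generations)


if __name__ == "__main__":
    for marker, events in OBSERVED_CHANGES.items():
        rate = rate_per_generation(events, GENERATIONS)
        print(f"{marker}: {rate:.5f} changes per generation")
    bound = zero_event_upper_bound(GENERATIONS)
    print(f"Markers with no change: rate below {bound:.5f} (95% bound)")
```

Under this assumption, a marker with zero observed changes is not shown to be perfectly stable; the data only bound its rate at a level similar to the rates estimated for 3A and 3C.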
Affiliation(s)
- Theun de Groot
  - Department of Medical Microbiology and Infectious Diseases, Canisius Wilhelmina Hospital (CWZ), Nijmegen, Netherlands
- Jacques F Meis
  - Department of Medical Microbiology and Infectious Diseases, Canisius Wilhelmina Hospital (CWZ), Nijmegen, Netherlands; Centre of Expertise in Mycology, Radboudumc/CWZ, Nijmegen, Netherlands; Department of Medical Microbiology, Radboudumc, Nijmegen, Netherlands
4
Examination of Clock and Adcyap1 gene variation in a neotropical migratory passerine. PLoS One 2018; 13:e0190859. [PMID: 29324772 PMCID: PMC5764313 DOI: 10.1371/journal.pone.0190859] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 03/05/2017] [Accepted: 12/21/2017] [Indexed: 11/19/2022]
Abstract
Complex behavioral traits, such as those making up a migratory phenotype, are regulated by multiple environmental factors and multiple genes. We investigated possible relationships between microsatellite variation at two candidate genes implicated in the control of migratory behavior, Clock and Adcyap1, and several aspects of migratory life-history and evolutionary divergence in the Painted Bunting (Passerina ciris), a species that shows wide variation in migratory and molting strategies across a disjunct distribution. We focused on Clock and Adcyap1 microsatellite variation across three Painted Bunting populations in Oklahoma, Louisiana, and North Carolina, and for the Oklahoma breeding population we used published migration tracking data on adult males to explore phenotypic variation in individual migratory behavior. We found no correlation between microsatellite allele size within either Clock or Adcyap1 and the initiation or duration of fall migration in adult males breeding in Oklahoma. We also found no significant correlations with aspects of the migratory phenotype for the Louisiana population. Our research highlights the limitations of studying microsatellite allelic mutations of undetermined functional influence in relation to complex behavioral phenotypes.
5
Ngai MY, Saitou N. The effect of perfection status on mutation rates of microsatellites in primates. Anthropol Sci 2016. [DOI: 10.1537/ase.160124] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/01/2022]
Affiliation(s)
- Ming Yin Ngai
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo
  - Division of Population Genetics, National Institute of Genetics, Mishima
- Naruya Saitou
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo
  - Division of Population Genetics, National Institute of Genetics, Mishima
6
Lenz C, Haerty W, Golding GB. Increased substitution rates surrounding low-complexity regions within primate proteins. Genome Biol Evol 2014; 6:655-65. [PMID: 24572016 PMCID: PMC3971593 DOI: 10.1093/gbe/evu042] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 12/30/2022]
Abstract
Previous studies have found that DNA-flanking low-complexity regions (LCRs) have an increased substitution rate. Here, the substitution rate was confirmed to increase in the vicinity of LCRs in several primate species, including humans. This effect was also found among human sequences from the 1000 Genomes Project. A strong correlation was found between average substitution rate per site and distance from the LCR, as well as the proportion of genes with gaps in the alignment at each site and distance from the LCR. Along with substitution rates, dN/dS ratios were also determined for each site, and the proportion of sites undergoing negative selection was found to have a negative relationship with distance from the LCR.
Affiliation(s)
- Carolyn Lenz
  - Department of Biology, McMaster University, Hamilton, Ontario, Canada
7
Abstract
Birth-death processes (BDPs) are continuous-time Markov chains that track the number of "particles" in a system over time. While widely used in population biology, genetics and ecology, statistical inference of the instantaneous particle birth and death rates remains largely limited to restrictive linear BDPs in which per-particle birth and death rates are constant. Researchers often observe the number of particles at discrete times, necessitating data augmentation procedures such as expectation-maximization (EM) to find maximum likelihood estimates. For BDPs on finite state-spaces, there are powerful matrix methods for computing the conditional expectations needed for the E-step of the EM algorithm. For BDPs on infinite state-spaces, closed-form solutions for the E-step are available for some linear models, but most previous work has resorted to time-consuming simulation. Remarkably, we show that the E-step conditional expectations can be expressed as convolutions of computable transition probabilities for any general BDP with arbitrary rates. This important observation, along with a convenient continued fraction representation of the Laplace transforms of the transition probabilities, allows for novel and efficient computation of the conditional expectations for all BDPs, eliminating the need for truncation of the state-space or costly simulation. We use this insight to derive EM algorithms that yield maximum likelihood estimation for general BDPs characterized by various rate models, including generalized linear models. We show that our Laplace convolution technique outperforms competing methods when they are available and demonstrate a technique to accelerate EM algorithm convergence. We validate our approach using synthetic data and then apply our methods to cancer cell growth and estimation of mutation parameters in microsatellite evolution.
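To make the notion of a general BDP with arbitrary rates concrete, here is a minimal forward simulation using the standard Gillespie approach. It is not the Laplace-convolution EM method the abstract describes; it is the kind of plain simulation the abstract contrasts that method with. The function names and the example rate functions are assumptions made for illustration.

```python
import random


def simulate_bdp(n0, birth_rate, death_rate, t_end, rng):
    """Simulate one path of a general birth-death process.

    birth_rate(n) and death_rate(n) give the total birth and death
    rates in state n, so they may be any non-negative functions.
    Returns a list of (time, state) pairs starting at (0.0, n0).
    """
    t, n = 0.0, n0
    path = [(t, n)]
    while True:
        lam, mu = birth_rate(n), death_rate(n)
        total = lam + mu
        if total <= 0.0:
            break  # absorbing state: no further events can occur
        t += rng.expovariate(total)  # waiting time to the next event
        if t > t_end:
            break
        n += 1 if rng.random() < lam / total else -1
        path.append((t, n))
    return path


def state_at(path, t):
    """State of a simulated path at time t (last event at or before t)."""
    state = path[0][1]
    for event_time, n in path:
        if event_time > t:
            break
        state = n
    return state


if __name__ == "__main__":
    rng = random.Random(1)
    # Linear BDP as an example: per-particle birth 0.5 and death 0.3.
    path = simulate_bdp(10, lambda n: 0.5 * n, lambda n: 0.3 * n, 5.0, rng)
    print("events:", len(path) - 1, "state at t=5:", state_at(path, 5.0))
```

Observing such a path only at discrete times, via `state_at`, reproduces the data setting the abstract refers to: the intermediate births and deaths are hidden, which is why an augmentation scheme such as EM is needed for inference.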
Affiliation(s)
- Forrest W Crawford
  - Department of Biostatistics, Yale University, 60 College Street, Box 208034, New Haven, CT 06510 USA
- Vladimir N Minin
  - Department of Statistics, University of Washington, Padelford Hall C-315, Box 354322, Seattle, WA 98195-4322 USA
- Marc A Suchard
  - Department of Biomathematics, University of California Los Angeles, 6558 Gonda Building, Los Angeles, CA 90095-1766 USA; Department of Biostatistics, University of California Los Angeles, 6558 Gonda Building, Los Angeles, CA 90095-1766 USA; Department of Human Genetics, University of California Los Angeles, 6558 Gonda Building, Los Angeles, CA 90095-1766 USA
8
LaRue BL, Lagacé R, Chang CW, Holt A, Hennessy L, Ge J, King JL, Chakraborty R, Budowle B. Characterization of 114 insertion/deletion (INDEL) polymorphisms, and selection for a global INDEL panel for human identification. Leg Med (Tokyo) 2014; 16:26-32. [DOI: 10.1016/j.legalmed.2013.10.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 02/08/2013] [Revised: 08/19/2013] [Accepted: 10/22/2013] [Indexed: 11/15/2022]
9
Loire E, Higuet D, Netter P, Achaz G. Evolution of coding microsatellites in primate genomes. Genome Biol Evol 2013; 5:283-95. [PMID: 23315383 PMCID: PMC3590770 DOI: 10.1093/gbe/evt003] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 01/07/2023]
Abstract
Microsatellites (SSRs) are highly susceptible to expansions and contractions. When located in a coding sequence, the insertion or the deletion of a single unit for a mono-, di-, tetra-, or penta(nucleotide)-SSR creates a frameshift. As a consequence, one would expect to find only very few of these SSRs in coding sequences because of their strong deleterious potential. Unexpectedly, genomes contain many coding SSRs of all types. Here, we report on a study of their evolution in a phylogenetic context using the genomes of four primates: human, chimpanzee, orangutan, and macaque. In a set of 5,015 orthologous genes unambiguously aligned among the four species, we show that, except for tri- and hexa-SSRs, for which insertions and deletions are frequently observed, SSRs in coding regions evolve mainly by substitutions. We show that the rate of substitution in all types of coding SSRs is typically two times higher than in the rest of coding sequences. Additionally, we observe that although numerous coding SSRs are created and lost by substitutions in the lineages, their numbers remain constant. This last observation suggests that the coding SSRs have reached equilibrium. We hypothesize that this equilibrium involves a combination of mutation, drift, and selection. We thus estimated the fitness cost of mono-SSRs and show that it increases with the number of units. We finally show that the cost of coding mono-SSRs greatly varies from function to function, suggesting that the strength of the selection that acts against them can be correlated to gene functions.
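The frameshift rule stated at the start of this abstract follows from codon arithmetic: gaining or losing repeat units shifts the reading frame unless the total number of bases changed is a multiple of three. A minimal sketch, with hypothetical names:

```python
UNIT_NAMES = {1: "mono", 2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}


def shifts_frame(unit_length, units_changed=1):
    """True if inserting or deleting this many repeat units breaks the frame.

    The reading frame is preserved only when the number of bases
    gained or lost is divisible by the codon length, 3.
    """
    return (unit_length * units_changed) % 3 != 0


if __name__ == "__main__":
    for length, name in UNIT_NAMES.items():
        effect = "frameshift" if shifts_frame(length) else "in-frame"
        print(f"{name}-SSR, one unit inserted or deleted: {effect}")
```

This matches the abstract's grouping: single-unit changes in mono-, di-, tetra- and penta-SSRs shift the frame, while tri- and hexa-SSRs stay in frame, consistent with insertions and deletions being frequently observed only for the latter two.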
Affiliation(s)
- Etienne Loire
  - UMR 7138, Systématique, Adaptation, Evolution (UPMC, CNRS, MNHN, IRD), Paris, France
10
Mature microsatellites: mechanisms underlying dinucleotide microsatellite mutational biases in human cells. G3: Genes, Genomes, Genetics 2013; 3:451-63. [PMID: 23450065 PMCID: PMC3583453 DOI: 10.1534/g3.112.005173] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 10/20/2012] [Accepted: 12/30/2012] [Indexed: 12/19/2022]
Abstract
Dinucleotide microsatellites are dynamic DNA sequences that affect genome stability. Here, we focused on mature microsatellites, defined as pure repeats of lengths above the threshold and unlikely to mutate below it in a single mutational event. We investigated the prevalence and mutational behavior of these sequences by using human genome sequence data, human cells in culture, and purified DNA polymerases. Mature dinucleotides (≥10 units) are present within exonic sequences of >350 genes, resulting in vulnerability to cellular genetic integrity. Mature dinucleotide mutagenesis was examined experimentally using ex vivo and in vitro approaches. We observe an expansion bias for dinucleotide microsatellites up to 20 units in length in somatic human cells, in agreement with previous computational analyses of germ-line biases. Using purified DNA polymerases and human cell lines deficient for mismatch repair (MMR), we show that the expansion bias is caused by functional MMR and is not due to DNA polymerase error biases. Specifically, we observe that the MutSα and MutLα complexes protect against expansion mutations. Our data support a model wherein different MMR complexes shift the balance of mutations toward deletion or expansion. Finally, we show that replication fork progression is stalled within long dinucleotides, suggesting that mutational mechanisms within long repeats may be distinct from shorter lengths, depending on the biochemistry of fork resolution. Our work combines computational and experimental approaches to explain the complex mutational behavior of dinucleotide microsatellites in humans.
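The abstract's working definition of a mature dinucleotide (a pure repeat of at least 10 units) can be expressed as a simple sequence scan. The sketch below is an illustration only, not the authors' pipeline; the function name and the return format are assumptions.

```python
import re


def find_mature_dinucleotides(sequence, min_units=10):
    """Find pure dinucleotide repeats with at least min_units units.

    Returns a list of (start, unit, n_units) tuples using 0-based
    start positions. Homopolymer runs such as AAAA are excluded by
    requiring the two bases of the repeat unit to differ.
    """
    pattern = re.compile(
        rf"([ACGT])(?!\1)([ACGT])(?:\1\2){{{min_units - 1},}}"
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1) + match.group(2)
        n_units = (match.end() - match.start()) // 2
        hits.append((match.start(), unit, n_units))
    return hits


if __name__ == "__main__":
    example = "GG" + "AC" * 12 + "TTGA" + "GT" * 5
    print(find_mature_dinucleotides(example))
```

A run is reported from the first position at which a qualifying repeat starts, so the same tract is never counted twice in a shifted phase (for example once as AC and once as CA).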
11
Abstract
It has been known for many years that the mutation rate varies across the genome. However, only with the advent of large genomic data sets is the full extent of this variation becoming apparent. The mutation rate varies over many different scales, from adjacent sites to whole chromosomes, with the strongest variation seen at the smallest scales. Some of these patterns have clear mechanistic bases, but much of the rate variation remains unexplained, and some of it is deeply perplexing. Variation in the mutation rate has important implications in evolutionary biology and underexplored implications for our understanding of hereditary disease and cancer.
12
Haerty W, Golding GB. Increased polymorphism near low-complexity sequences across the genomes of Plasmodium falciparum isolates. Genome Biol Evol 2011; 3:539-50. [PMID: 21602572 PMCID: PMC3140889 DOI: 10.1093/gbe/evr045] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 11/12/2022]
Abstract
Low-complexity regions (LCRs) within protein sequences are often considered to evolve neutrally, even though recent studies have reported evidence for selection acting on some of them. Because of their widespread distribution among eukaryote genomes and the potentially deleterious effect of expansion or contraction of some of them in humans, low-complexity sequences are of major interest, and numerous studies have attempted to describe their dynamics between genomes, to identify the factors correlated with their variation, and to assess their selective value. However, due to the scarcity of individual genomes within a species, most analyses so far have been performed at the species level, with the implicit assumption that within-species variation in both composition and size is too small relative to between-species divergence to affect the conclusions. Here we used the available genomes of 14 Plasmodium falciparum isolates to assess the relationship between low-complexity sequence variation and factors such as nucleotide polymorphism across strains, sequence composition, and protein expression. We report that more than half of the 7,711 low-complexity sequences found within aligned coding sequences are variable in size among strains. Across strains, we observed an increasing density of polymorphic sites toward the LCR boundaries. This observation strongly suggests the joint effects of lowered selective constraints on low-complexity sequences and a mutagenic effect of these simple sequences.
Affiliation(s)
- Wilfried Haerty
  - Department of Biology, McMaster University, Hamilton, Ontario, Canada