1
Sang MK, Patnaik HH, Park JE, Song DK, Jeong JY, Hong CE, Kim YT, Shin HJ, Ziwei L, Hwang HJ, Park SY, Kang SW, Park SH, Cha SJ, Ko JH, Shin EH, Park HS, Jo YH, Han YS, Patnaik BB, Lee YS. Transcriptome analysis of Haemaphysalis flava female using Illumina HiSeq 4000 sequencing: de novo assembly, functional annotation and discovery of SSR markers. Parasit Vectors 2023; 16:367. [PMID: 37848984 PMCID: PMC10583488 DOI: 10.1186/s13071-023-05923-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Accepted: 08/09/2023] [Indexed: 10/19/2023] Open
Abstract
BACKGROUND Ticks are ectoparasites capable of directly damaging their hosts and transmitting vector-borne diseases. The ixodid tick Haemaphysalis flava has a broad distribution that extends from East to South Asia. This tick is a reservoir of severe fever with thrombocytopenia syndrome virus (SFTSV) that causes severe hemorrhagic disease, with cases reported from China, Japan and South Korea. Recently, the distribution of H. flava in South Korea was found to overlap with the occurrence of SFTSV. METHODS This study was undertaken to discover the molecular resources of H. flava female ticks using the Illumina HiSeq 4000 system, the Trinity de novo sequence assembler and annotation against public databases. The locally curated Protostome database (PANM-DB) was used to screen the putative adaptation-related transcripts classified to gene families, such as angiotensin-converting enzyme, aquaporin, adenylate cyclase, AMP-activated protein kinase, glutamate receptors, heat shock proteins, molecular chaperones, insulin receptor, mitogen-activated protein kinase and solute carrier family proteins. Also, the repeats and simple sequence repeats (SSRs) were screened from the unigenes using RepeatMasker (v4.0.6) and MISA (v1.0) software tools, followed by the designing of SSRs flanking primers using BatchPrimer 3 (v1.0) software. RESULTS The transcriptome produced a total of 69,822 unigenes, of which 46,175 annotated to the homologous proteins in the PANM-DB. The unigenes were also mapped to the EuKaryotic Orthologous Groups (KOG), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) specializations. Promiscuous presence of protein kinase, zinc finger (C2H2-type), reverse transcriptase, and RNA recognition motif domains was observed in the unigenes. A total of 3480 SSRs were screened, of which 1907 and 1274 were found as tri- and dinucleotide repeats, respectively. 
A list of primer sequences flanking the SSR motifs was detailed for validation of polymorphism in H. flava and the related tick species. CONCLUSIONS The reference transcriptome information on H. flava female ticks will be useful for an enriched understanding of tick biology, its competency to act as a vector and the study of species diversity related to disease transmission.
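The SSR screening step described in this abstract can be illustrated with a minimal sketch. It is not the MISA tool the authors used, only a regular-expression search for perfect di- to hexanucleotide repeats. The minimum repeat counts in `MIN_REPEATS`, the function names, and the example sequence are all hypothetical and are not taken from the paper.

```python
import re
from collections import Counter

# Hypothetical minimum repeat counts per motif length (illustrative only,
# not the thresholds used in the study).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit, min_rep in sorted(min_repeats.items()):
        # The group captures one motif; \1{n,} demands at least n more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit ("AA", "ATAT").
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit))
    return sorted(hits)


def count_by_motif_length(hits):
    """Tally SSRs by motif length (2 = dinucleotide, 3 = trinucleotide, ...)."""
    return Counter(len(motif) for _, motif, _ in hits)


# Made-up example sequence: (AT)7 and (AAG)5 embedded in flanking bases.
example = "GG" + "AT" * 7 + "CCGT" + "AAG" * 5 + "TT"
ssrs = find_perfect_ssrs(example)
```

Tallying the hits with `count_by_motif_length` gives the kind of di- versus trinucleotide breakdown reported in the abstract. Dedicated tools additionally handle compound and interrupted repeats, which this sketch ignores.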
Affiliation(s)
- Min Kyu Sang
  - Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, South Korea
  - Research Support Center for Bio-Bigdata Analysis and Utilization of Biological Resources, Soonchunhyang University, Asan, Chungnam, South Korea
- Hongray Howrelia Patnaik
  - Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, South Korea
- Jie Eun Park
  - Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, South Korea
  - Research Support Center for Bio-Bigdata Analysis and Utilization of Biological Resources, Soonchunhyang University, Asan, Chungnam, South Korea
- Dae Kwon Song
  - Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, South Korea
  - Research Support Center for Bio-Bigdata Analysis and Utilization of Biological Resources, Soonchunhyang University, Asan, Chungnam, South Korea
- Jun Yang Jeong
  - Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, South Korea
  - Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, South Korea
- Chan Eui Hong
  - Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, South Korea
  - Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, South Korea
- Yong Tae Kim
  - Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, South Korea
  - Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, South Korea
- Hyeon Jun Shin
  - Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, South Korea
  - Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, South Korea
- Liu Ziwei
  - Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, South Korea
  - Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, South Korea
- Hee Ju Hwang
  - Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, South Korea
- So Young Park
  - Biodiversity Research Team, Animal & Plant Research Department, Nakdonggang National Institute of Biological Resources, Sangju, Gyeongbuk, South Korea
- Se Won Kang
  - Biological Resource Center (BRC), Korea Research Institute of Bioscience and Biotechnology (KRIBB), Jeongeup, Jeonbuk, South Korea
- Seung-Hwan Park
  - Biological Resource Center (BRC), Korea Research Institute of Bioscience and Biotechnology (KRIBB), Jeongeup, Jeonbuk, South Korea
- Sung-Jae Cha
  - Johns Hopkins Malaria Research Institute, Department of Molecular Microbiology & Immunology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Jung Ho Ko
  - Police Science Institute, Korean National Police University, Asan, Chungnam, 31539, South Korea
- E Hyun Shin
  - Research Institute, Korea Pest Control Association, Seoul, 08501, South Korea
- Hong Seog Park
  - Research Institute, GnC BIO Co., LTD., 621-6 Banseok-dong, Yuseong-gu, Daejeon, 34069, South Korea
- Yong Hun Jo
  - Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, South Korea
  - Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, South Korea
- Yeon Soo Han
  - College of Agriculture and Life Science, Chonnam National University, 77 Yongbong-ro, Buk-gu, Gwangju, 61186, South Korea
- Bharat Bhusan Patnaik
  - Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, South Korea
  - Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, South Korea
  - PG Department of Biosciences and Biotechnology, Fakir Mohan University, Nuapadhi, Balasore, Odisha, 756089, India
- Yong Seok Lee
  - Korea Native Animal Resources Utilization Convergence Research Institute (KNAR), Soonchunhyang University, Asan, Chungnam, South Korea
  - Research Support Center for Bio-Bigdata Analysis and Utilization of Biological Resources, Soonchunhyang University, Asan, Chungnam, South Korea
  - Department of Biology, College of Natural Sciences, Soonchunhyang University, Asan, 31538, Chungnam, South Korea
2
Song X, Yang T, Zhang X, Yuan Y, Yan X, Wei Y, Zhang J, Zhou C. Comparison of the Microsatellite Distribution Patterns in the Genomes of Euarchontoglires at the Taxonomic Level. Front Genet 2021; 12:622724. [PMID: 33719337 PMCID: PMC7953163 DOI: 10.3389/fgene.2021.622724] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/29/2020] [Accepted: 02/05/2021] [Indexed: 02/05/2023] Open
Abstract
Microsatellite or simple sequence repeat (SSR) instability within genes can induce genetic variation. The SSR signatures remain largely unknown in different clades within Euarchontoglires, one of the most successful mammalian radiations. Here, we conducted a genome-wide characterization of microsatellite distribution patterns at different taxonomic levels in 153 Euarchontoglires genomes. Our results showed that the abundance and density of the SSRs were significantly positively correlated with primate genome size, but no significant relationship with the genome size of rodents was found. Furthermore, a higher level of complexity for perfect SSR (P-SSR) attributes was observed in rodents than in primates. The most frequent type of P-SSR was the mononucleotide P-SSR in the genomes of primates, tree shrews, and colugos, while mononucleotide or dinucleotide motif types were dominant in the genomes of rodents and lagomorphs. Furthermore, (A)n was the most abundant motif in primate genomes, but (A)n, (AC)n, or (AG)n was the most abundant motif in rodent genomes which even varied within the same genus. The GC content and the repeat copy numbers of P-SSRs varied in different species when compared at different taxonomic levels, reflecting underlying differences in SSR mutation processes. Notably, the CDSs containing P-SSRs were categorized by functions and pathways using Gene Ontology and Kyoto Encyclopedia of Genes and Genomes annotations, highlighting their roles in transcription regulation. Generally, this work will aid future studies of the functional roles of the taxonomic features of microsatellites during the evolution of mammals in Euarchontoglires.
Affiliation(s)
- Xuhao Song
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong, China
  - Institute of Ecology, China West Normal University, Nanchong, China
- Tingbang Yang
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong, China
  - Institute of Ecology, China West Normal University, Nanchong, China
- Xinyi Zhang
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong, China
- Ying Yuan
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong, China
- Xianghui Yan
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong, China
- Yi Wei
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong, China
  - Institute of Ecology, China West Normal University, Nanchong, China
- Jun Zhang
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong, China
  - Institute of Ecology, China West Normal University, Nanchong, China
- Caiquan Zhou
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong, China
  - Institute of Ecology, China West Normal University, Nanchong, China
3
Song X, Yang T, Yan X, Zheng F, Xu X, Zhou C. Comparison of microsatellite distribution patterns in twenty-nine beetle genomes. Gene 2020; 757:144919. [PMID: 32603771 DOI: 10.1016/j.gene.2020.144919] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 02/01/2020] [Revised: 06/15/2020] [Accepted: 06/20/2020] [Indexed: 01/20/2023]
Abstract
Simple sequence repeats (SSRs) represent an important source of genetic variation that provides a basis for adaptation to different environments in organisms. In this study, we examined the distribution patterns of SSRs in twenty-nine beetle genomes and carried out Gene Ontology (GO) analysis of CDSs embedded with perfect SSRs (P-SSRs). The results demonstrated that imperfect SSRs (I-SSRs) represented the most abundant SSR category in beetle genomes and in different genomic regions (CDS, exon, and intron regions). The numbers of P-SSRs, I-SSRs, compound SSRs, and variable number tandem repeats were positively correlated with beetle genome size, whereas neither the frequency nor the density of the SSRs was correlated with genome size. Moreover, our results demonstrated that common genomic features of P-SSRs within the same suborder or family of Coleoptera were rare. Mono-, di-, tri-, or tetranucleotide SSRs were the most abundant P-SSR categories in beetle genomes. The preferred predominant repeat motif among the mononucleotide P-SSRs was (A)n, but the most frequent repeat motifs for other length classes varied differentially among these genomes. Furthermore, the P-SSR type with the highest GC content differed in the beetle genomes and in different genomic regions. CV (coefficient of variability) analysis demonstrated that the repeat copy numbers of P-SSRs presented relatively higher variation in introns than in CDSs and exons. The GO terms of CDSs containing P-SSRs for molecular functions were mainly enriched in "binding" and "transcription". Our findings will be useful for studying the functional roles of microsatellite heterogeneity in beetle adaptation.
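The CV comparison mentioned in this abstract comes down to dividing a standard deviation by a mean. The sketch below assumes the population standard deviation, since the abstract does not say which estimator the authors used. The repeat copy numbers are made-up values chosen only to show the calculation.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population CV: standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return pstdev(values) / m


# Hypothetical repeat copy numbers of P-SSRs per genomic region
# (illustrative values, not data from the study).
copy_numbers = {
    "CDS": [5, 5, 6, 5, 6],
    "intron": [5, 9, 14, 6, 21],
}
cv_by_region = {
    region: coefficient_of_variation(values)
    for region, values in copy_numbers.items()
}
```

Because the CV is unitless, it allows regions with different mean copy numbers to be compared on the same scale.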
Affiliation(s)
- Xuhao Song
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong 637009, Sichuan Province, China
- Tingbang Yang
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong 637009, Sichuan Province, China
- Xianghui Yan
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong 637009, Sichuan Province, China
- Fake Zheng
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong 637009, Sichuan Province, China
- Xiaoqin Xu
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong 637009, Sichuan Province, China
- Caiquan Zhou
  - Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong 637009, Sichuan Province, China
4
Wang X, Zhang Y, Qiao L, Chen B. Comparative analyses of simple sequence repeats (SSRs) in 23 mosquito species genomes: Identification, characterization and distribution (Diptera: Culicidae). Insect Science 2019; 26:607-619. [PMID: 29484820 PMCID: PMC7379697 DOI: 10.1111/1744-7917.12577] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 06/02/2017] [Revised: 01/20/2018] [Accepted: 01/24/2018] [Indexed: 05/28/2023]
Abstract
Simple sequence repeats (SSRs) exist in both eukaryotic and prokaryotic genomes and are the most popular genetic markers, but the SSRs of mosquito genomes are still not well understood. In this study, we identified and analyzed the SSRs in 23 mosquito species at the whole-genome level, using Drosophila melanogaster as a reference. The results show that SSR numbers (33 076-560 175/genome) and genome sizes (574.57-1342.21 Mb) are significantly positively correlated (R² = 0.8992, P < 0.01), but the correlation varies among individual mosquito species. Among the six types of SSR, mono- to trinucleotide SSRs are dominant, with cumulative percentages of 95.14%-99.00% and densities of 195.65/Mb-787.51/Mb, whereas tetra- to hexanucleotide SSRs are rare, with 1.12%-4.22% and 3.76/Mb-40.23/Mb. The (A/T)n, (AC/GT)n and (AGC/GCT)n are the most frequent motifs in mononucleotide, dinucleotide and trinucleotide SSRs, respectively, and the motif frequencies of tetra- to hexanucleotide SSRs appear to be species-specific. SSRs of 10-20 bp in length are dominant, with a number of 110 561 ± 93 482 and a frequency of 87.25% ± 5.73% on average, and both number and frequency decline as length increases. Most SSRs (83.34% ± 7.72%) are located in intergenic regions, followed by intron regions (11.59% ± 5.59%), exon regions (3.74% ± 1.95%), and untranslated regions (1.32% ± 1.39%). The mono-, di- and trinucleotide SSRs are the main SSRs in both gene regions (98.55% ± 0.85%) and exon regions (99.27% ± 0.52%). An average of 42.52% of total genes contain SSRs, and the preference for SSR occurrence in different gene subcategories is species-specific. The study provides useful insights into SSR diversity, characteristics and distribution in the genomes of 23 mosquito species.
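Two quantities in this abstract, SSR density per megabase and the correlation between SSR number and genome size, can be computed as in the sketch below. It assumes the reported R² is the squared Pearson coefficient of a simple linear fit, which the abstract does not state. The function names and the input table are hypothetical, not data from the study.

```python
from math import sqrt


def ssr_density_per_mb(ssr_count, genome_size_mb):
    """SSRs per megabase of genome sequence."""
    return ssr_count / genome_size_mb


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical (genome size in Mb, SSR count) pairs, illustrative only.
genomes = [(150.0, 40_000), (250.0, 90_000), (400.0, 130_000), (600.0, 210_000)]
sizes = [size for size, _ in genomes]
counts = [count for _, count in genomes]

r_squared = pearson_r(sizes, counts) ** 2
densities = [ssr_density_per_mb(count, size) for size, count in genomes]
```

A significance test such as the reported P < 0.01 would need an additional step, for example a t-test on the correlation coefficient, which is omitted here.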
Affiliation(s)
- Xiao‐Ting Wang
  - Chongqing Key Laboratory of Vector Insects; Chongqing Key Laboratory of Animal Biology; Institute of Entomology and Molecular Biology, Chongqing Normal University, Chongqing, China
- Yu‐Juan Zhang
  - Chongqing Key Laboratory of Vector Insects; Chongqing Key Laboratory of Animal Biology; Institute of Entomology and Molecular Biology, Chongqing Normal University, Chongqing, China
- Liang Qiao
  - Chongqing Key Laboratory of Vector Insects; Chongqing Key Laboratory of Animal Biology; Institute of Entomology and Molecular Biology, Chongqing Normal University, Chongqing, China
- Bin Chen
  - Chongqing Key Laboratory of Vector Insects; Chongqing Key Laboratory of Animal Biology; Institute of Entomology and Molecular Biology, Chongqing Normal University, Chongqing, China
5
Ding S, Wang S, He K, Jiang M, Li F. Large-scale analysis reveals that the genome features of simple sequence repeats are generally conserved at the family level in insects. BMC Genomics 2017; 18:848. [PMID: 29110701 PMCID: PMC5674736 DOI: 10.1186/s12864-017-4234-0] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 07/29/2017] [Accepted: 10/23/2017] [Indexed: 01/19/2023] Open
Abstract
BACKGROUND Simple sequence repeats (SSR), also called microsatellites, have been widely used as genetic markers, and have been extensively studied in some model insects. At present, the genomes of more than 100 insect species are available. However, the features of SSRs in most insect genomes remain largely unknown. RESULTS We identified 15.01 million SSRs across 136 insect genomes. The number of identified SSRs was positively associated with genome size in insects, but the frequency and density per megabase of genomes were not. Most insect SSRs (56.2-93.1%) were perfect (no mismatch). Imperfect (at least one mismatch) SSRs (average length 22-73 bp) were longer than perfect SSRs (16-30 bp). The most abundant insect SSRs were the di- and trinucleotide types, which accounted for 27.2% and 22.0% of all SSRs, respectively. On average, 59.1%, 36.8%, and 3.7% of insect SSRs were located in intergenic, intronic, and exonic regions, respectively. The percentages of various types of SSRs were similar among insects from the same family. However, they were dissimilar among insects from different families within orders. We carried out a phylogenetic analysis using the SSR frequencies. Species from the same family were generally clustered together in the evolutionary tree. However, insects from the same order but not in the same family did not cluster together. These results indicated that although SSRs undergo rapid expansions and contractions in different populations of the same species, the general genomic features of insect SSRs remain conserved at the family level. CONCLUSION Millions of insect SSRs were identified and their genome features were analyzed. Most insect SSRs were perfect and were located in intergenic regions. We presented evidence that the variance of insect SSRs accumulated after the differentiation of insect families.
Affiliation(s)
- Simin Ding
  - Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insects, Zhejiang University, 866 Yuhangtang Road, Hangzhou, 310058 China
- Shuping Wang
  - Technical Centre for Animal Plant and Food Inspection and Quarantine, Shanghai Entry-exit Inspection and Quarantine Bureau, Shanghai, 200135 China
- Kang He
  - Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insects, Zhejiang University, 866 Yuhangtang Road, Hangzhou, 310058 China
- Mingxing Jiang
  - Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insects, Zhejiang University, 866 Yuhangtang Road, Hangzhou, 310058 China
- Fei Li
  - Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insects, Zhejiang University, 866 Yuhangtang Road, Hangzhou, 310058 China
6
Bishnoi R, Singla D. APMicroDB: A microsatellite database of Acyrthosiphon pisum. Genomics Data 2017; 12:111-115. [PMID: 28413782 PMCID: PMC5384296 DOI: 10.1016/j.gdata.2017.03.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2016] [Revised: 03/23/2017] [Accepted: 03/26/2017] [Indexed: 11/28/2022]
Abstract
Pea aphids represent a complex genetic system that could be used for QTL analysis, genetic diversity and population genetics studies. Here, we describe the development of the first microsatellite repeat database of the pea aphid (APMicroDB), accessible at “http://deepaklab.com/aphidmicrodb”. We identified 340,233 SSRs using the MIcroSAtellite (MISA) tool; these were distributed across 14,067 (out of 23,924) scaffolds of the pea aphid. We observed 89.53% simple repeats, of which 73.41% were mononucleotide, followed by dinucleotide repeats. The database stores information about the repeat kind, GC content, motif type (mono- to hexanucleotide), genomic location, etc. We have also incorporated primer information, derived with Primer3 software from the 250 bp flanking regions of each identified marker. A BLAST tool is also provided for searching a user's query sequence against the identified markers and their primers. This work will be of considerable use to the scientific community working in the fields of agricultural pest management, QTL mapping, and host-pathogen interaction analysis.
Affiliation(s)
- Ritika Bishnoi
  - Institute of Microbial Technology, Sector 39-A, Chandigarh 160036, India
- Deepak Singla
  - Centre for Agricultural Bioinformatics, Indian Agricultural Statistics Research Institute, Library Avenue, Pusa, New Delhi, India
7
Battistuzzi FU, Schneider KA, Spencer MK, Fisher D, Chaudhry S, Escalante AA. Profiles of low complexity regions in Apicomplexa. BMC Evol Biol 2016; 16:47. [PMID: 26923229 PMCID: PMC4770516 DOI: 10.1186/s12862-016-0625-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 09/12/2015] [Accepted: 02/17/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Low complexity regions (LCRs) are a ubiquitous feature in genomes and yet their evolutionary history and functional roles are unclear. Previous studies have shown contrasting evidence in favor of both neutral and selective mechanisms of evolution for different sets of LCRs suggesting that modes of identification of these regions may play a role in our ability to discern their evolutionary history. To further investigate this issue, we used a multiple threshold approach to identify species-specific profiles of proteome complexity and, by comparing properties of these sets, determine the influence that starting parameters have on evolutionary inferences. RESULTS We find that, although qualitatively similar, quantitatively each species has a unique LCR profile which represents the frequency of these regions within each genome. Inferences based on these profiles are more accurate in comparative analyses of genome complexity as they allow to determine the relative complexity of multiple genomes as well as the type of repetitiveness that is most common in each. Based on the multiple threshold LCR sets obtained, we identified predominant evolutionary mechanisms at different complexity levels, which show neutral mechanisms acting on highly repetitive LCRs (e.g., homopolymers) and selective forces becoming more important as heterogeneity of the LCRs increases. CONCLUSIONS Our results show how inferences based on LCRs are influenced by the parameters used to identify these regions. Sets of LCRs are heterogeneous aggregates of regions that include homo- and heteropolymers and, as such, evolve according to different mechanisms. LCR profiles provide a new way to investigate genome complexity across species and to determine the driving mechanism of their evolution.
Affiliation(s)
- Kristan A Schneider
  - Department of MNI, University of Applied Sciences Mittweida, Mittweida, Germany
- Matthew K Spencer
  - Department of Geology and Physics, Lake Superior State University, Sault Ste. Marie, MI, USA
- David Fisher
  - David Eccles School of Business, University of Utah, Salt Lake City, UT, USA
- Sophia Chaudhry
  - Department of Biological Sciences, Oakland University, Rochester, MI, USA
  - Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, USA
- Ananias A Escalante
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA
8
Behura SK, Severson DW. Motif mismatches in microsatellites: insights from genome-wide investigation among 20 insect species. DNA Res 2014; 22:29-38. [PMID: 25378245 PMCID: PMC4379975 DOI: 10.1093/dnares/dsu036] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 12/21/2022] Open
Abstract
We present a detailed genome-wide comparative study of motif mismatches of microsatellites among 20 insect species representing five taxonomic orders. The results show that varying proportions (∼15-46%) of microsatellites identified in these species are imperfect in motif structure, and that they also vary in chromosomal distribution within genomes. It was observed that the genomic abundance of imperfect repeats is significantly associated with the length and number of motif mismatches of microsatellites. Furthermore, microsatellites with a higher number of mismatches tend to have lower abundance in the genome, suggesting that sequence heterogeneity of repeat motifs is a key determinant of genomic abundance of microsatellites. This relationship seems to be a general feature of microsatellites even in unrelated species such as yeast, roundworm, mouse and human. We provide a mechanistic explanation of the evolutionary link between motif heterogeneity and genomic abundance of microsatellites by examining the patterns of motif mismatches and allele sequences of single-nucleotide polymorphisms identified within microsatellite loci. Using Drosophila Reference Genetic Panel data, we further show that pattern of allelic variation modulates motif heterogeneity of microsatellites, and provide estimates of allele age of specific imperfect microsatellites found within protein-coding genes.
Affiliation(s)
- Susanta K Behura
  - Eck Institute for Global Health and Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
- David W Severson
  - Eck Institute for Global Health and Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
9
Andersen JC, Mills NJ. iMSAT: a novel approach to the development of microsatellite loci using barcoded Illumina libraries. BMC Genomics 2014; 15:858. [PMID: 25281214 PMCID: PMC4195870 DOI: 10.1186/1471-2164-15-858] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 09/26/2013] [Accepted: 09/26/2014] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Illumina sequencing, with its high number of reads and low per-base-pair cost, is an attractive technology for the development of molecular resources for non-model organisms. While many software packages have been developed to identify short tandem repeats (STRs) from next-generation sequencing data, these methods do not inform the investigator as to whether or not candidate loci are polymorphic in their target populations. RESULTS We provide a Python program, iMSAT, that uses the polymorphism data obtained from mapping individual Illumina sequence reads onto a reference genome to identify polymorphic STRs. Using this approach, we identified 9,119 candidate polymorphic STRs for use with the parasitoid wasp Trioxys pallidus and 2,378 candidate polymorphic STRs for use with the aphid Chromaphis juglandicola. For both organisms we selected 20 candidate trinucleotide STRs for validation. Using fluorescent-labeled oligonucleotide primers, we genotyped 91 female T. pallidus collected in nine localities and 46 female C. juglandicola collected in four localities and found 15 of the examined markers to be polymorphic for T. pallidus and 12 of the examined markers to be polymorphic for C. juglandicola. CONCLUSIONS We present a novel approach that uses standard Illumina barcoding primers and a single Illumina HiSeq run to target polymorphic STR fragments to develop and test STR markers. We validate this approach using the parasitoid wasp T. pallidus and its aphid host C. juglandicola. This approach, which would also be compatible with 454 Sequencing, allowed us to quickly identify markers with known variability. Accordingly, our method constitutes a significant improvement over existing STR identification software packages.
Affiliation(s)
- Jeremy C Andersen
  - Department of Environmental Science Policy and Management, University of California Berkeley, Wellman Hall, Berkeley, USA
10
A genetic program theory of aging using an RNA population model. Ageing Res Rev 2014; 13:46-54. [PMID: 24263168 DOI: 10.1016/j.arr.2013.11.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 07/16/2013] [Accepted: 11/08/2013] [Indexed: 12/11/2022]
Abstract
Aging is a common characteristic of multicellular eukaryotes. Copious hypotheses have been proposed to explain the mechanisms of aging, but no single theory is generally acceptable. In this article, we refine the RNA population gene activating model (Lv et al., 2003) based on existing reports as well as on our own latest findings. We propose the RNA population model as a genetic theory of aging. The new model can also be applied to differentiation and tumorigenesis and could explain the biological significance of non-coding DNA, RNA, and repetitive sequence DNA. We provide evidence from the literature as well as from our own findings for the roles of repetitive sequences in gene activation. In addition, we predict several phenomena related to aging and differentiation based on this model.
11
Behura SK, Severson DW. Association of microsatellite pairs with segmental duplications in insect genomes. BMC Genomics 2013; 14:907. [PMID: 24359442 PMCID: PMC3878106 DOI: 10.1186/1471-2164-14-907] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 07/17/2013] [Accepted: 12/16/2013] [Indexed: 11/30/2022] Open
Abstract
Background: Segmental duplications (SDs), also known as low-copy repeats, are DNA sequences longer than 1 kb that are duplicated with a high degree of sequence identity (greater than 90%), causing instability in genomes. SDs are generally found in the genome as mosaic forms of duplicated sequences, which are generated by a two-step process: first, multiple duplicated sequences are aggregated at specific genomic regions, and then these primary duplications undergo multiple secondary duplications. However, the mechanism by which duplicated sequences are aggregated in the first place is not well understood. Results: By analyzing the genome-wide distribution of microsatellite sequences among twenty insect species, it was found that pairs of microsatellites, along with the intervening sequences, were duplicated multiple times in each genome. They were classified as low-copy repeats or segmental duplications when the duplicated loci were greater than 1 kb in length and had greater than 90% sequence similarity. By performing a sliding-window genomic analysis of the number of paired microsatellites and the number of segmental duplications, it was observed that regions rich in repetitive paired microsatellites tend to become richer in segmental duplications, suggesting a “rich-gets-richer” mode of aggregation of the duplicated loci in specific regions of the genome. Results further show that the relationship between the number of paired microsatellites and segmental duplications among the species is independent of the known phylogeny, suggesting that the association of microsatellites with segmental duplications may be a species-specific evolutionary process. It was also observed that the repetitive microsatellite pairs are associated with gene duplications, but those sequences are rarely retained in the orthologous genes between species. Although some of the duplicated sequences with microsatellites as termini were found within transposable elements (TEs) of Drosophila, most of the duplications are found in the TE-free and gene-free regions of the genome. Conclusion: The study clearly suggests that microsatellites are instrumental in extensive sequence duplications that may contribute to species-specific evolution of genome plasticity in insects.
Affiliation(s)
- Susanta K Behura
- Eck Institute for Global Health, Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA.
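The two criteria stated in the abstract above (duplicated loci longer than 1 kb with more than 90% sequence similarity) and the sliding-window counting can be sketched in Python as follows. This is only an illustration, not the authors' pipeline: the function names are hypothetical, and the window and step sizes are placeholder assumptions because the abstract does not state them.

```python
def is_segmental_duplication(length_bp, identity):
    """Apply the thresholds quoted in the abstract: > 1 kb and > 90% identity."""
    return length_bp > 1000 and identity > 0.90


def window_counts(positions, genome_length, window=100_000, step=50_000):
    """Count features per sliding window; returns a list of (start, count).

    window and step are assumed values chosen for illustration only.
    """
    counts = []
    start = 0
    while start < genome_length:
        end = start + window
        # A feature belongs to the window if start <= position < end.
        counts.append((start, sum(start <= p < end for p in positions)))
        start += step
    return counts
```

Running `window_counts` once on microsatellite-pair positions and once on segmental-duplication positions gives two per-window series that can then be compared, for instance with a rank correlation.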
12
Behura SK, Severson DW. Codon usage bias: causative factors, quantification methods and genome-wide patterns: with emphasis on insect genomes. Biol Rev Camb Philos Soc 2012; 88:49-61. [PMID: 22889422 DOI: 10.1111/j.1469-185x.2012.00242.x] [Citation(s) in RCA: 124] [Impact Index Per Article: 10.3] [Indexed: 01/27/2023]
Abstract
Codon usage bias refers to the phenomenon where specific codons are used more often than other synonymous codons during translation of genes, the extent of which varies within and among species. Molecular evolutionary investigations suggest that codon bias results from a balance between mutational and translational selection of such genes, and that this phenomenon is widespread across species and may contribute significantly to genome evolution. With the advent of whole-genome sequencing of numerous species, both prokaryotes and eukaryotes, genome-wide patterns of codon bias are emerging in different organisms. Various factors such as expression level, GC content, recombination rates, RNA stability, codon position, gene length and others (including environmental stress and population size) can influence codon usage bias within and among species. Moreover, there has been a continuous quest to develop new concepts and tools to measure the extent of codon usage bias of genes. In this review, we outline the fundamental concepts of evolution of the genetic code, discuss various factors that may influence biased usage of synonymous codons and then outline different principles and methods of measurement of codon usage bias. Finally, we discuss selected studies performed using whole-genome sequences of different insect species to show how codon bias patterns vary within and among genomes. We conclude with generalized remarks on specific emerging aspects of codon bias studies and highlight the recent explosion of genome-sequencing efforts on arthropods (such as twelve Drosophila species, species of ants, honeybee, Nasonia and Anopheles mosquitoes as well as the recent launch of a genome-sequencing project involving 5000 insects and other arthropods) that may help us to better understand the evolution of codon bias and its biological significance.
Affiliation(s)
- Susanta K Behura
- Department of Biological Sciences, Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA.
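The abstract above mentions methods for measuring codon usage bias without naming one. As a minimal sketch, one widely used measure is relative synonymous codon usage (RSCU): the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The choice of RSCU, the function name `rscu`, and the use of the standard genetic code are assumptions made here for illustration and are not drawn from the review itself.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for amino acids observed in a coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)
    values = {}
    for aa in set(AMINO) - {"*"}:  # stop codons are excluded
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for c in family:
            # Observed count divided by the equal-usage expectation.
            values[c] = counts[c] * len(family) / total
    return values
```

An RSCU of 1.0 means a codon is used as often as expected under equal usage; values above 1.0 indicate a preferred codon and values below 1.0 an avoided one.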