1. Sabili Z, Rashidi-Monfard S, Haghi R, Kahrizi D. Comparative analysis of simple sequence repeats and synteny across ten Oryza species: Implications for stress response and genetic diversity. Comput Biol Chem 2025; 116:108379. [PMID: 39978112 DOI: 10.1016/j.compbiolchem.2025.108379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2024] [Revised: 01/24/2025] [Accepted: 02/09/2025] [Indexed: 02/22/2025]
Abstract
Rice is a pivotal food source for most of the global population, necessitating a strategic focus on maximizing its production under diverse conditions through various methods. As molecular markers, simple sequence repeats (SSRs) emerge as instrumental tools in product enhancement and molecular research. This study employs in silico methods to predict the presence of molecular markers across distinct genomic and genic regions within ten Oryza species. Subsequently, we conducted a comprehensive comparison and synteny analysis of common molecular markers shared among most species, particularly those implicated in stress responses, utilizing MCScanX. Beyond identifying common SSRs across the ten species under investigation, we delved into the functional analysis of these markers, specifically pinpointing those associated with stress. Additionally, our investigation illustrated the uniform distribution of SSRs along chromosomes and created a physical map showcasing their prevalence. Notably, chromosomes 1, 2, and 3 exhibited a higher density of molecular markers compared to their counterparts. Furthermore, our study highlighted that Oryza glumipatula, Oryza brachyantha, Oryza meridionalis, and Oryza longistaminata species manifested more pronounced differences in SSR markers compared to other Oryza species. The implications of these findings extend to applications in genetic diversity assessment, genetic mapping, and molecular marker-assisted selection breeding, providing valuable insights for future research and development in the field.
Affiliation(s)
- Zahra Sabili
  - Agricultural Biotechnology Department, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran.
- Sajad Rashidi-Monfard
  - Agricultural Biotechnology Department, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran.
- Reza Haghi
  - The Gene Bank Department, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany.
- Danial Kahrizi
  - Agricultural Biotechnology Department, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran.
2. Kumar S, Singh S, Kumar R, Gupta D. The Genomic SSR Millets Database (GSMDB): enhancing genetic resources for sustainable agriculture. Database (Oxford) 2024; 2024:baae114. [PMID: 39546404 PMCID: PMC11566590 DOI: 10.1093/database/baae114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Revised: 09/27/2024] [Accepted: 11/11/2024] [Indexed: 11/17/2024]
Abstract
The global population surge demands increased food production and nutrient-rich options to combat rising food insecurity. Climate-resilient crops are vital, with millets emerging as superfoods due to nutritional richness and stress tolerance. Given limited genomic information, a comprehensive genetic resource is crucial to advance millet research. Whole-genome sequencing provides an unprecedented opportunity, and molecular genetic methodologies, particularly simple sequence repeats (SSRs), play a pivotal role in DNA fingerprinting, constructing linkage maps, and conducting population genetic studies. SSRs are composed of repetitive DNA sequences where one to six nucleotides are repeated in tandem and distributed throughout the genome. Different millet species exhibit genomic variations attributed to the presence of SSRs. While SSRs have been identified in a few millet species, the existing information only covers some of the sequenced genomes. Moreover, there is an absence of complete gene annotation and visualization features for SSRs. Addressing this disparity and leveraging the de-novo millet genome assembly available from the NCBI, we have developed the Genomic SSR Millets Database (GSMDB; https://bioinfo.icgeb.res.in/gsmdb/). This open-access repository provides a web-based tool offering search functionalities and comprehensive details on 6.747645 million SSRs mined from the genomic sequences of seven millet species. The database, featuring unrestricted public access and JBrowse visualization, is a pioneering resource for the research community dedicated to advancing millet cultivars and related species. GSMDB holds immense potential to support myriad studies, including genetic diversity assessments, genetic mapping, marker-assisted selection, and comparative population investigations aiming to facilitate the millet breeding programs geared toward ensuring global food security. Database URL: https://bioinfo.icgeb.res.in/gsmdb/.
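The definition of an SSR given in this abstract (a unit of one to six nucleotides repeated in tandem) can be illustrated with a short Python sketch. This is not the GSMDB pipeline: the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration, and the abstract does not state the thresholds actually used.

```python
import re

# Hypothetical minimum repeat counts per motif length (1-6 nt).
# These thresholds are illustrative assumptions, not values from the source.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for size, min_rep in sorted(min_repeats.items()):
        # One motif of `size` bases followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example "AA" or "ATAT"); the shorter size reports those.
            if any(size % k == 0 and motif == motif[:k] * (size // k)
                   for k in range(1, size)):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


example = "GGC" + "AT" * 7 + "CCGTA" + "CAG" * 5 + "TT"
print(find_ssrs(example))
```

On the toy sequence above, the sketch reports one dinucleotide repeat and one trinucleotide repeat. Dedicated SSR-mining tools handle overlapping, compound, and imperfect repeats, which this sketch deliberately ignores.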
Affiliation(s)
- Sonu Kumar
  - Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology (ICGEB), New Delhi, Delhi 110067, India
- Sangeeta Singh
  - Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology (ICGEB), New Delhi, Delhi 110067, India
- Rakesh Kumar
  - Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology (ICGEB), New Delhi, Delhi 110067, India
- Dinesh Gupta
  - Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology (ICGEB), New Delhi, Delhi 110067, India
3. Hamarsheh O, Guernaoui S, Karakus M, Yaghoobi-Ershadi MR, Kruger A, Amro A, Kenawy MA, Dokhan MR, Shoue DA, McDowell MA. Population structure analysis of Phlebotomus papatasi populations using transcriptome microsatellites: possible implications for leishmaniasis control and vaccine development. Parasit Vectors 2024; 17:410. [PMID: 39358814 PMCID: PMC11448080 DOI: 10.1186/s13071-024-06495-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2024] [Accepted: 09/14/2024] [Indexed: 10/04/2024] Open
Abstract
BACKGROUND Phlebotomus papatasi is considered the primary vector of Leishmania major parasites that cause zoonotic cutaneous leishmaniasis (ZCL) in the Middle East and North Africa. Phlebotomus papatasi populations have been studied extensively, revealing the existence of different genetic populations and subpopulations over its large distribution range. Genetic diversity and population structure analysis using transcriptome microsatellite markers is important to uncover the vector distribution dynamics, essential for controlling ZCL in endemic areas. METHODS In this study, we investigated the level of genetic variation using expressed sequence tag-derived simple sequence repeats (EST-SSRs) among field and colony P. papatasi samples collected from 25 different locations in 11 countries. A total of 302 P. papatasi sand fly individuals were analyzed, including at least 10 flies from each region. RESULTS The analysis revealed a high-level population structure expressed by five distinct populations A through E, with moderate genetic differentiation among all populations. These genetic differences in expressed genes may enable P. papatasi to adapt to different environmental conditions along its distribution range and likely affect dispersal. CONCLUSIONS Elucidating the population structuring of P. papatasi is essential to L. major containment efforts in endemic countries. Moreover, the level of genetic variation among these populations may improve our understanding of Leishmania-sand fly interactions and contribute to the efforts of vaccine development based on P. papatasi salivary proteins.
Affiliation(s)
- Omar Hamarsheh
  - Department of Biological Sciences, Faculty of Science and Technology, Al-Quds University, Jerusalem, Palestine.
  - Department of Biological Sciences, Galvin Life Science, Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46656, USA.
- Souad Guernaoui
  - Biotechnology, Conservation and Valorization of Natural Resources Laboratory, Faculty of Sciences Dhar El Mahraz, Sidi Mohamed Ben Abdellah University, Fez, Morocco
- Mehmet Karakus
  - Faculty of Medicine, Department of Medical Microbiology, University of Health Sciences, Istanbul, Turkey
- Mohammad Reza Yaghoobi-Ershadi
  - Department of Medical Entomology & Vector Control, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran
- Ahmad Amro
  - Faculty of Pharmacy, Al-Quds University, Jerusalem, Palestine
- Mohamed Amin Kenawy
  - Department of Entomology, Faculty of Science, Ain Shams University, Abbassia, 11566, Cairo, Egypt
- Douglas A Shoue
  - Department of Biological Sciences, Galvin Life Science, Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46656, USA
- Mary Ann McDowell
  - Department of Biological Sciences, Galvin Life Science, Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46656, USA.
4. Xu S, Shen C, Li C, Dong W, Yang G. Genome sequencing and comparative genome analysis of Rhizoctonia solani AG-3. Front Microbiol 2024; 15:1360524. [PMID: 38638902 PMCID: PMC11024465 DOI: 10.3389/fmicb.2024.1360524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Accepted: 03/19/2024] [Indexed: 04/20/2024] Open
Abstract
Rhizoctonia solani AG-3 is a plant pathogenic fungus that belongs to the group of multinucleate Rhizoctonia. According to its internal transcribed spacer (ITS) cluster analysis and host range, it is divided into TB, PT, and TM subgroups. AG-3 TB mainly causes tobacco target spots, AG-3 PT mainly causes potato black scurf, and AG-3 TM mainly causes tomato leaf blight. In our previous study, we found that all 36 tobacco target spot strains isolated from Yunnan (Southwest China) were classified into AG-3 TB subgroup, while only two of the six tobacco target spot strains isolated from Liaoning (Northeast China) were classified into AG-3 TB subgroup, and the remaining four strains were classified into AG-3 TM subgroup, which had a unique taxonomic status, and there was no previous report on the whole genome information of AG-3 TM subgroup. In this study, the whole genomes of R. solani AG-3 strains 3T-1 (AG-3 TM isolated from Liaoning) and MJ-102 (AG-3 TB isolated from Yunnan) isolated from tobacco target spot in Liaoning and Yunnan were sequenced by Illumina and PacBio sequencing platforms. Comparative genomic analysis was performed with the previously reported AG-3 PT strain Rhs1AP, revealing their differences in genomes and virulence factors. The results indicated that the genome size of 3T-1 was 42,103,597 bp with 11,290 coding genes and 49.74% GC content, and the genome size of MJ-102 was 41,908,281 bp with 10,592 coding genes and 48.91% GC content. Through comparative genomic analysis with the previously reported strain Rhs1AP (AG-3 PT), it was found that the GC content between the genomes was similar, but the strains 3T-1 and MJ-102 contained more repetitive sequences. Similarly, there are similarities between their virulence factors, but there are also some differences. In addition, the results of collinearity analysis showed that 3T-1 and MJ-102 had lower similarity and longer evolutionary distance with Rhs1AP, but the genetic relationship between 3T-1 and MJ-102 was closer. This study can lay a foundation for studying the molecular pathogenesis and virulence factors of R. solani AG-3, and revealing its genomic composition will also help to develop more effective disease control strategies.
Affiliation(s)
- Genhua Yang
  - State Key Laboratory for Protection and Utilization of Bio-Resources in Yunnan, Yunnan Agricultural University, Kunming, Yunnan, China
5. An empirical analysis of mtSSRs: could microsatellite distribution patterns explain the evolution of mitogenomes in plants? Funct Integr Genomics 2021; 22:35-53. [PMID: 34751851 DOI: 10.1007/s10142-021-00815-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/06/2021] [Revised: 10/18/2021] [Accepted: 10/19/2021] [Indexed: 10/19/2022]
Abstract
Microsatellites (SSRs) are tandem repeat sequences in eukaryote genomes, including plant cytoplasmic genomes. The mitochondrial genome (mtDNA) has been shown to vary in size, number, and distribution of SSRs among different plant groups. Thus, SSRs contribute to genomic diversity in mtDNAs. However, the abundance, distribution, and evolutionary significance of SSRs in mtDNA from a wide range of algae and plants have not been explored. In this study, the mtDNAs of 204 plant and algal species were investigated for the presence of SSRs. The number of SSRs was positively correlated with genome size. Their distribution depends on the plant and algal groups analyzed, although the cluster analysis indicates the conservation of some common motifs in algal and terrestrial plants that reflect common ancestry of groups. Many SSRs in coding and non-coding regions can be useful as molecular markers. Moreover, mitochondrial SSRs are highly abundant, representing an important source for natural or induced genetic variation, i.e., for biotechnological approaches that can modulate mtDNA gene regulation. Thus, this comparative study increases the understanding of the plant and algal SSR evolution and brings perspectives for further studies.
6. Askarian H, Akhavan A, González LG, Hwang SF, Strelkov SE. Genetic Structure of Plasmodiophora brassicae Populations Virulent on Clubroot Resistant Canola (Brassica napus). Plant Disease 2021; 105:3694-3704. [PMID: 33507096 DOI: 10.1094/pdis-09-20-1980-re] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 05/24/2023]
Abstract
Clubroot, caused by Plasmodiophora brassicae Woronin, is a significant threat to the canola (Brassica napus L.) industry in Canada. Clubroot resistance has been overcome in more than 200 fields since 2013, representing one of the biggest challenges to sustainable canola production. The genetic structure of 36 single-spore isolates derived from 12 field isolates of P. brassicae collected before and after the introduction of clubroot resistant (CR) canola cultivars (2005-2014) was evaluated by simple sequence repeat (SSR) marker analysis. Polymorphisms were detected in 32 loci with the identification of 93 distinct alleles. A low level of genetic diversity was found among the single-spore isolates. Haploid linkage disequilibrium and number of migrants suggested that recombination and migration were rare or almost absent in the tested P. brassicae population. A relatively clear relationship was found between the genetic structure and virulence phenotypes of the pathogen as defined on the differential hosts of Somé et al., Williams, and the Canadian Clubroot Differential (CCD) set. Although genetic variability within each pathotype group, as classified on each differential system, was low, significant genetic differentiation was observed among the pathotypes. The highest correlation between genetic structure and virulence was found among matrices produced with genetic data and the hosts of the CCD set, with a threshold index of disease of 50% to distinguish susceptible from resistant reactions. Genetically homogeneous single-spore isolates provided a more complete and clearer picture of the population genetic structure of P. brassicae, and the results suggest some promise for the development of pathotype-specific primers.
Affiliation(s)
- Homa Askarian
  - Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB T6G 2P5, Canada
- Alireza Akhavan
  - Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB T6G 2P5, Canada
- Leonardo Galindo González
  - Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB T6G 2P5, Canada
- Sheau-Fang Hwang
  - Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB T6G 2P5, Canada
- Stephen E Strelkov
  - Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB T6G 2P5, Canada
7. Uncovering patterns of the evolution of genomic sequence entropy and complexity. Mol Genet Genomics 2020; 296:289-298. [PMID: 33252723 DOI: 10.1007/s00438-020-01729-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2019] [Accepted: 09/22/2020] [Indexed: 10/22/2022]
Abstract
The lack of consensus concerning the biological meaning of entropy and complexity of genomes and the different ways to assess these data hamper conclusions about the causes of genomic entropy variation among species. This study aims to evaluate the entropy and complexity of genomic sequences of several species without using homologies to assess relationships among these variables and non-molecular data (e.g., the number of individuals) to seek a trigger of interspecific genomic entropy variation. The results indicate a relationship among genomic entropy, genome size, genomic complexity, and the number of individuals: species with a small number of individuals harbor large genomes and, hence, low entropy but higher complexity. We defined that the complexity of a genome relies on the entropy of each DNA segment within the genome. Thus, the entropy and complexity of a genome reflect solely its organization. Exons of vertebrates harbor smaller entropies than non-exon regions (likely by the repeats that accumulated from duplications), whereas other taxonomic groups do not present this pattern. Our findings suggest that a small initial population might have defined current genomic entropy and complexity: present-day genomes are less complex than ancestral ones. Besides, our data disagree with the relationship between phenotype and genomic entropies previously established. Finally, by establishing the relationship between genomic entropy/complexity with the number of individuals and genome size, under an evolutionary perspective, ideas concerning the genomic variability may emerge.
8. Satyam R, Jha NK, Kar R, Jha SK, Sharma A, Kumar D, Nand P, Ruokolainen J, Kesari KK, Kamal MA. Deciphering the SSR incidences across viral members of Coronaviridae family. Chem Biol Interact 2020; 331:109226. [PMID: 32971122 PMCID: PMC7505113 DOI: 10.1016/j.cbi.2020.109226] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/10/2020] [Revised: 08/05/2020] [Accepted: 08/11/2020] [Indexed: 12/19/2022]
Abstract
The presence of Simple Sequence Repeats (SSRs), both in genic and intergenic regions, has been widely studied in eukaryotes, prokaryotes, and viruses. In the current study, we undertook a survey to analyze the frequency and distribution of microsatellites or SSRs in multiple genomes of Coronaviridae members. We successfully identified 919 SSRs with length ≥12 bp across 55 reference genomes, the majority of which (838 SSRs) were found in genic regions. The in-silico analysis further identified the preferential abundance of hexameric SSRs over any other size-based motif class. Our analysis shows that the genome size and GC content of the genome had a weak influence on SSR frequency and density. However, we find a positive correlation of SSR GC content with genomic GC content. We also report relatively low abundances of all theoretically possible 501 repeat motif classes in all the genomes of Coronaviridae. The majority of SSRs were AT-rich. Overall, we see an underrepresentation of SSRs across the genomes of Coronaviridae. Besides, our integrative study highlights the presence of SSRs in ORF1ab (nsp3, nsp4, nsp5A_3CLpro and nsp5B_3CLpro, nsp6, nsp10, nsp12, nsp13, & nsp15 domains), S, ORF3a, ORF7a, N & 3' UTR regions of SARS-CoV-2, which harbour multiple mutations (3'UTR and ORF1ab SSRs serving as major mutational hotspots). This indicates the genic SSRs are under selection pressure against mutations that might alter the reading frame and at the same time are responsible for rapid protein evolution. Our preliminary results indicate the significance of the limited repertoire of SSRs in the genomes of Coronaviridae.
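The figure of 501 theoretically possible repeat motif classes can be reproduced with a small enumeration. The sketch below assumes the usual convention that two motifs belong to the same class when one is a cyclic rotation or the reverse complement of the other, and that motifs which are repeats of a shorter unit (such as "ATAT") are excluded. The function names are illustrative and do not come from the paper.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string over the alphabet A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def is_primitive(motif):
    """True if the motif is not a repetition of a shorter unit."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k)
                   for k in range(1, n))


def canonical(motif):
    """Smallest string among all rotations of the motif and of its reverse complement."""
    candidates = []
    for strand in (motif, reverse_complement(motif)):
        for i in range(len(strand)):
            candidates.append(strand[i:] + strand[:i])
    return min(candidates)


def count_motif_classes(max_len=6):
    """Number of distinct motif classes for each motif length from 1 to max_len."""
    counts = {}
    for n in range(1, max_len + 1):
        motifs = ("".join(p) for p in product("ACGT", repeat=n))
        counts[n] = len({canonical(m) for m in motifs if is_primitive(m)})
    return counts


counts = count_motif_classes()
print(counts, sum(counts.values()))
```

Under these assumptions the per-length counts for mono- to hexanucleotide motifs are 2, 4, 10, 33, 102, and 350, which sum to 501.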
Affiliation(s)
- Rohit Satyam
  - Department of Biotechnology, Noida Institute of Engineering and Technology (NIET), Greater Noida, India
- Niraj Kumar Jha
  - Department of Biotechnology, School of Engineering & Technology (SET), Sharda University, Greater Noida, 201310, India.
- Rohan Kar
  - Indian Institute of Management Ahmedabad (IIMA), Gujarat, 380015, India
- Saurabh Kumar Jha
  - Department of Biotechnology, School of Engineering & Technology (SET), Sharda University, Greater Noida, 201310, India
- Ankur Sharma
  - Department of Life Science, School of Basic Science & Research, Sharda University, Greater Noida, 201310, India
- Dhruv Kumar
  - Amity Institute of Molecular Medicine and Stem Cell Research (AIMMSCR), Amity University Uttar Pradesh, Noida, 201313, India
- Parma Nand
  - Department of Biotechnology, School of Engineering & Technology (SET), Sharda University, Greater Noida, 201310, India
- Mohammad Amjad Kamal
  - King Fahd Medical Research Center, King Abdulaziz University, P. O. Box 80216, Jeddah, 21589, Saudi Arabia
  - Enzymoics, Novel Global Community Educational Foundation, 7 Peterlee Place, Hebersham, NSW, 2770, Australia
9. Qi WH, Lu T, Zheng CL, Jiang XM, Jie H, Zhang XY, Yue BS, Zhao GJ. Distribution patterns of microsatellites and development of its marker in different genomic regions of forest musk deer genome based on high throughput sequencing. Aging (Albany NY) 2020; 12:4445-4462. [PMID: 32155132 PMCID: PMC7093171 DOI: 10.18632/aging.102895] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/02/2019] [Accepted: 02/25/2020] [Indexed: 01/21/2023]
Abstract
Forest musk deer (Moschus berezovskii, FMD) is an endangered artiodactyl species; male FMD produce musk. We have sequenced the whole genome of FMD, completed the genomic assembly and annotation, and performed bioinformatic analyses. Our results showed that microsatellites (SSRs) displayed nonrandom distribution in genomic regions, and SSR abundances were much higher in the intronic and intergenic regions compared to other genomic regions. Tri- and hexanucleotide perfect (P) SSRs predominated in coding regions (CDSs), whereas tetra- and pentanucleotide P-SSRs were less abundant. Trifold P-SSRs had higher GC contents in the 5′-untranslated regions (5'UTRs) and CDSs than in other genomic regions, whereas mononucleotide P-SSRs had the lowest GC contents. The repeat copy numbers (RCN) of the same mono- to hexanucleotide P-SSRs had different distributions in different genomic regions. The RCN of trinucleotide P-SSRs had increased significantly in the CDSs compared to the transposable elements (TEs), intronic and intergenic regions. The analysis of coefficient of variability (CV) of P-SSRs showed that the RCN of mononucleotide P-SSRs had relatively higher variation in different genomic regions, followed by the CV pattern of RCN: dinucleotide P-SSRs > trinucleotide P-SSRs > tetranucleotide P-SSRs > pentanucleotide P-SSRs > hexanucleotide P-SSRs. The CV variations of RCN of the same mono- to hexanucleotide P-SSRs were relatively higher in the intron and intergenic regions, followed by those in the TEs, and were relatively lower in the 5'UTR, CDSs and 3'UTRs. Fifty-eight novel polymorphic SSR loci were detected based on genotyping DNA from 36 captive FMD, and 22 SSR markers finally showed polymorphism, stability, and repetition.
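The coefficient of variability (CV) used in this abstract to compare repeat copy numbers (RCN) is the standard deviation divided by the mean. A minimal Python sketch follows. It assumes the population standard deviation, since the abstract does not say whether the population or sample form was used, and the function name and the RCN values are hypothetical illustration data, not results from the paper.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """CV = standard deviation / mean (population form, an assumption here)."""
    avg = mean(values)
    if avg == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return pstdev(values) / avg


# Hypothetical repeat copy numbers for one SSR class in two genomic regions.
rcn_intron = [2, 4, 4, 4, 5, 5, 7, 9]
rcn_cds = [5, 5, 5, 5, 5, 5, 5, 5]

print(coefficient_of_variation(rcn_intron))
print(coefficient_of_variation(rcn_cds))
```

A higher CV indicates that copy numbers vary more relative to their mean, which is the sense in which the abstract ranks genomic regions.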
Affiliation(s)
- Wen-Hua Qi
  - Chongqing Engineering Laboratory of Green Planting and Deep Processing of Three Gorges Reservoir Famous-region Drug, College of Biology and Food Engineering, Chongqing Three Gorges University, Chongqing 404120, P. R. China.
  - Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu 610064, P. R. China
- Ting Lu
  - Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu 610064, P. R. China
- Cheng-Li Zheng
  - Sichuan Institute of Musk Deer Breeding, Chengdu 611830, P. R. China
- Xue-Mei Jiang
  - College of Environmental and Chemistry Engineering, Chongqing Three Gorges University, Chongqing 404120, P. R. China
- Hang Jie
  - Chongqing Engineering Technology Research Center for GAP of Genuine Medicinal Materials, Chongqing Institute of Medicinal Plant Cultivation, Chongqing 408435, P. R. China
- Xiu-Yue Zhang
  - Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu 610064, P. R. China
- Bi-Song Yue
  - Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu 610064, P. R. China
- Gui-Jun Zhao
  - Chongqing Engineering Technology Research Center for GAP of Genuine Medicinal Materials, Chongqing Institute of Medicinal Plant Cultivation, Chongqing 408435, P. R. China
10. Asadi A, Ebrahimi A, Rashidi-Monfared S, Basiri M, Akbari-Afjani J. Comprehensive functional analysis and mapping of SSR markers in the chickpea genome (Cicer arietinum L.). Comput Biol Chem 2019; 84:107169. [PMID: 31812779 DOI: 10.1016/j.compbiolchem.2019.107169] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 08/25/2018] [Revised: 11/16/2019] [Accepted: 11/18/2019] [Indexed: 11/19/2022]
Abstract
Plant molecular breeding largely depends on the relationship between molecular markers and major traits. Herein, a total of 32,962 genomic simple sequence repeats (SSRs) were detected in the whole genome of chickpea with an average density of 94.93 SSRs/Mb. A uniformity test across chickpea chromosomes indicated that the genomic SSRs (gSSRs) were evenly distributed across the genome. Moreover, 48,667 transcriptome sequences were analyzed and 1949 SSR-containing transcript assembly contigs (TACs) were identified. The analysis showed that di- and trinucleotide SSRs were the most frequent SSR motifs within the transcriptome sequences. Among them, AT and TTA and AG and TTC motifs within the transcriptome showed the highest frequencies among di- and trinucleotide repeat motifs, respectively. The SSRs-containing TACs were compared to the GenBank non-redundant database using BLASTX, and subsequently, gene ontology (GO) analysis was performed using QuickGO browser to reduce complexity and highlight biological processes associated with the SSRs-containing TACs. The identified SSRs-containing TACs were categorized into 35 enriched, functionally related gene groups. The mapping of characterized SSRs-containing TACs onto chickpea chromosomes was performed using BLASTN. The mapping result showed that a total of 1798 SSRs-containing TACs were mapped onto the chickpea genome. Based on the functional analysis result, 249 and 242 of the mapped SSRs-containing TACs were found in the genes encoding for putative stress-related proteins and transcription factors, respectively. The results presented here can be applied to improve and speed up chickpea breeding programs.
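The SSR count and density reported in this abstract are linked by simple arithmetic: density is the number of SSRs divided by the scanned sequence length in megabases. The sketch below uses that relation to back out the sequence length implied by the two reported figures. The function names are illustrative, and the resulting length is only an inference from the abstract's numbers, not a value stated by the authors.

```python
def ssr_density_per_mb(n_ssrs, sequence_bp):
    """SSRs per megabase for a sequence of the given length in base pairs."""
    return n_ssrs / (sequence_bp / 1_000_000)


def implied_length_mb(n_ssrs, density_per_mb):
    """Sequence length in Mb implied by an SSR count and a density in SSRs/Mb."""
    return n_ssrs / density_per_mb


# Figures reported in the abstract: 32,962 gSSRs at 94.93 SSRs/Mb.
length_mb = implied_length_mb(32_962, 94.93)
print(round(length_mb, 1))  # about 347.2 Mb

# Round trip: recomputing the density from the implied length.
print(round(ssr_density_per_mb(32_962, length_mb * 1_000_000), 2))
```

The same two functions can be applied to any of the SSR surveys listed here when a count and either a density or a sequence length is given.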
Affiliation(s)
- AliAkbar Asadi
  - Agricultural Biotechnology Department, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran
- Amin Ebrahimi
  - Agronomy and Plant Breeding Department, Faculty of Agriculture, Shahrood University of Technology, Semnan, Iran
- Sajad Rashidi-Monfared
  - Agricultural Biotechnology Department, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran.
- Mohammad Basiri
  - Agricultural Biotechnology Department, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran
- Javad Akbari-Afjani
  - Agricultural Biotechnology Department, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran
11. Qi WH, Jiang XM, Yan CC, Zhang WQ, Xiao GS, Yue BS, Zhou CQ. Distribution patterns and variation analysis of simple sequence repeats in different genomic regions of bovid genomes. Sci Rep 2018; 8:14407. [PMID: 30258087 PMCID: PMC6158176 DOI: 10.1038/s41598-018-32286-5] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 02/26/2018] [Accepted: 09/04/2018] [Indexed: 01/23/2023] Open
Abstract
In this first examination of the distribution, guanine-cytosine (GC) pattern, and variation of microsatellites (SSRs) in different genomic regions of six bovid species, SSRs displayed nonrandom distribution in different regions. SSR abundances are much higher in the introns, transposable elements (TEs), and intergenic regions compared to the 3′-untranslated regions (3′UTRs), 5′UTRs and coding regions. Trinucleotide perfect SSRs (P-SSRs) were the most frequent in the coding regions, whereas mononucleotide P-SSRs were the most frequent in the introns, 3′UTRs, TEs, and intergenic regions. Trifold P-SSRs had higher GC contents in the 5′UTRs and coding regions than in the introns, 3′UTRs, TEs, and intergenic regions, whereas mononucleotide P-SSRs had the lowest GC contents in all genomic regions. The repeat copy numbers (RCN) of the same mono- to hexanucleotide P-SSRs showed significantly different distributions in different regions (P < 0.01). Except for the coding regions, mononucleotide P-SSRs had the most RCNs, followed by the pattern: di- > tri- > tetra- > penta- > hexanucleotide P-SSRs in the same regions. The analysis of coefficient of variability (CV) of SSRs showed that the CV variations of RCN of the same mono- to hexanucleotide SSRs were relatively higher in the intronic and intergenic regions, followed by the CV variation of RCN in the TEs, and were relatively lower in the 5′UTRs, 3′UTRs, and coding regions. Wide SSR analysis of different genomic regions has helped to reveal the biological significance of their distributions.
Collapse
Affiliation(s)
- Wen-Hua Qi
- College of Biology and Food Engineering, Chongqing Three Gorges University, Chongqing, 404100, P. R. China
- Xue-Mei Jiang
- College of Environmental and Chemistry Engineering, Chongqing Three Gorges University, Chongqing, 404100, P. R. China
- Chao-Chao Yan
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, 610064, P. R. China
- Wan-Qing Zhang
- College of Life Sciences, Sichuan Agricultural University, Ya'an, Sichuan Province, 625014, P. R. China
- Guo-Sheng Xiao
- College of Biology and Food Engineering, Chongqing Three Gorges University, Chongqing, 404100, P. R. China
- Bi-Song Yue
- Key Laboratory of Bio-resources and Eco-environment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, 610064, P. R. China
- Cai-Quan Zhou
- Key Laboratory of Southwest China Wildlife Resources Conservation (Ministry of Education), China West Normal University, Nanchong, 637009, P. R. China.
|
12
|
Distinct patterns of simple sequence repeats and GC distribution in intragenic and intergenic regions of primate genomes. Aging (Albany NY) 2017; 8:2635-2654. [PMID: 27644032 PMCID: PMC5191860 DOI: 10.18632/aging.101025] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 06/08/2016] [Accepted: 08/22/2016] [Indexed: 01/23/2023]
Abstract
As the first systematic examination of simple sequence repeats (SSRs) and guanine-cytosine (GC) distribution in intragenic and intergenic regions of ten primates, our study showed that SSRs and GC content displayed nonrandom distribution in both intragenic and intergenic regions, suggesting that they have potential roles in transcriptional or translational regulation. Our results suggest that the majority of SSRs are distributed in non-coding regions, such as the introns, transposable elements (TEs), and intergenic regions. In these primates, trinucleotide perfect (P) SSRs were the most abundant repeat type in the 5'UTRs and CDSs, whereas mononucleotide P-SSRs were the most abundant in the introns, 3'UTRs, TEs, and intergenic regions. GC content varied greatly among different intragenic and intergenic regions: 5'UTRs > CDSs > 3'UTRs > TEs > introns > intergenic regions, and high GC content was frequently found in exon-rich regions. Our results also showed that, within the same intragenic and intergenic regions, the distribution of GC content was highly similar among the different primates. Tri- and hexanucleotide P-SSRs had the highest GC content in the 5'UTRs and CDSs, whereas mononucleotide P-SSRs had the lowest GC content in the six genomic regions of these primates. The most frequent motifs of each length varied markedly among the different genomic regions.
|
13
|
Ma Z. Genome-wide characterization of perfect microsatellites in yak (Bos grunniens). Genetica 2015; 143:515-20. [PMID: 26071092 DOI: 10.1007/s10709-015-9849-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 09/10/2014] [Accepted: 06/05/2015] [Indexed: 11/25/2022]
Abstract
Microsatellites, or simple sequence repeats (SSRs), constitute a significant portion of genomes and play an important role in gene function and genome organization. The availability of a complete genome sequence for yak (Bos grunniens) has made it possible to carry out genome-wide analysis of microsatellites in this species. We analyzed the abundance and density of perfect SSRs in the yak genome. We found a total of 723,172 SSRs with 1-6 bp nucleotide motifs, indicating that about 0.47 % of the yak whole genome sequence (2.66 Gb) comprises perfect SSRs, the average length of which was 17.34 bp. The average frequency and density of perfect SSRs were 272.18 loci/Mb and 4719.25 bp/Mb, respectively. The six classes of perfect SSRs were not evenly represented in the yak genome. Mononucleotide repeats (44.04 %), with a total number of 318,435 and an average length of 14.71 bp, appeared to be the most abundant SSR class, while the percentages of dinucleotide, trinucleotide, pentanucleotide, tetranucleotide and hexanucleotide repeats were 24.11 %, 15.80 %, 9.50 %, 6.40 % and 0.15 %, respectively. Different repeat classes of SSRs varied in their repeat number, with the highest being 1206. Our results suggest that 15 motifs comprised the predominant categories, each with a frequency above 1 locus/Mb: A, AC, AT, AG, AGC, AAC, AAT, ACC, ATTT, GTTT, AATG, CTTT, ATGG, AACTG and ATCTG.
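The frequency, density, and average length reported above are related by simple ratios: mean SSR length equals density divided by frequency (4719.25 / 272.18 ≈ 17.34 bp), and the genome fraction is density expressed per base (4719.25 bp/Mb ≈ 0.47 %). The sketch below shows one way such summary metrics could be computed from a list of SSR lengths. The function name and the toy inputs are hypothetical and are not taken from the study.

```python
def ssr_summary(ssr_lengths_bp, genome_size_bp):
    """Summarise SSR abundance for one genome.

    ssr_lengths_bp: length in bp of every detected SSR locus.
    genome_size_bp: total assembled genome size in bp.
    """
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    n_loci = len(ssr_lengths_bp)
    total_bp = sum(ssr_lengths_bp)
    genome_mb = genome_size_bp / 1_000_000
    return {
        # Number of SSR loci per megabase of sequence.
        "frequency_loci_per_mb": n_loci / genome_mb,
        # Total SSR bases per megabase of sequence.
        "density_bp_per_mb": total_bp / genome_mb,
        # Mean SSR length; equals density / frequency.
        "mean_length_bp": total_bp / n_loci if n_loci else 0.0,
        # Share of the genome occupied by SSRs, in percent.
        "genome_fraction_pct": 100.0 * total_bp / genome_size_bp,
    }


# Toy input (hypothetical): three SSRs in a 2 Mb sequence.
summary = ssr_summary([12, 18, 24], 2_000_000)
```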
Affiliation(s)
- Zhijie Ma
- Qinghai Academy of Animal Science and Veterinary Medicine, Qinghai University, No. 1 Weier Road, Bio-Science Industrial District, Xining, 810016, Qinghai, People's Republic of China.
|
14
|
Sincero TCM, Stoco PH, Steindel M, Vallejo GA, Grisard EC. Trypanosoma rangeli displays a clonal population structure, revealing a subdivision of KP1(-) strains and the ancestry of the Amazonian group. Int J Parasitol 2015; 45:225-35. [PMID: 25592964 DOI: 10.1016/j.ijpara.2014.11.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 07/03/2014] [Revised: 11/12/2014] [Accepted: 11/24/2014] [Indexed: 12/13/2022]
Abstract
Assessment of the genetic variability and population structure of Trypanosoma rangeli, a non-pathogenic American trypanosome, was carried out through microsatellite and single-nucleotide polymorphism analyses. Two approaches were used for microsatellite typing: data mining in expressed sequence tag/open reading frame expressed sequence tags libraries and PCR-based Isolation of Microsatellite Arrays from genomic libraries. All microsatellites found were evaluated for their abundance, frequency and usefulness as markers. Genotyping of T. rangeli strains and clones was performed for 18 loci amplified by PCR from expressed sequence tag/open reading frame expressed sequence tags libraries. The presence of single-nucleotide polymorphisms in the nuclear, multi-copy, spliced leader gene was assessed in 18 T. rangeli strains, and the results show that T. rangeli has a predominantly clonal population structure, allowing a robust phylogenetic analysis. Microsatellite typing revealed a subdivision of the KP1(-) genetic group, which may be influenced by geographical location and/or by the co-evolution of parasite and vectors occurring within the same geographical areas. The hypothesis of parasite-vector co-evolution was corroborated by single-nucleotide polymorphism analysis of the spliced leader gene. Taken together, the results suggest three T. rangeli groups: (i) the T. rangeli Amazonian group; (ii) the T. rangeli KP1(-) group; and (iii) the T. rangeli KP1(+) group. The latter two groups possibly evolved from the Amazonian group to produce KP1(+) and KP1(-) strains.
Affiliation(s)
- Thaís Cristine Marques Sincero
- Universidade Federal de Santa Catarina (UFSC), Centro de Ciências da Saúde (CCS), Departamento de Análises Clínicas (ACL), Setor E, Bloco K, Florianópolis, SC 88.040-970, Brazil.
- Patricia Hermes Stoco
- Universidade Federal de Santa Catarina (UFSC), Centro de Ciências Biológicas (CCB), Departamento de Microbiologia, Imunologia e Parasitologia (MIP), Setor F, Bloco A, Florianópolis, SC 88.040-970, Brazil
- Mário Steindel
- Universidade Federal de Santa Catarina (UFSC), Centro de Ciências Biológicas (CCB), Departamento de Microbiologia, Imunologia e Parasitologia (MIP), Setor F, Bloco A, Florianópolis, SC 88.040-970, Brazil
- Gustavo Adolfo Vallejo
- Laboratorio de Investigaciones en Parasitología Tropical, Universidad del Tolima, Altos de Santa Helena, A.A. 546, Ibagué, Colombia
- Edmundo Carlos Grisard
- Universidade Federal de Santa Catarina (UFSC), Centro de Ciências Biológicas (CCB), Departamento de Microbiologia, Imunologia e Parasitologia (MIP), Setor F, Bloco A, Florianópolis, SC 88.040-970, Brazil.
|
15
|
Biswas MK, Xu Q, Mayer C, Deng X. Genome wide characterization of short tandem repeat markers in sweet orange (Citrus sinensis). PLoS One 2014; 9:e104182. [PMID: 25148383 PMCID: PMC4141690 DOI: 10.1371/journal.pone.0104182] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 03/28/2014] [Accepted: 07/09/2014] [Indexed: 11/18/2022] Open
Abstract
Sweet orange (Citrus sinensis) is one of the major cultivated and most-consumed citrus species. With the goal of enhancing the genomic resources in citrus, we surveyed, developed and characterized microsatellite markers in the ≈347 Mb sequence assembly of the sweet orange genome. A total of 50,846 SSRs were identified with a frequency of 146.4 SSRs/Mbp. Dinucleotide repeats are the most frequent repeat class and the highest density of SSRs was found in chromosome 4. SSRs are non-randomly distributed in the genome and most of the SSRs (62.02%) are located in the intergenic regions. We found that AT-rich SSRs are more frequent than GC-rich SSRs. A total number of 21,248 SSR primers were successfully developed, which represents 89 SSR markers per Mb of the genome. A subset of 950 developed SSR primer pairs were synthesized and tested by wet lab experiments on a set of 16 citrus accessions. In total we identified 534 (56.21%) polymorphic SSR markers that will be useful in citrus improvement. The number of amplified alleles ranges from 2 to 12 with an average of 4 alleles per marker and an average PIC value of 0.75. The newly developed sweet orange primer sequences, their in silico PCR products, exact position in the genome assembly and putative function are made publicly available. We present the largest number of SSR markers ever developed for a citrus species. Almost two thirds of the markers are transferable to 16 citrus relatives and may be used for constructing a high density linkage map. In addition, they are valuable for marker-assisted selection studies, population structure analyses and comparative genomic studies of C. sinensis with other citrus related species. Altogether, these markers provide a significant contribution to the citrus research community.
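The polymorphism information content (PIC) mentioned above is commonly computed from allele frequencies at a marker. The abstract does not state which formula was used, so the sketch below assumes the widely used Botstein et al. form, PIC = 1 − Σ pᵢ² − Σ_{i<j} 2 pᵢ² pⱼ². The function name and the example frequencies are hypothetical.

```python
from itertools import combinations


def polymorphism_information_content(allele_frequencies):
    """Compute PIC for one marker from its allele frequencies (must sum to 1)."""
    freqs = list(allele_frequencies)
    if not freqs or abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must be non-empty and sum to 1")
    # Expected homozygosity term: sum of squared allele frequencies.
    homozygosity = sum(p * p for p in freqs)
    # Correction for uninformative matings between identical heterozygotes.
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical markers: two equally frequent alleles vs. four equally frequent alleles.
pic_two_alleles = polymorphism_information_content([0.5, 0.5])
pic_four_alleles = polymorphism_information_content([0.25, 0.25, 0.25, 0.25])
```

Under this formula, markers with more alleles at balanced frequencies score higher, which is why PIC is often reported alongside the number of alleles per marker.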
Affiliation(s)
- Manosh Kumar Biswas
- Key Laboratory of Horticultural Plant Biology of Ministry of Education (MOE), Huazhong Agricultural University, Wuhan, Hubei, P.R. China
- Qiang Xu
- Key Laboratory of Horticultural Plant Biology of Ministry of Education (MOE), Huazhong Agricultural University, Wuhan, Hubei, P.R. China
- Xiuxin Deng
- Key Laboratory of Horticultural Plant Biology of Ministry of Education (MOE), Huazhong Agricultural University, Wuhan, Hubei, P.R. China
|
16
|
Chen B, Zhang YJ, He Z, Li W, Si F, Tang Y, He Q, Qiao L, Yan Z, Fu W, Che Y. De novo transcriptome sequencing and sequence analysis of the malaria vector Anopheles sinensis (Diptera: Culicidae). Parasit Vectors 2014; 7:314. [PMID: 25000941 PMCID: PMC4105132 DOI: 10.1186/1756-3305-7-314] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 02/15/2014] [Accepted: 06/23/2014] [Indexed: 11/10/2022] Open
Abstract
Background Anopheles sinensis is the major malaria vector in China and Southeast Asia. Vector control is one of the most effective measures to prevent malaria transmission. However, there is little transcriptome information available for this malaria vector. To better understand the biological basis of malaria transmission and to develop novel and effective means of vector control, there is a need to build a transcriptome dataset for functional genomics analysis by large-scale RNA sequencing (RNA-seq). Methods To provide a more comprehensive and complete transcriptome of An. sinensis, RNA from eggs, larvae, pupae, male adults and female adults was pooled for cDNA preparation, sequenced using the Illumina paired-end sequencing technology and assembled into unigenes. These unigenes were then analyzed for genome mapping, functional annotation, homology, codon usage bias and simple sequence repeats (SSRs). Results Approximately 51.6 million clean reads were obtained, trimmed, and assembled into 38,504 unigenes with an average length of 571 bp, an N50 of 711 bp, and an average GC content of 51.26%. Among them, 98.4% of unigenes could be mapped onto the reference genome, and 69% of unigenes could be annotated with known biological functions. Homology analysis identified a number of An. sinensis unigenes that showed homology to, or were putative 1:1 orthologues in, the genomes of other Dipteran species. Codon usage bias was analyzed and 1,904 SSRs were detected, which will provide effective molecular markers for the population genetics of this species. Conclusions Our data and analysis provide the most comprehensive transcriptomic resource and characteristics currently available for An. sinensis, and will facilitate genetic and genomic studies and further vector control of An. sinensis.
Affiliation(s)
- Bin Chen
- Institute of Entomology and Molecular Biology, College of Life Sciences, Chongqing Normal University, Chongqing, P. R. China.
|
17
|
Asadi AA, Rashidi Monfared S. Characterization of EST-SSR markers in durum wheat EST library and functional analysis of SSR-containing EST fragments. Mol Genet Genomics 2014; 289:625-40. [PMID: 24652471 DOI: 10.1007/s00438-014-0839-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 09/02/2013] [Accepted: 03/01/2014] [Indexed: 11/28/2022]
Abstract
The goal of this study was to identify and characterize expressed sequence tag (EST)-simple sequence repeat (SSR) markers from a durum wheat EST library and to functionally analyze the SSR-containing EST sequences for application in comparative genomics and breeding. Of the 19,141 sequences analyzed, 18,937 ESTs were selected. According to the MISA results, 313 EST-SSRs were identified. The final EST-SSRs were compared against the GenBank non-redundant database using BLASTX and classified by function. The results indicated that perfect EST-SSRs were the most frequent. The TTG/CTG imperfect EST-SSR had a putative gamma-gliadin function, which may be appropriate for durum wheat. Mononucleotides and trinucleotides were the most frequent repeat classes. The identified EST-SSRs could be categorized into 83 types. The motifs TTG among trinucleotides and TC among dinucleotides had the highest frequency. TTG is a new motif in durum wheat identified in this study. We identified new EST-SSRs with motifs longer than trinucleotides and detected motifs that have the potential to code for amino acids. Arginine was the most frequent amino acid. Enzymes had the highest frequency among predicted functions. The EST-SSRs identified in this study can be used to develop an EST-SSR-based detection tool for durum wheat in future studies and will be a useful resource for molecular breeding, genetics, genomics, and environmental stress studies. Motifs coding for amino acids could be used as a new source of functional markers and for biological study. In addition, the newly designed PCR primer pairs are new resources for identifying useful alleles in transcription factors, storage proteins, and enzymes, which can be incorporated again into the cultivated material.
Affiliation(s)
- Ali Akbar Asadi
- Plant Breeding and Biotechnology Department, Agriculture College, Tarbiat Modares University, Tehran, Iran.
|
18
|
Abstract
Tandem repeats (TRs) extensively exist in the genomes of prokaryotes and eukaryotes. Based on the sequenced genomes and gene annotations of 31 plant and algal species in Phytozome version 8.0 (http://www.phytozome.net/), we examined TRs in a genome-wide scale, characterized their distributions and motif features, and explored their putative biological functions. Among the 31 species, no significant correlation was detected between the TR density and genome size. Interestingly, green alga Chlamydomonas reinhardtii (42,059 bp/Mbp) and castor bean Ricinus communis (55,454 bp/Mbp) showed much higher TR densities than all other species (13,209 bp/Mbp on average). In the 29 land plants, including 22 dicots, 5 monocots, and 2 bryophytes, 5′-UTR and upstream intergenic 200-nt (UI200) regions had the first and second highest TR densities, whereas in the two green algae (C. reinhardtii and Volvox carteri) the first and second highest densities were found in intron and coding sequence (CDS) regions, respectively. In CDS regions, trinucleotide and hexanucleotide motifs were those most frequently represented in all species. In intron regions, especially in the two green algae, significantly more TRs were detected near the intron–exon junctions. Within intergenic regions in dicots and monocots, more TRs were found near both the 5′ and 3′ ends of genes. GO annotation in two green algae revealed that the genes with TRs in introns are significantly involved in transcriptional and translational processing. As the first systematic examination of TRs in plant and green algal genomes, our study showed that TRs displayed nonrandom distribution for both intragenic and intergenic regions, suggesting that they have potential roles in transcriptional or translational regulation in plants and green algae.
|
19
|
Sahu J, Sarmah R, Dehury B, Sarma K, Sahoo S, Sahu M, Barooah M, Modi MK, Sen P. Mining for SSRs and FDMs from expressed sequence tags of Camellia sinensis. Bioinformation 2012; 8:260-6. [PMID: 22493533 PMCID: PMC3321235 DOI: 10.6026/97320630008260] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 03/16/2012] [Accepted: 03/21/2012] [Indexed: 11/23/2022] Open
Abstract
Simple sequence repeats (SSRs) developed from expressed sequence tags (ESTs), known as EST-SSRs, are a widely used and potentially valuable source of gene-based markers because of their high cross-taxon portability and their rapid, less expensive development. EST sequence information in publicly available databases is increasing at a fast rate. Emerging computational approaches provide a better alternative to conventional methods for developing SSR markers from ESTs. In the present study, 12,851 EST sequences of Camellia sinensis, downloaded from the National Center for Biotechnology Information (NCBI), were mined for the development of microsatellites. After preprocessing and assembly with various computational tools, 6148 non-redundant EST sequences (4779 singletons and 1369 contigs) were obtained. Out of a total of 3822.68 kb of sequence examined, 1636 (26.61%) EST sequences containing 2371 SSRs were detected, with a density of 1 SSR per 1.61 kb, leading to the development of 245 primer pairs. These mined EST-SSR markers will support further study of variability, mapping, and evolutionary relationships in Camellia sinensis. In addition, the developed SSRs can also be applied in studies across species.
Affiliation(s)
- Jagajjit Sahu
- Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
- Ranjan Sarmah
- Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
- Budheswar Dehury
- Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
- Kishore Sarma
- Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
- Smita Sahoo
- Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
- Mousumi Sahu
- Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
- Madhumita Barooah
- Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
- Mahendra Kumar Modi
- Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
- Priyabrata Sen
- Agri-Bioinformatics Promotion Programme, Department of Agricultural Biotechnology, Assam Agricultural University, Jorhat-785013, Assam, India
|
20
|
Development and characterization of 18 novel EST-SSRs from the western flower Thrips, Frankliniella occidentalis (Pergande). Int J Mol Sci 2012; 13:2863-2876. [PMID: 22489130 PMCID: PMC3317692 DOI: 10.3390/ijms13032863] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Received: 12/06/2011] [Revised: 02/27/2012] [Accepted: 02/28/2012] [Indexed: 01/01/2023] Open
Abstract
The western flower thrips, Frankliniella occidentalis (Pergande), is an invasive species and the most economically important pest within the insect order Thysanoptera. For a better understanding of the genetic makeup and migration patterns of F. occidentalis throughout the world, we characterized 18 novel polymorphic EST-derived microsatellites. The mutational mechanism of these EST-SSRs was also investigated to facilitate the selection of appropriate combinations of markers for population genetic studies. Genetic diversity of these novel markers was assessed in 96 individuals from three populations in China (Harbin, Dali, and Guiyang). The results showed that all these 18 loci were highly polymorphic; the number of alleles ranged from 2 to 15, with an average of 5.50 alleles per locus. The observed (HO) and expected (HE) heterozygosities ranged from 0.072 to 0.707 and 0.089 to 0.851, respectively. Furthermore, only two locus/population combinations (WFT144 in Dali and WFT50 in Guiyang) significantly deviated from Hardy–Weinberg equilibrium (HWE). Pairwise FST analysis showed a low but significant differentiation (0.026 < FST < 0.032) among all three pairwise population comparisons. Sequence analysis of alleles per locus revealed a complex mutational pattern of these EST-SSRs. Thus, these EST-SSRs are useful markers but greater attention should be paid to the mutational characteristics of these microsatellites when they are used in population genetic studies.
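Observed and expected heterozygosity, as reported above for each locus, can be computed from diploid genotypes. In the sketch below, HO is the fraction of heterozygous individuals and HE is 1 − Σ pᵢ² over allele frequencies. The abstract does not say whether a small-sample correction was applied, so the uncorrected estimator is an assumption, and the genotypes (allele sizes in bp) are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (H_O, H_E) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    """
    if not genotypes:
        raise ValueError("need at least one genotype")
    n_individuals = len(genotypes)
    # Observed heterozygosity: share of individuals carrying two different alleles.
    h_obs = sum(a != b for a, b in genotypes) / n_individuals
    # Allele frequencies from the 2N gene copies sampled.
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    n_copies = 2 * n_individuals
    # Expected heterozygosity under Hardy-Weinberg proportions (uncorrected; an assumption).
    h_exp = 1.0 - sum((count / n_copies) ** 2 for count in allele_counts.values())
    return h_obs, h_exp


# Hypothetical genotypes at one microsatellite locus.
h_obs, h_exp = heterozygosity([(150, 150), (150, 154), (154, 154), (150, 154)])
```

A large gap between the two values at a locus is the sort of signal that prompts a Hardy–Weinberg equilibrium test like the one mentioned in the abstract.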
|
21
|
Hamarsheh O, Amro A. Characterization of simple sequence repeats (SSRs) from Phlebotomus papatasi (Diptera: Psychodidae) expressed sequence tags (ESTs). Parasit Vectors 2011; 4:189. [PMID: 21958493 PMCID: PMC3191335 DOI: 10.1186/1756-3305-4-189] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 09/01/2011] [Accepted: 09/29/2011] [Indexed: 10/31/2022] Open
Abstract
BACKGROUND Phlebotomus papatasi is a natural vector of Leishmania major, which causes cutaneous leishmaniasis in many countries. Simple sequence repeats (SSRs), or microsatellites, are common in eukaryotic genomes and are short, repeated nucleotide sequence elements arrayed in tandem and flanked by non-repetitive regions. The enrichment methods used previously for finding new microsatellite loci in sand flies remain laborious and time consuming; in silico mining, which includes retrieval and screening of microsatellites from large amounts of sequence data in sequence databases using microsatellite search tools, can yield many new candidate markers. RESULTS Simple sequence repeats (SSRs) were characterized in P. papatasi expressed sequence tags (ESTs) derived from a public database, the National Center for Biotechnology Information (NCBI). A total of 42,784 sequences were mined, and 1,499 SSRs were identified with a frequency of 3.5% and an average density of 15.55 kb per SSR. Dinucleotide motifs were the most common SSRs, accounting for 67%, followed by tri-, tetra-, and penta-nucleotide repeats, accounting for 31.1%, 1.5%, and 0.1%, respectively. The length of microsatellites varied from 5 to 16 repeats. Among dinucleotide types, AG and CT had the highest frequency. Dinucleotide SSR-ESTs are relatively biased toward an excess of (AX)n repeats and a low GC base content. Forty primer pairs were designed based on motif lengths for further experimental validation. CONCLUSION The first large-scale survey of SSRs derived from P. papatasi is presented; the dinucleotide SSRs identified are more frequent than other types. EST data mining is an effective strategy to identify functional microsatellites in P. papatasi.
Affiliation(s)
- Omar Hamarsheh
- Department of Biological Sciences, Faculty of Science and Technology, Al-Quds University, PO Box 51000, Jerusalem, Palestine.
|
22
|
Miller CA, Buckley KM, Easley RL, Smith LC. An Sp185/333 gene cluster from the purple sea urchin and putative microsatellite-mediated gene diversification. BMC Genomics 2010; 11:575. [PMID: 20955585 PMCID: PMC3091723 DOI: 10.1186/1471-2164-11-575] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Received: 06/07/2010] [Accepted: 10/18/2010] [Indexed: 11/19/2022] Open
Abstract
Background The immune system of the purple sea urchin, Strongylocentrotus purpuratus, is complex and sophisticated. An important component of sea urchin immunity is the Sp185/333 gene family, which is significantly upregulated in immunologically challenged animals. The Sp185/333 genes are less than 2 kb with two exons and are members of a large diverse family composed of greater than 40 genes. The S. purpuratus genome assembly, however, contains only six Sp185/333 genes. This underrepresentation could be due to the difficulties that large gene families present in shotgun assembly, where multiple similar genes can be collapsed into a single consensus gene. Results To understand the genomic organization of the Sp185/333 gene family, a BAC insert containing Sp185/333 genes was assembled, with careful attention to avoiding artifacts resulting from collapse or artificial duplication/expansion of very similar genes. Twelve candidate BAC assemblies were generated with varying parameters and the optimal assembly was identified by PCR, restriction digests, and subclone sequencing. The validated assembly contained six Sp185/333 genes that were clustered in a 34 kb region at one end of the BAC with five of the six genes tightly clustered within 20 kb. The Sp185/333 genes in this cluster were no more similar to each other than to previously sequenced Sp185/333 genes isolated from three different animals. This was unexpected given their proximity and putative effects of gene homogenization in closely linked, similar genes. All six genes displayed significant similarity including both 5' and 3' flanking regions, which were bounded by microsatellites. Three of the Sp185/333 genes and their flanking regions were tandemly duplicated such that each repeated segment consisted of a gene plus 0.7 kb 5' and 2.4 kb 3' of the gene (4.5 kb total). Both edges of the segmental duplications were bounded by different microsatellites. 
Conclusions The high sequence similarity of the Sp185/333 genes and flanking regions suggests that the microsatellites may promote genomic instability and are involved in gene duplication and/or gene conversion and in the extraordinary sequence diversity of this family.
Affiliation(s)
- Chase A Miller
- Genomics and Bioinformatics Program, Department of Biochemistry, School of Medicine, The George Washington University, Washington, DC 20037, USA
|
23
|
Rouchka EC. Database of exact tandem repeats in the Zebrafish genome. BMC Genomics 2010; 11:347. [PMID: 20515480 PMCID: PMC2901318 DOI: 10.1186/1471-2164-11-347] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 09/10/2009] [Accepted: 06/01/2010] [Indexed: 11/23/2022] Open
Abstract
Background Sequencing of the approximately 1.7 billion bases of the zebrafish genome is currently underway. To date, few high resolution genetic maps exist for the zebrafish genome, based mainly on single nucleotide polymorphisms (SNPs) and short microsatellite repeats. The desire to construct a higher resolution genetic map led to the construction of a database of tandemly repeating elements within the zebrafish Zv8 assembly. Description Exact tandem repeats with a repeat length of at least three bases and a copy number of at least 10 were reported. Repeats with a total length of 250 or fewer bases and their flanking regions were masked for known vertebrate repeats. Optimal primer pairs were computationally designed in the regions flanking the detected repeats. This database of exact tandem repeats can then be used as a resource by molecular biologists with interests in experimentally testing VNTRs within a zebrafish population. Conclusions A total of 116,915 repeats with a base length of at least three nucleotides were detected. The longest of these was a 54-base repeat with fourteen tandem copies. A significant number of repeats with a base length of 18, 24, 27 and 30 were detected, many with potentially novel proline-rich coding regions. Detection of exact tandem repeats in the zebrafish genome leads to a wealth of information regarding potential polymorphic sites for VNTRs. The association of many of these repeats with potentially novel yet similar coding regions yields an exciting potential for disease associated genes. A web interface for querying repeats is available at http://bioinformatics.louisville.edu/zebrafish/. This portal allows users to search for repeats of a selected base size from any valid specified region within the 25 linkage groups.
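The detection criteria described above (an exact repeat unit of at least three bases with at least 10 tandem copies) can be illustrated with a small regular-expression scan. This is a sketch only and is not the method used to build the database. The function names, the `max_unit` cap, and the choice to skip units that are themselves repetitions of a shorter unit are all assumptions made for illustration.

```python
import re


def is_primitive(unit):
    """True if the unit is not itself a repetition of a shorter unit (e.g. ACGACG is not)."""
    return unit not in (unit + unit)[1:-1]


def find_exact_tandem_repeats(seq, min_unit=3, max_unit=6, min_copies=10):
    """Return sorted (start, unit, copies) tuples for exact tandem repeats in seq.

    min_unit and min_copies follow the thresholds quoted in the abstract;
    max_unit is a hypothetical cap chosen to keep the example small.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A k-base unit followed by at least (min_copies - 1) exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if is_primitive(unit):
                hits.append((match.start(), unit, len(match.group(0)) // k))
    return sorted(hits)


# Hypothetical sequence: 20 copies of ACG flanked by non-repetitive bases.
repeats = find_exact_tandem_repeats("TT" + "ACG" * 20 + "TT")
```

Raising `max_unit` would be needed to reach longer units such as the 54-base repeat mentioned in the abstract; the primitive-unit filter simply avoids reporting the same array again under a doubled unit.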
Affiliation(s)
- Eric C Rouchka
- Department of Computer Engineering and Computer Science, Speed School of Engineering, University of Louisville, Duthie Center, Room 208, Louisville, KY, USA.
|
24
|
da Maia LC, de Souza VQ, Kopp MM, de Carvalho FIF, de Oliveira AC. Tandem repeat distribution of gene transcripts in three plant families. Genet Mol Biol 2009; 32:822-33. [PMID: 21637460 PMCID: PMC3036893 DOI: 10.1590/s1415-47572009005000091] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 02/27/2009] [Accepted: 06/17/2009] [Indexed: 12/02/2022] Open
Abstract
Tandem repeats (microsatellites or SSRs) are molecular markers with great potential for plant genetic studies. Modern strategies include the transfer of these markers among widely studied and orphan species. In silico analyses allow for studying distribution patterns of microsatellites and predicting which motifs would be more amenable to interspecies transfer. Transcribed sequences (Unigene) from ten species of three plant families were surveyed for the occurrence of micro and minisatellites. Transcripts from different species displayed different rates of tandem repeat occurrence, ranging from 1.47% to 11.28%. Both similar and different patterns were found within and among plant families. The results also indicate a lack of association between genome size and tandem repeat fractions in expressed regions. The conservation of motifs among species and its implication on genome evolution and dynamics are discussed.
Affiliation(s)
- Luciano Carlos da Maia
- Centro de Genômica e Fitomelhoramento, Faculdade de Agronomia Eliseu Maciel, Universidade Federal de Pelotas, Pelotas, RS Brazil
25
|
Bagshaw ATM, Pitt JPW, Gemmell NJ. High frequency of microsatellites in S. cerevisiae meiotic recombination hotspots. BMC Genomics 2008; 9:49. [PMID: 18226240 PMCID: PMC2267716 DOI: 10.1186/1471-2164-9-49] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Received: 08/24/2007] [Accepted: 01/28/2008] [Indexed: 11/25/2022]
Abstract
Background: Microsatellites are highly abundant in eukaryotic genomes but their function and evolution are not yet well understood. Their elevated mutation rate makes them ideal markers of genetic difference, but high levels of unexplained heterogeneity in mutation rates among microsatellites at different genomic locations need to be elucidated in order to improve the power and accuracy of the many types of study that use them as genetic markers. Recombination could contribute to this heterogeneity, since while replication errors are thought to be the predominant mechanism for microsatellite mutation, meiotic recombination is involved in some mutation events. There is also evidence suggesting that microsatellites could function as recombination signals. The yeast S. cerevisiae is a useful model organism with which to further explore the link between microsatellites and recombination, since it is very amenable to genetic study, and meiotic recombination hotspots have been mapped throughout its entire genome. Results: We examined in detail the relationship between microsatellites and hotspots of meiotic double-strand breaks, the precursors of meiotic recombination, throughout the S. cerevisiae genome. We included all tandem repeats with motif length (repeat period) between one and six base pairs. Long, short and two-copy arrays were considered separately. We found that long mono-, di- and trinucleotide microsatellites are around twice as frequent in hot than non-hot intergenic regions. The associations are weak or absent for repeats with fewer than six copies, and also for microsatellites with 4–6 base pair motifs, but high-copy arrays with motif length greater than three are relatively very rare throughout the genome. We present evidence that the association between high-copy, short-motif microsatellites and recombination hotspots is not driven by effects on microsatellite distribution of other factors previously linked to both recombination and microsatellites, including transcription, GC-content and transposable elements. Conclusion: Our findings suggest that a mutation bias relating to recombination hotspots causing repeats to form and grow, and/or regulation of a subset of hotspots by simple sequences, may be significant processes in yeast. Some previous evidence has cast doubt on both of these possibilities, and as a result they have not been explored on a large scale, but the strength of the association we report suggests that they deserve further experimental testing.
26
|
da Maia LC, Palmieri DA, de Souza VQ, Kopp MM, de Carvalho FIF, Costa de Oliveira A. SSR Locator: Tool for Simple Sequence Repeat Discovery Integrated with Primer Design and PCR Simulation. International Journal of Plant Genomics 2008; 2008:412696. [PMID: 18670612 PMCID: PMC2486402 DOI: 10.1155/2008/412696] [Citation(s) in RCA: 115] [Impact Index Per Article: 6.8] [Received: 10/05/2007] [Revised: 01/29/2008] [Accepted: 05/20/2008] [Indexed: 05/18/2023]
Abstract
Microsatellites or SSRs (simple sequence repeats) are ubiquitous short tandem duplications occurring in eukaryotic organisms. These sequences are among the best marker technologies applied in plant genetics and breeding. The abundant genomic, BAC, and EST sequences available in databases allow surveys of the presence and location of SSR loci. Additional information concerning primer sequences is also sought by plant geneticists and breeders. In this paper, we describe a utility that integrates SSR searches, frequency of occurrence of motifs and arrangements, primer design, and PCR simulation against other databases. This simulation allows the performance of global alignments and identity and homology searches between different amplified sequences, that is, amplicons. In order to validate the tool functions, SSR discovery searches were performed in a database containing 28 469 nonredundant rice cDNA sequences.
Affiliation(s)
- Luciano Carlos da Maia
- Plant Genomics and Breeding Laboratory, Eliseu Maciel School of Agronomy, Federal University of Pelotas, Pelotas, RS 96.001-970, Brazil
- Dario Abel Palmieri
- Laboratory for Environmental Studies, Catholic University of Salvador, Salvador, BA, 40.220-140, Brazil
- Velci Queiroz de Souza
- Plant Genomics and Breeding Laboratory, Eliseu Maciel School of Agronomy, Federal University of Pelotas, Pelotas, RS 96.001-970, Brazil
- Mauricio Marini Kopp
- Plant Genomics and Breeding Laboratory, Eliseu Maciel School of Agronomy, Federal University of Pelotas, Pelotas, RS 96.001-970, Brazil
- Fernando Irajá Félix de Carvalho
- Plant Genomics and Breeding Laboratory, Eliseu Maciel School of Agronomy, Federal University of Pelotas, Pelotas, RS 96.001-970, Brazil
- Antonio Costa de Oliveira
- Plant Genomics and Breeding Laboratory, Eliseu Maciel School of Agronomy, Federal University of Pelotas, Pelotas, RS 96.001-970, Brazil
- *Antonio Costa de Oliveira:
27
|
Zhang L, Chen C, Cheng J, Wang S, Hu X, Hu J, Bao Z. Initial analysis of tandemly repetitive sequences in the genome of Zhikong scallop (Chlamys farreri Jones et Preston). DNA Sequence: The Journal of DNA Sequencing and Mapping 2007; 19:195-205. [PMID: 17852361 DOI: 10.1080/10425170701462316] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 10/22/2022]
Abstract
Tandemly repetitive sequences are widespread in all eukaryotic genomes, but data on tandem repeats are limited in Zhikong scallop (Chlamys farreri). In the present study, paired-end sequencing of 2016 individual fosmid clones resulted in 3646 sequences. A total of 2,286,986 bp of genomic sequences were generated, representing approximately 1.84 per thousand of the Zhikong scallop genome. Using Tandem Repeats Finder (TRF) software, a total of 2500 tandem repeats were found, including 313 satellites, 1816 minisatellites and 371 microsatellites. The cumulative length of tandem repeats was 552,558 bp, accounting for 24.16% of total length. Specifically, the length of microsatellites, minisatellites and satellites was 9425, 336,001 and 207,132 bp, accounting for 1.71, 60.81 and 37.49% of the length of tandem repeats, and 0.41, 14.69 and 9.06% of total length, respectively. Detailed information on the characteristics of all repeat units is also presented, which will provide a useful resource for physical mapping and better utilization of the existing genomic information in Zhikong scallop.
Affiliation(s)
- Lingling Zhang
- Division of Life Science and Technology, Laboratory of Marine Genetics and Breeding, Ocean University of China, Qingdao, People's Republic of China.
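The percentages quoted in the Zhikong scallop abstract above follow directly from the reported base-pair totals. The short Python check below recomputes them from those totals. The variable and function names are illustrative only, and the figures are taken from the abstract rather than from the underlying data.

```python
# Base-pair totals as reported in the abstract.
TOTAL_BP = 2_286_986
CLASS_BP = {
    "microsatellites": 9_425,
    "minisatellites": 336_001,
    "satellites": 207_132,
}


def percent(part, whole):
    """Share of part in whole, as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Cumulative tandem repeat length is the sum of the three classes.
tandem_bp = sum(CLASS_BP.values())

if __name__ == "__main__":
    print("tandem repeats:", tandem_bp, "bp,", percent(tandem_bp, TOTAL_BP), "% of total")
    for name, bp in CLASS_BP.items():
        print(name, percent(bp, tandem_bp), "% of repeats,", percent(bp, TOTAL_BP), "% of total")
```

The three within-repeat shares sum to 100.01 rather than 100 because each is rounded independently.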
28
|
Hammock EAD. Gene Regulation as a Modulator of Social Preference in Voles. Genetics of Sexual Differentiation and Sexually Dimorphic Behaviors 2007; 59:107-27. [PMID: 17888796 DOI: 10.1016/s0065-2660(07)59004-8] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 12/12/2022]
Abstract
Most mammalian species are nonmonogamous: the female alone cares for the young and males and females do not share nest sites. Within the genus Microtus, there exists ample diversity in social structure for neuroethological and neurobiological investigation. Prairie voles (Microtus ochrogaster) are socially monogamous: both the males and females contribute to care of the young within a shared nest site as a breeding pair through multiple breeding seasons. Closely related species such as the montane (M. montanus) and meadow (M. pennsylvanicus) voles do not typically show these behaviors. Over a decade of research has demonstrated that species differences in neuropeptide systems play significant roles in the behavioral divergence of these species. In particular, species differences in regional gene expression patterns of neuropeptide receptors in the brain mediate some of the behavioral traits associated with the divergence in social structure. Differences in gene expression patterns of a key gene in mediating social behavior, the arginine vasopressin 1a receptor (avpr1a), appear to be due to species divergence in a repeat locus in the 5' regulatory region of avpr1a. This highly repetitive locus is prone to expansion and contraction over relatively short evolutionary timescales and may give rise to the rapid evolution of sociobehavioral traits.
Affiliation(s)
- Elizabeth A D Hammock
- Department of Pharmacology, Vanderbilt Kennedy Center for Research on Human Development, Vanderbilt University, Nashville, Tennessee 37232, USA
29
|
Miao XX, Xub SJ, Li MH, Li MW, Huang JH, Dai FY, Marino SW, Mills DR, Zeng P, Mita K, Jia SH, Zhang Y, Liu WB, Xiang H, Guo QH, Xu AY, Kong XY, Lin HX, Shi YZ, Lu G, Zhang X, Huang W, Yasukochi Y, Sugasaki T, Shimada T, Nagaraju J, Xiang ZH, Wang SY, Goldsmith MR, Lu C, Zhao GP, Huang YP. Simple sequence repeat-based consensus linkage map of Bombyx mori. Proc Natl Acad Sci U S A 2005; 102:16303-8. [PMID: 16263926 PMCID: PMC1283447 DOI: 10.1073/pnas.0507794102] [Citation(s) in RCA: 72] [Impact Index Per Article: 3.6] [Received: 01/02/2005] [Indexed: 11/18/2022]
Abstract
We established a genetic linkage map employing 518 simple sequence repeat (SSR, or microsatellite) markers for Bombyx mori (silkworm), the economically and culturally important lepidopteran insect, as part of an international genomics program. A survey of six representative silkworm strains using 2,500 (CA)n- and (CT)n-based SSR markers revealed 17-24% polymorphism, indicating a high degree of homozygosity resulting from a long history of inbreeding. Twenty-nine SSR linkage groups were established in well-characterized Dazao and C108 strains based on genotyping of 189 backcross progeny derived from an F1 male mated with a C108 female. The clustering was further focused to 28 groups by genotyping 22 backcross progeny derived from an F1 female mated with a C108 male. This set of SSR linkage groups was further assigned to the 28 chromosomes (established linkage groups) of silkworm aided by visible mutations and cleaved amplified polymorphic sequence markers developed from previously mapped genes, cDNA sequences, and cloned random amplified polymorphic DNAs. By integrating a visible mutation p (plain, larval marking) and 29 well-conserved genes of insects onto this SSR-based linkage map, a second-generation consensus silkworm genetic map with a range of 7-40 markers per linkage group and a total map length of approximately 3431.9 cM was constructed, and its high efficiency for genotyping and potential application for synteny studies of Lepidoptera and other insects was demonstrated.
Affiliation(s)
- Xue-Xia Miao
- Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 300 Fenglin Road, Shanghai 200032, China