1
Pramanik K, Goswami AK, Kumar C, Singh R, Prabha R, Jha SK, Thakre M, Goswami S, Aditya K, Maurya A, Chanda S, Mishra P, Sarkar S, Kashyap A. Development of genome-wide SSR markers through in silico mining of guava (Psidium guajava L.) genome for genetic diversity analysis and transferability studies across species and genera. Frontiers in Plant Science 2025; 16:1527866. [PMID: 40353228 PMCID: PMC12062180 DOI: 10.3389/fpls.2025.1527866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2024] [Accepted: 04/01/2025] [Indexed: 05/14/2025]
Abstract
Guava (Psidium guajava L.) is an economically important fruit crop, rich in nutrients and grown in tropical and subtropical regions around the world. Sufficient genomic resources are crucial for enhancing breeding efficiency and facilitating molecular breeding in crop species. However, genomic resources, especially microsatellite or simple sequence repeat (SSR) markers, are limited in guava. Therefore, novel genome-wide SSR markers were developed from the chromosome assembly (GCA_016432845.1) of the "New Age" cultivar using the GMATA software. The software scanned about 397.8 million base pairs (Mbp) of the guava genome sequence; 87,372 SSR loci were used for primer design, ultimately yielding 75,084 new SSR markers. After in silico analysis, a total of 75 g-SSR markers were chosen to screen 35 guava genotypes, encompassing wild Psidium species and five jamun genotypes. Of the 72 amplified novel g-SSR markers (FHTGSSRs), 53 showed polymorphism, suggesting significant genetic variation among the guava genotypes, including wild species. The 53 polymorphic g-SSR markers had an average of 3.04 alleles per locus across the 35 selected guava genotypes. The mean values of major allele frequency, gene diversity, observed heterozygosity, and polymorphism information content were 0.73, 0.38, 0.13, and 0.33, respectively. Among the wild Psidium species studied, the transferability of these novel g-SSR loci ranged from 45.83% to 90.28%. Furthermore, 17 novel g-SSR markers were successfully amplified in all the selected Syzygium genotypes, of which only four markers could differentiate between the two Syzygium species. A neighbour-joining (N-J) tree constructed using the 53 polymorphic g-SSR markers classified the 35 guava genotypes into four clades and one outlier, emphasizing the genetic uniqueness of wild Psidium species compared to cultivated genotypes.
Model-based structure analysis divided the guava genotypes into two distinct genetic groups, a classification that was strongly supported by Principal Coordinate Analysis (PCoA). In addition, the AMOVA and PCoA analyses also indicated substantial genetic diversity among the selected guava genotypes, including wild Psidium species. Hence, the developed novel genome-wide genomic SSRs could enhance the availability of genomic resources and assist in the molecular breeding of guava.
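The per-locus summary statistics named in this abstract (major allele frequency, gene diversity, observed heterozygosity, and polymorphism information content) can be illustrated with a minimal sketch. It assumes diploid genotypes coded as pairs of allele sizes, takes gene diversity as 1 − Σp², and uses the Botstein et al. (1980) form of PIC. The abstract does not say which estimators or software the authors used, so the function name `locus_stats` and the example genotypes are hypothetical.

```python
from collections import Counter


def locus_stats(genotypes):
    """Summary statistics for one SSR locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    Assumption: no missing data; alleles are hashable labels (e.g. sizes in bp).
    """
    alleles = [allele for pair in genotypes for allele in pair]
    counts = Counter(alleles)
    total = len(alleles)
    freqs = [count / total for count in counts.values()]

    # Frequency of the most common allele.
    major_allele_freq = max(freqs)
    # Gene diversity (expected heterozygosity), simple biased form 1 - sum(p_i^2).
    gene_diversity = 1.0 - sum(p * p for p in freqs)
    # PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    cross_term = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    pic = gene_diversity - cross_term
    # Observed heterozygosity: share of individuals carrying two different alleles.
    observed_het = sum(1 for a, b in genotypes if a != b) / len(genotypes)

    return {
        "n_alleles": len(counts),
        "major_allele_freq": major_allele_freq,
        "gene_diversity": gene_diversity,
        "observed_het": observed_het,
        "pic": pic,
    }


# Hypothetical example: four individuals, two alleles (150 bp and 154 bp).
example = [(150, 150), (150, 154), (154, 154), (150, 150)]
stats = locus_stats(example)
```

Averaging each value over all polymorphic loci gives study-level means of the kind quoted in the abstract.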
Affiliation(s)
- Kritidipta Pramanik
  - Division of Fruits and Horticultural Technology, ICAR-Indian Agricultural Research Institute, New Delhi, India
- Amit Kumar Goswami
  - Division of Fruits and Horticultural Technology, ICAR-Indian Agricultural Research Institute, New Delhi, India
- Chavlesh Kumar
  - Division of Fruits and Horticultural Technology, ICAR-Indian Agricultural Research Institute, New Delhi, India
- Rakesh Singh
  - Division of Genomic Resources, ICAR-National Bureau of Plant Genetic Resources, New Delhi, India
- Ratna Prabha
  - Agricultural Knowledge Management Unit, ICAR-Indian Agricultural Research Institute, New Delhi, India
- Shailendra Kumar Jha
  - Division of Genetics, ICAR-Indian Agricultural Research Institute, New Delhi, India
- Madhubala Thakre
  - Division of Fruits and Horticultural Technology, ICAR-Indian Agricultural Research Institute, New Delhi, India
- Suneha Goswami
  - Division of Biochemistry, ICAR-Indian Agricultural Research Institute, New Delhi, India
- Kaustav Aditya
  - Division of Agricultural Statistics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Avantika Maurya
  - Division of Genomic Resources, ICAR-National Bureau of Plant Genetic Resources, New Delhi, India
- Sagnik Chanda
  - Division of Molecular Biology and Biotechnology, ICAR-Indian Agricultural Research Institute, New Delhi, India
- Prabhanshu Mishra
  - Division of Fruits and Horticultural Technology, ICAR-Indian Agricultural Research Institute, New Delhi, India
- Shilpa Sarkar
  - Department of Horticulture, PGCA, Dr. Rajendra Prasad Central Agricultural University, Pusa, Bihar, India
- Ankita Kashyap
  - Division of Fruits and Horticultural Technology, ICAR-Indian Agricultural Research Institute, New Delhi, India
2
Alves SIA, Dantas CWD, Macedo DB, Ramos RTJ. What are microsatellites and how to choose the best tool: a user-friendly review of SSR and 74 SSR mining tools. Front Genet 2024; 15:1474611. [PMID: 39606018 PMCID: PMC11599195 DOI: 10.3389/fgene.2024.1474611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2024] [Accepted: 10/30/2024] [Indexed: 11/29/2024]
Abstract
Microsatellites, also known as SSR or STR, are essential molecular markers in genomic research, playing crucial roles in genetic mapping, population genetics, and evolutionary studies. Their applications range from plant breeding to forensics, highlighting their diverse utility across disciplines. Despite their widespread use, traditional methods for SSR analysis are often laborious and time-consuming, requiring significant resources and expertise. To address these challenges, a variety of computational tools for SSR analysis have been developed, offering faster and more efficient alternatives to traditional methods. However, selecting the most appropriate tool can be daunting due to rapid technological advancements and the sheer number of options available. This study presents a comprehensive review and analysis of 74 SSR tools, aiming to provide researchers with a valuable resource for SSR analysis tool selection. The methodology employed includes thorough literature reviews, detailed tool comparisons, and in-depth analyses of tool functionality. By compiling and analyzing these tools, this study not only advances the field of genomic research but also contributes to the broader scientific community by facilitating informed decision-making in the selection of SSR analysis tools. Researchers seeking to understand SSRs and select the most appropriate tools for their projects will benefit from this comprehensive guide. Overall, this study enhances our understanding of SSR analysis tools, paving the way for more efficient and effective SSR research in various fields of study.
Affiliation(s)
- Sandy Ingrid Aguiar Alves
  - Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
  - Laboratory of Simulation and Computational Biology (SIMBIC), High Performance Computing Center (CCAD), Federal University of Pará, Belém, Pará, Brazil
  - Laboratory of Bioinformatics and Genomics of Microorganisms, Institute of Biological Sciences, Federal University of Pará, Belém, Pará, Brazil
- Carlos Willian Dias Dantas
  - Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
  - Laboratory of Simulation and Computational Biology (SIMBIC), High Performance Computing Center (CCAD), Federal University of Pará, Belém, Pará, Brazil
  - Laboratory of Bioinformatics and Genomics of Microorganisms, Institute of Biological Sciences, Federal University of Pará, Belém, Pará, Brazil
- Daralyns Borges Macedo
  - Laboratory of Simulation and Computational Biology (SIMBIC), High Performance Computing Center (CCAD), Federal University of Pará, Belém, Pará, Brazil
  - Laboratory of Bioinformatics and Genomics of Microorganisms, Institute of Biological Sciences, Federal University of Pará, Belém, Pará, Brazil
- Rommel Thiago Jucá Ramos
  - Laboratory of Simulation and Computational Biology (SIMBIC), High Performance Computing Center (CCAD), Federal University of Pará, Belém, Pará, Brazil
  - Laboratory of Bioinformatics and Genomics of Microorganisms, Institute of Biological Sciences, Federal University of Pará, Belém, Pará, Brazil
3
Liu K, Xie N. Pipeline for developing polymorphic microsatellites in species without reference genomes. 3 Biotech 2022; 12:248. [PMID: 36039078 PMCID: PMC9418399 DOI: 10.1007/s13205-022-03313-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2022] [Accepted: 08/16/2022] [Indexed: 11/01/2022]
Abstract
Microsatellites, also known as simple sequence repeats (SSRs), are the preferred type of marker for many genetic applications. In conjunction with the ongoing development of next-generation sequencing, several bioinformatic tools have been developed for identifying SSRs from genomic or transcriptomic sequences. Although these tools are handy for generating polymorphic SSRs, their application almost always depends on an existing reference genome or self-assembly of the reference genome. With this in mind, we propose a pipeline for developing polymorphic SSRs that may be applied to species without reference genomes. Using a species without a reference genome (black Amur bream; Megalobrama terminalis Richardson, 1846) as a model, our pipeline was able to effectively discover polymorphic SSRs. Under different R parameters of a reference-free single nucleotide polymorphisms (SNPs) caller (ebwt2InDel), a total of 258, 208, 102, and 11 polymorphic SSRs were mined. To quantify the accuracy of the polymorphic SSRs detected using our pipeline, we analyzed 25 SSRs with PCR experiments. All primers were successfully amplified, and most SSRs (23 SSRs, 92%) were polymorphic. From the 36 individual black Amur bream, we acquired an average of 3.36 alleles per locus, ranging from one to 11. This demonstrates the effectiveness of our pipeline in identifying polymorphic SSRs and designing primers for SSR genotyping. Ultimately, our pipeline can effectively mine polymorphic SSRs for species without reference genomes, complementing SSR mining approaches based on reference genomes and helping to resolve biological issues that accompany these methods. Supplementary Information The online version contains supplementary material available at 10.1007/s13205-022-03313-0.
Affiliation(s)
- Kai Liu
  - Institute of Fishery Science, Hangzhou Academy of Agricultural Sciences, Hangzhou, Zhejiang, China
- Nan Xie
  - Institute of Fishery Science, Hangzhou Academy of Agricultural Sciences, Hangzhou, Zhejiang, China
4
AutomAted RepeaT Identifier (AARTI): A tool to identify common, polymorphic, and unique microsatellites. Mitochondrion 2022; 65:161-165. [PMID: 35738354 DOI: 10.1016/j.mito.2022.06.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/25/2022] [Accepted: 06/19/2022] [Indexed: 11/21/2022]
Abstract
Here we present an automated computational pipeline used to mine 5976 mitochondrial genomes to identify common, polymorphic, and unique microsatellites, also known as simple sequence repeats (SSRs). Microsatellites are repetitive motifs of 1-6 bases in a DNA sequence. Due to their abundance and highly polymorphic nature, microsatellites have become one of the most widely used molecular/genetic markers, valuable for many studies including gene tagging, genetic diversity, and species identification. Several computational tools dedicated to mining and categorizing microsatellites in nucleotide sequences have been developed; however, there is no tool that can identify unique, common, and polymorphic microsatellites between each pair of nucleotide sequences. To explore such microsatellites, we have developed a fully automated computational pipeline named AutomAted RepeaT Identifier (AARTI). AARTI is the only tool to date that identifies common, polymorphic, and unique microsatellites between each pair of nucleotide sequences. The computational pipeline was constructed using the PERL programming language, and the web server for the pipeline was developed with the help of PHP, HTML, CSS, and JavaScript. It was successfully tested by reproducing the results of a previous study on 7 mitochondrial genomes of the genus Orthotrichum. Moreover, the pipeline was also applied to 5846 (Metazoa) and 130 (Viridiplantae) mitochondrial genomes. AARTI is freely available at https://lms.snu.edu.in/aarti/ and will certainly accelerate studies of length variation in microsatellites between species. Additionally, it will be useful in comparative genomic studies and for the computational characterization of microsatellites, and it has the potential to become a routine genome analysis pipeline for mitochondrial genomes.
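The definition above (repetitive motifs of 1-6 bases) can be made concrete with a minimal regex-based sketch of perfect-SSR detection. This is not AARTI's or any other listed tool's implementation; the function names and the minimum repeat thresholds in `MIN_REPEATS` are assumptions chosen for illustration and should be tuned per study.

```python
import re

# Assumed minimum repeat counts per motif length (hypothetical thresholds).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples for perfect SSRs.

    Coordinates are 0-based, end-exclusive. Only A/C/G/T motifs are considered.
    """
    seq = seq.upper()
    hits = []
    for k, min_rep in sorted(min_repeats.items()):
        # A k-base motif followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        pos = 0
        while True:
            m = pattern.search(seq, pos)
            if m is None:
                break
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
                pos = m.end()
            else:
                # Skip e.g. 'AA' reported as a dinucleotide, but keep scanning
                # from the next base so an overlapping true repeat is not missed.
                pos = m.start() + 1
    return sorted(hits)


# Hypothetical test sequence containing (AT)7, (A)12 and (CAG)5.
demo = "GG" + "AT" * 7 + "CCG" + "A" * 12 + "GTC" + "CAG" * 5 + "TT"
ssrs = find_ssrs(demo)
```

Comparing the tuples returned for two sequences (same motif at corresponding loci with equal or different repeat counts, or present in only one) is one simple way to think about the common, polymorphic, and unique categories described in the abstract.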
5
Automating microsatellite screening and primer design from multi-individual libraries using Micro-Primers. Sci Rep 2022; 12:295. [PMID: 34997147 PMCID: PMC8741888 DOI: 10.1038/s41598-021-04275-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/02/2021] [Accepted: 12/10/2021] [Indexed: 11/08/2022]
Abstract
Analysis of intra- and inter-population diversity has become important for defining the genetic status and distribution patterns of a species and is a powerful tool for conservation programs, as high levels of inbreeding could lead to the extinction of a whole population within a few generations. Microsatellites (SSRs) are commonly used in population studies, but discovering highly variable regions across species' genomes requires demanding computation and laboratory optimization. In this work, we combine next generation sequencing (NGS) with automatic computing to develop a genomic-oriented tool for characterizing SSRs at the population level. Herein, we describe a new Python pipeline, named Micro-Primers, designed to identify SSR loci from a multi-individual microsatellite library and to design PCR primers for their amplification. By combining commonly used programs for data cleaning and microsatellite mining, this pipeline easily generates, from a fastq file produced by high-throughput sequencing, standard information about the selected microsatellite loci, including the number of alleles in the population subset, and the melting temperature and respective PCR product of each primer set. Additionally, potential polymorphic loci can be identified based on the allele ranges observed in the population, to easily guide the selection of optimal markers for the species. Experimental results show that Micro-Primers significantly reduces processing time in comparison to manual analysis while keeping the same quality of results. The elapsed times at each step can be longer depending on the number of sequences to analyze and, if not assisted, the selection of polymorphic loci from multiple individuals can represent a major bottleneck in population studies.