1
|
Kharnaior P, Tamang JP. Microbiome and metabolome in home-made fermented soybean foods of India revealed by metagenome-assembled genomes and metabolomics. Int J Food Microbiol 2023; 407:110417. [PMID: 37774634 DOI: 10.1016/j.ijfoodmicro.2023.110417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 09/10/2023] [Accepted: 09/22/2023] [Indexed: 10/01/2023]
Abstract
Grep-chhurpi, peha, peron namsing and peruñyaan are lesser-known home-made fermented soybean foods prepared by the native people of Arunachal Pradesh in India. The present work aims to study the microbiome, its functional annotations, metabolites and the recovery of metagenome-assembled genomes (MAGs) in these four fermented soybean foods. Metagenomes revealed the dominance of bacteria (97.80%) with minor traces of viruses, eukaryotes and archaea. Bacillota was the most abundant phylum, and Bacillus subtilis was the most abundant species in all products. The metagenomes also revealed an abundance of lactic acid bacteria such as Enterococcus casseliflavus, Enterococcus faecium, Mammaliicoccus sciuri and Staphylococcus saprophyticus in all samples. Predictive metabolic pathways showed an abundance of genes associated with metabolism. Metabolomics analysis revealed both targeted and untargeted metabolites, which suggested their role in flavour development and therapeutic properties. High-quality MAGs, identified as B. subtilis, Enterococcus faecalis, Pediococcus acidilactici and B. velezensis, showed the presence of several biomarkers corresponding to various bio-functional properties. Gene clusters of secondary metabolites (antimicrobial peptides) and CRISPR-Cas systems were detected in all MAGs. The present work also provides key elements related to the cultivability of the species identified from MAGs for future use as starter cultures in fermented soybean food product development. Additionally, comparison of the microbiome and metabolites of grep-chhurpi, peron namsing and peruñyaan with those of other fermented soybean foods of Asia revealed a distinct difference.
Affiliation(s)
- Pynhunlang Kharnaior
- Department of Microbiology, Sikkim University, Science Building, Tadong 737102, Gangtok, Sikkim, India
- Jyoti Prakash Tamang
- Department of Microbiology, Sikkim University, Science Building, Tadong 737102, Gangtok, Sikkim, India
|
2
|
Behboudi R, Nouri-Baygi M, Naghibzadeh M. RPTRF: A rapid perfect tandem repeat finder tool for DNA sequences. Biosystems 2023; 226:104869. [PMID: 36858110 DOI: 10.1016/j.biosystems.2023.104869] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/01/2022] [Revised: 01/23/2023] [Accepted: 02/23/2023] [Indexed: 03/02/2023]
Abstract
The sequencing of eukaryotic genomes has shown that tandem repeats are abundant in their sequences. In addition to affecting some cellular processes, tandem repeats in the genome may be associated with specific diseases and have been the key to resolving criminal cases. Any tool developed for detecting tandem repeats must be accurate, fast, and usable in thousands of laboratories worldwide, including those with limited computing capabilities. The proposed method, the Rapid Perfect Tandem Repeat Finder (RPTRF), reduces redundant character comparisons by indexing the input file, and uses an interval tree in its filtering stage to speed up processing and produce output without artifacts. The experiments demonstrated that the RPTRF is very fast in discovering all perfect tandem repeats of all categories in any genomic sequence. Although the detection of imperfect TRs is not the focus of the RPTRF, comparisons show that it even outperforms some other tools (in five selected gold standards) designed explicitly for this purpose. The implemented tool and how to use it are available on GitHub.
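To make the notion of a perfect tandem repeat concrete, here is a minimal Python sketch of a naive scan. It is not the RPTRF algorithm: it uses neither the input indexing nor the interval-tree filtering described in the abstract, and the function names, parameters and thresholds are assumptions chosen for illustration only.

```python
def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    return motif not in (motif + motif)[1:-1]


def find_perfect_tandem_repeats(seq, max_period=6, min_copies=2):
    """Naive scan for perfect tandem repeats (illustrative, not RPTRF).

    Returns a sorted list of (start, period, copies, motif) tuples.
    """
    results = []
    n = len(seq)
    for period in range(1, max_period + 1):
        i = 0
        while i + 2 * period <= n:
            # Extend the run while each base equals the base one period earlier.
            j = i + period
            while j < n and seq[j] == seq[j - period]:
                j += 1
            copies = (j - i) // period
            motif = seq[i:i + period]
            # Skip motifs such as "TT", which are already reported at period 1.
            if copies >= min_copies and is_primitive(motif):
                results.append((i, period, copies, motif))
            # Starts inside the current run cannot give a longer run.
            i = max(i + 1, j - period + 1)
    return sorted(results)


print(find_perfect_tandem_repeats("GGACACACTTTT", max_period=3, min_copies=3))
```

The scan compares each base with the base one period earlier, so its cost grows with both sequence length and the maximum period. That kind of repeated comparison is what an indexed approach aims to reduce.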
Affiliation(s)
- Reza Behboudi
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Mostafa Nouri-Baygi
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Mahmoud Naghibzadeh
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
|
3
|
Kharnaior P, Tamang JP. Metagenomic-Metabolomic Mining of Kinema, a Naturally Fermented Soybean Food of the Eastern Himalayas. Front Microbiol 2022; 13:868383. [PMID: 35572705 PMCID: PMC9106393 DOI: 10.3389/fmicb.2022.868383] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 02/02/2022] [Accepted: 02/24/2022] [Indexed: 12/18/2022] Open
Abstract
Kinema is a popular sticky fermented soybean food of the Eastern Himalayan regions of North East India, east Nepal, and south Bhutan. We hypothesized that some dominant bacteria in kinema may contribute to the formation of targeted and non-targeted metabolites for health benefits; hence, we studied the microbiome-metabolite mining of kinema. A total of 1,394,094,912 bp with an average of 464,698,304 ± 120,720,392 bp was generated from the kinema metagenome, which resulted in the identification of 47 phyla, 331 families, 709 genera, and 1,560 species. Bacteria (97.78%) were the most abundant domain; the remaining domains were viruses, eukaryotes, and archaea. Firmicutes (93.36%) was the most abundant phylum with 280 species of Bacillus, among which Bacillus subtilis was the most dominant species in kinema followed by B. glycinifermentans, B. cereus, B. licheniformis, B. thermoamylovorans, B. coagulans, B. circulans, B. paralicheniformis, and Brevibacillus borstelensis. Predictive metabolic pathways revealed the abundance of genes associated with metabolism (60.66%), resulting in 216 sub-pathways. A total of 361 metabolites were identified by metabolomic analysis (liquid chromatography-mass spectrometry, LC-MS). The presence of metabolites, such as chrysin, swainsonine, and 3-hydroxy-L-kynurenine (anticancer activity) and benzimidazole (antimicrobial, anticancer, and anti-HIV activities), and compounds with immunomodulatory effects in kinema supports its therapeutic potential. The correlation between the abundant species of Bacillus and primary and secondary metabolites was constructed with a bivariate result. This study proves that Bacillus spp. contribute to the formation of many targeted and untargeted metabolites in kinema for health-promoting benefits.
Affiliation(s)
- Jyoti Prakash Tamang
- Department of Microbiology, School of Life Sciences, Sikkim University, Gangtok, India
|
4
|
Tamang JP, Kharnaior P, Pariyar P, Thapa N, Lar N, Win KS, Mar A, Nyo N. Shotgun sequence-based metataxonomic and predictive functional profiles of Pe poke, a naturally fermented soybean food of Myanmar. PLoS One 2021; 16:e0260777. [PMID: 34919575 PMCID: PMC8682898 DOI: 10.1371/journal.pone.0260777] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/30/2021] [Accepted: 11/09/2021] [Indexed: 11/19/2022] Open
Abstract
Pe poke is a naturally fermented sticky soybean food of Myanmar. The present study aimed to profile the whole microbial community structure and predictive gene functionality of pe poke samples prepared with different fermentation periods, viz. 3 days (3ds), 4 days (4ds), 5 days (5ds), and a sun-dried sample (Sds). The pH of the samples was 7.6 to 8.7, the microbial load was 2.1-3.9 × 10^8 cfu/g, and the dynamic viscosity was 4.0±1.0 to 8.0±1.0 cP. The metataxonomic profile of pe poke samples showed different domains, viz. bacteria (99.08%), viruses (0.65%), eukaryota (0.08%), archaea (0.03%) and unclassified sequences (0.16%). Firmicutes (63.78%) was the most abundant phylum followed by Proteobacteria (29.54%) and Bacteroidetes (5.44%). Bacillus thermoamylovorans was significantly abundant in 3ds and 4ds (p<0.05); Ignatzschineria larvae was significantly abundant in 5ds (p<0.05), whereas Bacillus subtilis was significantly abundant in Sds (p<0.05). A total of 172 species of Bacillus were detected. In minor abundance, bacteriophages, archaea, and eukaryotes were also detected. Alpha diversity analysis showed the highest Simpson's diversity index in Sds compared with the other samples. Similarly, the non-parametric Shannon's diversity index was also highest in Sds. Good's coverage of 0.99 was observed in all samples. Beta diversity analysis using PCoA showed no significant clustering. Several species were shared between samples and many species were unique to each sample. In the KEGG database, a total of 33 super-pathways and 173 metabolic sub-pathways were annotated from the metagenomic open reading frames. Predictive functional features of the pe poke metagenome revealed genes for the synthesis and metabolism of a wide range of bioactive compounds including various essential amino acids, different vitamins, and enzymes. Spearman's correlation was inferred between the abundant species and functional features.
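For readers unfamiliar with the alpha diversity measures named above, the following Python sketch computes them from a list of per-species read counts using their textbook definitions. It is an illustration only: the counts are invented, the function name is hypothetical, and the Shannon value here is the plain plug-in estimate, not the non-parametric estimator the study reports.

```python
import math


def diversity_summary(counts):
    """Textbook alpha diversity measures from per-species counts (illustrative)."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    props = [c / total for c in counts]
    # Simpson's diversity index: probability that two random reads differ in species.
    simpson = 1.0 - sum(p * p for p in props)
    # Shannon index (plug-in estimate, natural logarithm).
    shannon = -sum(p * math.log(p) for p in props)
    # Good's coverage: 1 minus the fraction of reads from species seen exactly once.
    singletons = sum(1 for c in counts if c == 1)
    goods_coverage = 1.0 - singletons / total
    return {"simpson": simpson, "shannon": shannon, "goods_coverage": goods_coverage}


# Invented counts for four species in one sample.
print(diversity_summary([50, 30, 19, 1]))
```

In this toy sample one read in a hundred comes from a singleton species, so Good's coverage is 0.99, meaning that about 99% of the reads belong to species observed more than once.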
Affiliation(s)
- Jyoti Prakash Tamang
- Department of Microbiology, DAICENTER (DBT-AIST International Centre for Translational and Environmental Research) and Bioinformatics Centre, School of Life Sciences, Sikkim University, Gangtok, Sikkim, India
- Pynhunlang Kharnaior
- Department of Microbiology, DAICENTER (DBT-AIST International Centre for Translational and Environmental Research) and Bioinformatics Centre, School of Life Sciences, Sikkim University, Gangtok, Sikkim, India
- Priyambada Pariyar
- Department of Microbiology, DAICENTER (DBT-AIST International Centre for Translational and Environmental Research) and Bioinformatics Centre, School of Life Sciences, Sikkim University, Gangtok, Sikkim, India
- Namrata Thapa
- Department of Zoology, Biotech Hub, Nar Bahadur Bhandari Degree College, Sikkim University, Tadong, Sikkim, India
- Ni Lar
- Department of Industrial Chemistry, University of Mandalay, Mandalay, Myanmar
- Khin Si Win
- Department of Industrial Chemistry, University of Mandalay, Mandalay, Myanmar
- Ae Mar
- Department of Industrial Chemistry, University of Mandalay, Mandalay, Myanmar
- Nyo Nyo
- Department of Geography, University of Mandalay, Mandalay, Myanmar
|
5
|
Genovese LM, Mosca MM, Pellegrini M, Geraci F. Dot2dot: accurate whole-genome tandem repeats discovery. Bioinformatics 2019; 35:914-922. [PMID: 30165507 PMCID: PMC6419916 DOI: 10.1093/bioinformatics/bty747] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 03/30/2018] [Revised: 08/03/2018] [Accepted: 08/24/2018] [Indexed: 01/18/2023] Open
Abstract
MOTIVATION Large-scale sequencing projects have confirmed the hypothesis that eukaryotic DNA is rich in repetitions whose functional role needs to be elucidated. In particular, tandem repeats (TRs) (i.e. short, almost identical sequences that lie adjacent to each other) have been associated with many cellular processes and, indeed, are also involved in several genetic disorders. The need for comprehensive lists of TRs for association studies and the absence of a computational model able to capture their variability have revived research on discovery algorithms. RESULTS Building upon the idea that sequence similarities can be easily displayed using graphical methods, we formalized the structure that TRs induce in dot-plot matrices where a sequence is compared with itself. Leveraging the observation that a compact representation of these matrices can be built and searched in linear time, we developed Dot2dot: an accurate algorithm fast enough to be suitable for whole-genome discovery of TRs. Experiments on five manually curated collections of TRs have shown that Dot2dot is more accurate than other established methods, and completes the analysis of the biggest known reference genome in about one day on a standard PC. AVAILABILITY AND IMPLEMENTATION Source code and datasets are freely available upon paper acceptance at the URL: https://github.com/Gege7177/Dot2dot. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Marco M Mosca
- Department of Computer Science, University of Liverpool, Liverpool, UK
- Marco Pellegrini
- Institute for Informatics and Telematics, CNR, Pisa, Italy; Laboratory of Integrative Systems Medicine (LISM), Institute of Informatics and Telematics and Institute of Clinical Physiology, Pisa, Italy
- Filippo Geraci
- Institute for Informatics and Telematics, CNR, Pisa, Italy
|
6
|
Shamanskiy VA, Timonina VN, Popadin KY, Gunbin KV. ImtRDB: a database and software for mitochondrial imperfect interspersed repeats annotation. BMC Genomics 2019; 20:295. [PMID: 31284879 PMCID: PMC6614062 DOI: 10.1186/s12864-019-5536-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Mitochondria are the powerhouses of eukaryotic cells and carry their own circular DNA (mtDNA) encoding various RNAs and proteins. Somatic perturbations of mtDNA accumulate with age; thus it is of great importance to uncover the main sources of mtDNA instability. Recent analyses demonstrated that somatic mtDNA deletions depend on imperfect repeats of various kinds between distant mtDNA segments. However, until now there have been no comprehensive databases annotating all types of imperfect repeats in the many species with sequenced complete mitochondrial genomes, nor algorithms capable of calling all types of imperfect repeats in circular mtDNA. RESULTS We implemented a naïve pattern-recognition algorithm, analogous to standard dot-plot construction procedures, that finds both perfect and imperfect repeats of four main types: direct, inverted, mirror and complementary. Our algorithm is adapted to specific characteristics of mtDNA such as circularity and an excess of short repeats: it calls imperfect repeats starting from a length of 10 bp. We constructed an interactive web-available database, ImtRDB, storing the positions of perfect and imperfect repeats in the mtDNAs of more than 3500 vertebrate species. Additional tools, such as visualization of repeats within a genome, comparison of repeat densities among different genomes and the option to download all results, make this database useful for many biologists. Our first analyses of the database demonstrated that: (i) mtDNA imperfect repeats are usually short; (ii) they are associated with unfolded DNA structures; (iii) the four types of repeats positively correlate with each other, forming two equivalent pairs (direct and mirror versus inverted and complementary) with identical nucleotide content and similar distribution between species; (iv) the abundance of repeats is negatively associated with GC content; (v) the dinucleotides GC versus CG are overrepresented on the light chain of mtDNA covered by repeats.
CONCLUSIONS ImtRDB is available at http://bioinfodbs.kantiana.ru/ImtRDB/. It is accompanied by software that calls all types of interspersed repeats with different levels of degeneracy in circular DNA. This database and software can become a very useful tool in various areas of mitochondrial and chloroplast DNA research.
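The four repeat types named in the abstract can be illustrated with a short Python sketch. It assumes the usual definitions (direct: identical; mirror: reversed; complementary: base-wise complement; inverted: reverse complement), handles only perfect matches, and is not the ImtRDB software. The helper names are hypothetical.

```python
# Base-wise complement table for uppercase DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def circular_slice(genome, start, length):
    """Read `length` bases from a circular genome, wrapping past the end."""
    n = len(genome)
    return "".join(genome[(start + k) % n] for k in range(length))


def repeat_types(a, b):
    """Return which perfect repeat relations hold between segments a and b."""
    complement = a.translate(COMPLEMENT)
    relations = {
        "direct": a,
        "mirror": a[::-1],
        "complementary": complement,
        "inverted": complement[::-1],
    }
    return [name for name, expected in relations.items() if b == expected]


print(repeat_types("AACG", "CGTT"))        # inverted (reverse complement)
print(circular_slice("ACGTAC", 4, 4))      # wraps around the origin
```

Wrapping indices with the modulo operator is one simple way to respect circularity, so that a repeat arm spanning the origin of the sequence record is not missed.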
Affiliation(s)
- Viktor A Shamanskiy
- Center for Mitochondrial Functional Genomics, School of Life Science, Immanuel Kant Baltic Federal University, Kaliningrad, Russia
- Valeria N Timonina
- Center for Mitochondrial Functional Genomics, School of Life Science, Immanuel Kant Baltic Federal University, Kaliningrad, Russia
- Konstantin Yu Popadin
- Center for Mitochondrial Functional Genomics, School of Life Science, Immanuel Kant Baltic Federal University, Kaliningrad, Russia; Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Konstantin V Gunbin
- Center for Mitochondrial Functional Genomics, School of Life Science, Immanuel Kant Baltic Federal University, Kaliningrad, Russia; Center of Brain Neurobiology and Neurogenetics, Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
|
7
|
Database of Periodic DNA Regions in Major Genomes. BIOMED RESEARCH INTERNATIONAL 2017; 2017:7949287. [PMID: 28182099 PMCID: PMC5274682 DOI: 10.1155/2017/7949287] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 08/09/2016] [Revised: 12/07/2016] [Accepted: 12/21/2016] [Indexed: 12/11/2022]
Abstract
Summary. We searched several prokaryotic and eukaryotic genomes for periodic sequences using a new mathematical method. The method uses random position weight matrices and dynamic programming. Insertions and deletions were allowed inside periodicities, which is a novel aspect of the results we obtained. The period length, one of the key features of a periodicity, varied from 2 to 50 nt. In total, over 60,000 periodic sequences were found in 15 genomes, including some chromosomes of the H. sapiens (partial), C. elegans, D. melanogaster, and A. thaliana genomes.
|
8
|
Fertin G, Jean G, Radulescu A, Rusu I. Hybrid de novo tandem repeat detection using short and long reads. BMC Med Genomics 2015; 8 Suppl 3:S5. [PMID: 26399998 PMCID: PMC4582210 DOI: 10.1186/1755-8794-8-s3-s5] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 12/31/2022] Open
Abstract
Background As one of the most studied genome rearrangements, tandem repeats have a considerable impact on genetic backgrounds of inherited diseases. Many methods designed for tandem repeat detection on reference sequences obtain high-quality results. However, in a de novo context, where no reference sequence is available, tandem repeat detection remains a difficult problem. The short reads obtained with second-generation sequencing methods are not long enough to span regions that contain long repeats. This length limitation was tackled by the long reads obtained with third-generation sequencing platforms such as Pacific Biosciences technologies. Nevertheless, the gain in read length came with a significant increase in the error rate. The main objective of current studies on long reads is to handle error rates of up to 16%. Methods In this paper we present MixTaR, the first de novo method for tandem repeat detection that combines the high quality of short reads and the large length of long reads. Our hybrid algorithm uses the set of short reads for tandem repeat pattern detection based on a de Bruijn graph. These patterns are then validated using the long reads, and the tandem repeat sequences are constructed using local greedy assemblies. Results MixTaR is tested with both simulated and real reads from complex organisms. For a complete analysis of its robustness to errors, we use short and long reads with different error rates. The results are then analysed in terms of the number of tandem repeats detected and the length of their patterns. Conclusions Our method shows high precision and sensitivity. With low false positive rates even for highly erroneous reads, MixTaR is able to detect accurate tandem repeats with pattern lengths varying within a significant interval.
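A small Python sketch can show why a de Bruijn graph built from short reads is a natural place to look for tandem repeat patterns: a repeated motif makes the walk through the graph revisit a node, which creates a cycle. This is only an illustration of that idea under simplifying assumptions (error-free reads, a single strand, a hypothetical choice of k). It is not the MixTaR algorithm and omits the long-read validation and greedy assembly steps.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def has_cycle(graph):
    """Depth-first search for a directed cycle (recursive; fine for small graphs)."""
    in_progress, finished = set(), set()

    def visit(node):
        in_progress.add(node)
        for nxt in graph.get(node, ()):
            if nxt in in_progress:
                return True
            if nxt not in finished and visit(nxt):
                return True
        in_progress.discard(node)
        finished.add(node)
        return False

    return any(node not in finished and visit(node) for node in list(graph))


repeat_reads = ["GGACTACTACTTT"]   # contains ACT repeated in tandem
plain_reads = ["GGACTTCA"]         # no repeated 3-mer
print(has_cycle(build_de_bruijn(repeat_reads, 4)))
print(has_cycle(build_de_bruijn(plain_reads, 4)))
```

In real data, cycles also arise from interspersed repeats and sequencing errors, which is one reason a separate validation step on longer reads is useful.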
|
9
|
Brooks JL, Jefferson KK. Phase variation of poly-N-acetylglucosamine expression in Staphylococcus aureus. PLoS Pathog 2014; 10:e1004292. [PMID: 25077798 PMCID: PMC4117637 DOI: 10.1371/journal.ppat.1004292] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.9] [Received: 09/30/2013] [Accepted: 06/23/2014] [Indexed: 11/18/2022] Open
Abstract
Polysaccharide intercellular adhesin (PIA), also known as poly-N-acetyl-β-(1–6)-glucosamine (PIA/PNAG) is an important component of Staphylococcus aureus biofilms and also contributes to resistance to phagocytosis. The proteins IcaA, IcaD, IcaB, and IcaC are encoded within the intercellular adhesin (ica) operon and synthesize PIA/PNAG. We discovered a mechanism of phase variation in PIA/PNAG expression that appears to involve slipped-strand mispairing. The process is reversible and RecA-independent, and involves the expansion and contraction of a simple tetranucleotide tandem repeat within icaC. Inactivation of IcaC results in a PIA/PNAG-negative phenotype. A PIA/PNAG-hyperproducing strain gained a fitness advantage in vitro following the icaC mutation and loss of PIA/PNAG production. The mutation was also detected in two clinical isolates, suggesting that under certain conditions, loss of PIA/PNAG production may be advantageous during infection. There was also a survival advantage for an icaC-negative strain harboring intact icaADB genes relative to an isogenic icaADBC deletion mutant. Together, these results suggest that inactivation of icaC is a mode of phase variation for PIA/PNAG expression, that high-level production of PIA/PNAG carries a fitness cost, and that icaADB may contribute to bacterial fitness, by an unknown mechanism, in the absence of an intact icaC gene and PIA/PNAG production.
Staphylococcal polysaccharide intercellular adhesin (PIA), also known as β-1-6-linked N-acetylglucosamine (PNAG) plays a role in immune evasion and biofilm formation. Evidence suggests that under certain circumstances PIA/PNAG production is beneficial, whereas at times, it may be advantageous for the bacteria to turn production off. In S. epidermidis, PIA/PNAG can be switched off when an insertion sequence recombines into the intercellular adhesin locus (ica). In this study, we have found a short tandem repeat sequence in the ica locus of S. aureus that can undergo expansion and contraction. The addition or subtraction of non-multiples of three of this repeat shifts the reading frame of the icaC gene, resulting in the complete loss of PIA/PNAG production. We hypothesize that certain conditions that make the PIA/PNAG-negative phenotype advantageous during infection, such as the development of an effective immune response to PIA/PNAG on the bacterial surface, would select for repeat mutants. In support of this hypothesis, we found clinical isolates with expansion and deletion of the repeat. These findings reveal a new on-off switch for the expression of PIA/PNAG.
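The reading-frame argument is simple modular arithmetic, sketched below in Python. Each copy of a tetranucleotide repeat adds or removes 4 nt, so the downstream frame is preserved only when the change in copy number is a multiple of three. The function name is hypothetical and the numbers are illustrative, not measurements from the study.

```python
def frame_offset(unit_length, delta_copies):
    """Downstream reading-frame offset (0, 1 or 2) after a copy-number change."""
    return (unit_length * delta_copies) % 3


# Tetranucleotide repeat: gaining or losing 1 or 2 copies shifts the frame,
# while a change of 3 copies (12 nt) leaves it intact.
for delta in (-2, -1, 1, 2, 3):
    print(delta, frame_offset(4, delta))
```

An offset of 0 means the frame is intact; any other value means every downstream codon is misread, which is consistent with the loss of function described for icaC.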
Affiliation(s)
- Jamie L. Brooks
- Department of Microbiology and Immunology, Virginia Commonwealth University School of Medicine, Richmond, Virginia, United States of America
- Kimberly K. Jefferson
- Department of Microbiology and Immunology, Virginia Commonwealth University School of Medicine, Richmond, Virginia, United States of America
|
10
|
Girgis HZ, Sheetlin SL. MsDetector: toward a standard computational tool for DNA microsatellites detection. Nucleic Acids Res 2013; 41:e22. [PMID: 23034809 PMCID: PMC3592430 DOI: 10.1093/nar/gks881] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 03/01/2012] [Revised: 08/29/2012] [Accepted: 08/30/2012] [Indexed: 11/12/2022] Open
Abstract
Microsatellites (MSs) are DNA regions consisting of repeated short motif(s). MSs are linked to several diseases and have important biomedical applications. Thus, researchers have developed several computational tools to detect MSs. However, the currently available tools require adjusting many parameters, or depend on a list of motifs or on a library of known MSs. Therefore, two laboratories analyzing the same sequence with the same computational tool may obtain different results due to the user-adjustable parameters. Recent studies have indicated the need for a standard computational tool for detecting MSs. To this end, we applied machine-learning algorithms to develop a tool called MsDetector. The system is based on a hidden Markov model and a general linear model. The user is not obligated to optimize the parameters of MsDetector. Neither a list of motifs nor a library of known MSs is required. MsDetector is memory- and time-efficient. We applied MsDetector to several species. MsDetector located the majority of MSs found by other widely used tools. In addition, MsDetector identified novel MSs. Furthermore, the system has a very low false-positive rate resulting in a precision of up to 99%. MsDetector is expected to produce consistent results across studies analyzing the same sequence.
Affiliation(s)
- Sergey L. Sheetlin
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 9600 Rockville Pike, Bethesda, MD 20896, USA
|
11
|
McKinnon C, Drouin G. Chromatin diminution in the copepod Mesocyclops edax: elimination of both highly repetitive and nonhighly repetitive DNA. Genome 2013; 56:1-8. [PMID: 23379333 DOI: 10.1139/gen-2012-0097] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 11/22/2022]
Abstract
Chromatin diminution, a developmentally regulated process of DNA elimination, is found in numerous eukaryotic species. In the copepod Mesocyclops edax, some 90% of its genomic DNA is eliminated during the differentiation of embryonic cells into somatic cells. Previous studies have shown that the eliminated DNA contains highly repetitive sequences. Here, we sequenced DNA fragments from pre- and postdiminution cells to determine whether nonhighly repetitive sequences are also eliminated during the process of chromatin diminution. Comparative analyses of these sequences, as well as the sequences eliminated from the genome of the copepod Cyclops kolensis, show that they all share similar abundances of tandem repeats, dispersed repeats, transposable elements, and various coding and noncoding sequences. This suggests that, in the chromatin diminution observed in M. edax, both highly repetitive and nonhighly repetitive sequences are eliminated and that there is no bias in the type of nonhighly repetitive DNA being eliminated.
Affiliation(s)
- Christian McKinnon
- Département de biologie et Centre de recherche avancée en génomique environnementale, Université d'Ottawa, Ottawa, ON K1N 6N5, Canada
|
12
|
To detect and analyze sequence repeats whatever be their origin. Methods Mol Biol 2012; 859:69-90. [PMID: 22367866 DOI: 10.1007/978-1-61779-603-6_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/11/2023]
Abstract
The development of numerous programs for the identification of mobile elements raises the issue of the founding concepts that are shared in their design. This is necessary for at least three reasons. First, the cost of designing, developing, debugging, and maintaining software could distract biologists from their main bioanalysis tasks, which already require a lot of energy. Some key concepts on exact repeats always underlie the search for genomic repeats, and we recall the most important ones. Throughout the chapter, we try to select practical tools that may help the design of new identification pipelines. Second, the huge increase in sequence production capacity requires the use of the most efficient data structures and algorithms to scale up tools in the face of the data deluge. This paper provides an up-to-date glimpse of the art of string indexing and string matching. Third, there is growing knowledge of the architecture of mobile elements, built from the literature and from the analysis of results generated by these pipelines. Besides data management, which has led to the discovery of new families or new elements of a family, the community has an increasing need for knowledge management tools in order to compare, validate, or simply keep track of mobile element models. We end the paper with first considerations on what could help the near future of such research on models.
|