Reference Citation Analysis

For: DeSantis TZ, Keller K, Karaoz U, Alekseyenko AV, Singh NNS, Brodie EL, Pei Z, Andersen GL, Larsen N. Simrank: Rapid and sensitive general-purpose k-mer search tool. BMC Ecol 2011;11:11. [PMID: 21524302 PMCID: PMC3097142 DOI: 10.1186/1472-6785-11-11] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 07/09/2010] [Accepted: 04/27/2011] [Indexed: 02/01/2023] Open

Cited by Other Article(s)

Konopka T, Ng S, Smedley D. Diffusion enables integration of heterogeneous data and user-driven learning in a desktop knowledge-base. PLoS Comput Biol 2021;17:e1009283. [PMID: 34379637 PMCID: PMC8382188 DOI: 10.1371/journal.pcbi.1009283] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/24/2020] [Revised: 08/23/2021] [Accepted: 07/16/2021] [Indexed: 11/20/2022] Open

A Computational Bipartite Graph-Based Drug Repurposing Method. Methods Mol Biol 2019;1903:115-127. [PMID: 30547439 DOI: 10.1007/978-1-4939-8955-3_7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/16/2022]

Ames NJ, Barb JJ, Ranucci A, Kim H, Mudra SE, Cashion AK, Townsley DM, Childs R, Paster BJ, Faller LL, Wallen GR. The oral microbiome of patients undergoing treatment for severe aplastic anemia: a pilot study. Ann Hematol 2019;98:1351-1365. [PMID: 30919073 DOI: 10.1007/s00277-019-03599-w] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 05/01/2018] [Accepted: 01/07/2019] [Indexed: 12/11/2022]

Abstract

The microbiome, an intriguing component of the human body composed of trillions of microorganisms, has prompted scientific exploration to identify and understand its function and role in health and disease. As associations between microbiome composition, disease, and symptoms accumulate, the future of medicine hinges upon a comprehensive knowledge of these microorganisms for patient care. The oral microbiome may provide valuable and efficient insight for predicting future changes in disease status, infection, or treatment course. The main aim of this pilot study was to characterize the oral microbiome in patients with severe aplastic anemia (SAA) during their therapeutic course. SAA is a hematologic disease characterized by bone marrow failure, which is fatal if untreated. Treatment includes either hematopoietic stem cell transplantation (HSCT) or immunosuppressive therapy (IST). In this study, we examined the oral microbiome composition of 24 patients admitted to the National Institutes of Health (NIH) Clinical Center for experimental SAA treatment. Tongue brushings were collected to assess the effects of treatment on the oral microbiome. Twenty patients received standard IST (equine antithymocyte globulin and cyclosporine) plus eltrombopag. Four patients underwent HSCT. Oral specimens were obtained at three time points during treatment and clinical follow-up. Using a novel approach to 16S rRNA gene sequence analysis encompassing seven hypervariable regions, results demonstrated a predictable decrease in microbial diversity over time among the transplant patients. Linear discriminant analysis effect size (LEfSe) reported a total of 14 statistically significant taxa (p < 0.05) across time points in the HSCT patients. One-way plots of relative abundance for two bacterial species (Haemophilus parainfluenzae and Rothia mucilaginosa) in the HSCT group show the differences in abundance between time points. Only one bacterial species (Prevotella histicola) was noted in the IST group, with a p value of 0.065. The patients receiving immunosuppressive therapy did not exhibit a clear change in diversity over time; however, patient-specific changes were noted. In addition, we compared our findings to tongue dorsum samples from healthy participants in the Human Microbiome Project (HMP) database and found that, among HSCT patients, approximately 35% of bacterial identifiers (N = 229) were unique to this study population and were not present in tongue dorsum specimens obtained from the HMP. Among IST-treated patients, 45% (N = 351) were unique to these patients and not identified by the HMP. Although antibiotic use likely influenced bacterial composition and diversity, some literature suggests a decreased impact of antimicrobials on the oral microbiome as compared to their effect on the gut microbiome. Future studies with larger sample sizes that focus on the oral microbiome and the effects of antibiotics in an immunosuppressed patient population may help establish these potential associations.

Adamberg K, Kolk K, Jaagura M, Vilu R, Adamberg S. The composition and metabolism of faecal microbiota is specifically modulated by different dietary polysaccharides and mucin: an isothermal microcalorimetry study. Benef Microbes 2018;9:21-34. [DOI: 10.3920/bm2016.0198] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 02/06/2023]

Abstract

The metabolic activity of colon microbiota is specifically affected by fibres with various monomer compositions, degree of polymerisation and branching. The supply of a variety of dietary fibres assures the diversity of gut microbial communities considered important for the well-being of the host. The aim of this study was to compare the impact of different oligo- and polysaccharides (galacto- and fructooligosaccharides, resistant starch, levan, inulin, arabinogalactan, xylan, pectin and chitin), and a glycoprotein mucin on the growth and metabolism of faecal microbiota in vitro by using isothermal microcalorimetry (IMC). Faecal samples from healthy donors were incubated in a phosphate-buffered defined medium with or without supplementation of a single substrate. The generation of heat was followed on-line; microbiota composition (V3-V4 region of the 16S rRNA using Illumina MiSeq v2) and concentrations of metabolites (HPLC) were determined at the end of growth. The multiauxic power-time curves obtained were substrate-specific. More than 70% of all substrates except chitin were fermented by faecal microbiota with total heat generation of up to 8 J/ml. The final metabolite patterns were in accordance with the microbiota changes. For arabinogalactan, xylan and levan, the fibre-affected distribution of bacterial taxa showed clear similarities (e.g. increase of Bacteroides ovatus and decrease of Bifidobacterium adolescentis). The formation of propionic acid, an important colon metabolite, was enhanced by arabinogalactan, xylan and mucin but not by galacto- and fructooligosaccharides or inulin. Mucin fermentation resulted in acetate, propionate and butyrate production in ratios previously observed for faecal samples, indicating that mucins may serve as major substrates for colon microbial population. IMC combined with analytical methods was shown to be an effective method for screening the impact of specific dietary fibres on functional changes in faecal microbiota.

Jünemann S, Kleinbölting N, Jaenicke S, Henke C, Hassa J, Nelkner J, Stolze Y, Albaum SP, Schlüter A, Goesmann A, Sczyrba A, Stoye J. Bioinformatics for NGS-based metagenomics and the application to biogas research. J Biotechnol 2017;261:10-23. [PMID: 28823476 DOI: 10.1016/j.jbiotec.2017.08.012] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 03/20/2017] [Revised: 08/08/2017] [Accepted: 08/09/2017] [Indexed: 12/19/2022]

Bogachev MI, Markelov OA, Kayumov AR, Bunde A. Superstatistical model of bacterial DNA architecture. Sci Rep 2017;7:43034. [PMID: 28225058 PMCID: PMC5320525 DOI: 10.1038/srep43034] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 09/08/2016] [Accepted: 01/18/2017] [Indexed: 12/15/2022] Open

Ferretti P, Farina S, Cristofolini M, Girolomoni G, Tett A, Segata N. Experimental metagenomics and ribosomal profiling of the human skin microbiome. Exp Dermatol 2017;26:211-219. [DOI: 10.1111/exd.13210] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Accepted: 09/06/2016] [Indexed: 02/06/2023]

La Rosa M, Fiannaca A, Rizzo R, Urso A. Probabilistic topic modeling for the analysis and classification of genomic sequences. BMC Bioinformatics 2015;16 Suppl 6:S2. [PMID: 25916734 PMCID: PMC4416183 DOI: 10.1186/1471-2105-16-s6-s2] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Indexed: 11/25/2022] Open

Abstract

Background

Studies on genomic sequences for classification and taxonomic identification have a leading role in the biomedical field and in the analysis of biodiversity. These studies focus on the so-called barcode genes, which represent a well-defined region of the whole genome. Recently, alignment-free techniques have been gaining importance because they are able to overcome the drawbacks of sequence alignment techniques. In this paper a new alignment-free method for DNA sequence clustering and classification is proposed. The method is based on k-mer representation and text mining techniques.

Methods

The presented method is based on Probabilistic Topic Modeling, a statistical technique originally proposed for text documents. Probabilistic topic models are able to find in a document corpus the topics (recurrent themes) characterizing classes of documents. This technique, applied to DNA sequences representing the documents, exploits the frequency of fixed-length k-mers and builds a generative model for a training group of sequences. This generative model, obtained through the Latent Dirichlet Allocation (LDA) algorithm, is then used to classify a large set of genomic sequences.
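
The first step described above, treating each DNA sequence as a document whose words are fixed-length k-mers, can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `kmer_counts` and `to_document` and the default `k=8` are assumptions made here, since the abstract does not state the k-mer length. The LDA training step itself is not shown.

```python
from collections import Counter


def kmer_counts(sequence: str, k: int = 8) -> Counter:
    """Count overlapping k-mers in a sequence (k is a hypothetical default)."""
    seq = sequence.upper()
    # Slide a window of width k across the sequence, one base at a time.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def to_document(sequence: str, k: int = 8) -> str:
    """Render a sequence as space-separated k-mer 'words' for a text topic model."""
    seq = sequence.upper()
    return " ".join(seq[i:i + k] for i in range(len(seq) - k + 1))


if __name__ == "__main__":
    print(kmer_counts("ACGTACGT", k=4))
    print(to_document("ACGTA", k=3))
```

The resulting word strings or count vectors are the kind of input that a general-purpose topic-modeling library would accept as a corpus.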

Results and conclusions

We performed classification of over 7000 16S DNA barcode sequences taken from the Ribosomal Database Project (RDP) repository, training probabilistic topic models. The proposed method is compared to the RDP tool and the Support Vector Machine (SVM) classification algorithm in an extensive set of trials using both complete sequences and short sequence snippets (from 400 bp to 25 bp). Our method reaches very similar results to the RDP classifier and SVM for complete sequences. The most interesting results are obtained when short sequence snippets are considered. In these conditions the proposed method outperforms RDP and SVM with ultra-short sequences, and it exhibits a smooth decrease of performance, at every taxonomic level, when the sequence length is decreased.

Wall K, Cornell J, Bizzoco RW, Kelley ST. Biodiversity hot spot on a hot spot: novel extremophile diversity in Hawaiian fumaroles. Microbiologyopen 2015;4:267-281. [PMID: 25565172 PMCID: PMC4398508 DOI: 10.1002/mbo3.236] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 08/11/2014] [Revised: 11/18/2014] [Accepted: 11/24/2014] [Indexed: 02/01/2023] Open

Evaluation of whole genome sequencing for outbreak detection of Salmonella enterica. PLoS One 2014;9:e87991. [PMID: 24505344 PMCID: PMC3913712 DOI: 10.1371/journal.pone.0087991] [Citation(s) in RCA: 186] [Impact Index Per Article: 18.6] [Received: 10/21/2013] [Accepted: 01/02/2014] [Indexed: 11/19/2022] Open

Abstract

Salmonella enterica is a common cause of minor and large foodborne outbreaks. To achieve successful and nearly ‘real-time’ monitoring and identification of outbreaks, reliable sub-typing is essential. Whole genome sequencing (WGS) shows great promise for use as a routine epidemiological typing tool. Here we evaluate WGS for typing of S. Typhimurium, including different approaches for analyzing and comparing the data. A collection of 34 S. Typhimurium isolates was sequenced. This consisted of 18 isolates from six outbreaks and 16 epidemiologically unrelated background strains. In addition, 8 S. Enteritidis and 5 S. Derby were also sequenced and used for comparison. A number of different bioinformatics approaches were applied to the data, including pan-genome tree, k-mer tree, nucleotide difference tree and SNP tree. The outcome of each approach was evaluated in relation to the association of the isolates to specific outbreaks. The pan-genome tree clustered 65% of the S. Typhimurium isolates according to the pre-defined epidemiology, the k-mer tree 88%, the nucleotide difference tree 100% and the SNP tree 100% of the strains within S. Typhimurium. The outcomes of the four phylogenetic analyses were also compared to PFGE, revealing that WGS typing achieved greater performance than the traditional method. In conclusion, for S. Typhimurium, SNP analysis and the nucleotide difference approach of WGS data seem to be the superior methods for epidemiological typing compared to other phylogenetic analytic approaches that may be used on WGS. These approaches were also superior to the more classical typing method, PFGE. Our study also indicates that WGS alone is insufficient to determine whether strains are related or unrelated to outbreaks. This still requires the combination of epidemiological data and whole genome sequencing results.

Argimón S, Konganti K, Chen H, Alekseyenko AV, Brown S, Caufield PW. Comparative genomics of oral isolates of Streptococcus mutans by in silico genome subtraction does not reveal accessory DNA associated with severe early childhood caries. Infect Genet Evol 2013;21:269-78. [PMID: 24291226 DOI: 10.1016/j.meegid.2013.11.003] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 07/17/2013] [Revised: 11/07/2013] [Accepted: 11/08/2013] [Indexed: 11/29/2022]

Abstract

Comparative genomics is a popular method for the identification of microbial virulence determinants, especially since the sequencing of a large number of whole bacterial genomes from pathogenic and non-pathogenic strains has become relatively inexpensive. The bioinformatics pipelines for comparative genomics usually include gene prediction and annotation and can require significant computer power. To circumvent this, we developed a rapid method for genome-scale in silico subtractive hybridization, based on blastn and independent of feature identification and annotation. Whole genome comparisons by in silico genome subtraction were performed to identify genetic loci specific to Streptococcus mutans strains associated with severe early childhood caries (S-ECC), compared to strains isolated from caries-free (CF) children. The genome similarity of the 20 S. mutans strains included in this study, calculated by Simrank k-mer sharing, ranged from 79.5% to 90.9%, confirming this is a genetically heterogeneous group of strains. We identified strain-specific genetic elements in 19 strains, with sizes ranging from 200 to 39 kb. These elements contained protein-coding regions with functions mostly associated with mobile DNA. We did not, however, identify any genetic loci consistently associated with dental caries, i.e., shared by all the S-ECC strains and absent in the CF strains. Conversely, we did not identify any genetic loci specific to the healthy group. Comparison of previously published genomes from pathogenic and carriage strains of Neisseria meningitidis with our in silico genome subtraction yielded the same set of genes specific to the pathogenic strains, thus validating our method. Our results suggest that S. mutans strains derived from caries-active or caries-free dentitions cannot be differentiated based on the presence or absence of specific genetic elements. Our in silico genome subtraction method is available as the Microbial Genome Comparison (MGC) tool, with a user-friendly JAVA graphical interface.
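
The "k-mer sharing" similarity mentioned in this abstract can be illustrated with a small Python sketch. It is a hedged illustration only: the abstract does not give the formula, so the normalization used here (shared unique k-mers divided by the smaller of the two unique k-mer sets), the default `k=7`, and the function names are all assumptions and may differ from what Simrank actually computes.

```python
def unique_kmers(sequence: str, k: int = 7) -> set:
    """Return the set of distinct overlapping k-mers in a sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_sharing_percent(a: str, b: str, k: int = 7) -> float:
    """Percent of unique k-mers shared, relative to the smaller k-mer set.

    The choice of denominator is an assumption made for this sketch.
    """
    kmers_a, kmers_b = unique_kmers(a, k), unique_kmers(b, k)
    smaller = min(len(kmers_a), len(kmers_b))
    if smaller == 0:
        # At least one sequence is shorter than k, so no k-mers to compare.
        return 0.0
    return 100.0 * len(kmers_a & kmers_b) / smaller


if __name__ == "__main__":
    print(kmer_sharing_percent("ACGTA", "ACGTT", k=4))
```

Under this definition, two genomes scoring between 79.5% and 90.9% would share most but not all of their distinct k-mers.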

Hermann-Bank ML, Skovgaard K, Stockmarr A, Larsen N, Mølbak L. The Gut Microbiotassay: a high-throughput qPCR approach combinable with next generation sequencing to study gut microbial diversity. BMC Genomics 2013;14:788. [PMID: 24225361 PMCID: PMC3879714 DOI: 10.1186/1471-2164-14-788] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.3] [Received: 01/07/2013] [Accepted: 10/14/2013] [Indexed: 12/12/2022] Open

Nagar A, Hahsler M. Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment. BMC Bioinformatics 2013;14 Suppl 11:S2. [PMID: 24564200 PMCID: PMC3846703 DOI: 10.1186/1471-2105-14-s11-s2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/16/2022] Open

Abstract

Background

Next Generation Sequencing techniques are producing enormous amounts of biological sequence data, and their analysis has become a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and which requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so-called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences.
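
The idea of comparing sequence segments through p-mer frequency counts instead of alignment can be sketched as follows. This is a minimal sketch under stated assumptions, not the published method: the Manhattan (L1) distance between count profiles, the default `p=3`, and the names `pmer_profile` and `profile_distance` are choices made here for illustration, and the stream clustering step is not shown.

```python
from collections import Counter


def pmer_profile(segment: str, p: int = 3) -> Counter:
    """Count overlapping p-mers in one sequence segment (p is hypothetical)."""
    seg = segment.upper()
    return Counter(seg[i:i + p] for i in range(len(seg) - p + 1))


def profile_distance(a: str, b: str, p: int = 3) -> int:
    """Manhattan distance between two p-mer count profiles.

    Used here as a cheap, alignment-free stand-in for edit distance;
    the specific distance measure is an assumption of this sketch.
    """
    profile_a, profile_b = pmer_profile(a, p), pmer_profile(b, p)
    # A Counter returns 0 for missing keys, so the union covers both profiles.
    return sum(abs(profile_a[m] - profile_b[m])
               for m in profile_a.keys() | profile_b.keys())


if __name__ == "__main__":
    print(profile_distance("ACGT", "ACGA", p=3))
```

Because each profile is built in a single pass over the segment, the cost grows linearly with segment length, which is consistent with the linear run time the abstract reports.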

Results

In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly.

Conclusion

Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA.

Mizrahi-Man O, Davenport ER, Gilad Y. Taxonomic classification of bacterial 16S rRNA genes using short sequencing reads: evaluation of effective study designs. PLoS One 2013;8:e53608. [PMID: 23308262 PMCID: PMC3538547 DOI: 10.1371/journal.pone.0053608] [Citation(s) in RCA: 198] [Impact Index Per Article: 18.0] [Received: 08/14/2012] [Accepted: 12/03/2012] [Indexed: 01/12/2023] Open

Werner JJ, Koren O, Hugenholtz P, DeSantis TZ, Walters WA, Caporaso JG, Angenent LT, Knight R, Ley RE. Impact of training sets on classification of high-throughput bacterial 16s rRNA gene surveys. ISME J 2012;6:94-103. [PMID: 21716311 PMCID: PMC3217155 DOI: 10.1038/ismej.2011.82] [Citation(s) in RCA: 358] [Impact Index Per Article: 29.8] [Received: 03/29/2011] [Revised: 05/10/2011] [Accepted: 05/12/2011] [Indexed: 01/10/2023]

Abstract

Taxonomic classification of the thousands to millions of 16S rRNA gene sequences generated in microbiome studies is often achieved using a naïve Bayesian classifier (for example, the Ribosomal Database Project II (RDP) classifier), due to favorable trade-offs among automation, speed and accuracy. The resulting classification depends on the reference sequences and taxonomic hierarchy used to train the model; although the influence of primer sets and classification algorithms has been explored in detail, the influence of the training set has not been characterized. We compared classification results obtained using three different publicly available databases as training sets, applied to five different bacterial 16S rRNA gene pyrosequencing data sets (generated from human body, mouse gut, python gut, soil and anaerobic digester samples). We observed numerous advantages to using the largest, most diverse training set available, which we constructed from the Greengenes (GG) bacterial/archaeal 16S rRNA gene sequence database and the latest GG taxonomy. Phylogenetic clusters of previously unclassified experimental sequences were identified with notable improvements (for example, 50% reduction in reads unclassified at the phylum level in mouse gut, soil and anaerobic digester samples), especially for phylotypes belonging to specific phyla (Tenericutes, Chloroflexi, Synergistetes and Candidate phyla TM6, TM7). Trimming the reference sequences to the primer region resulted in systematic improvements in classification depth, with the greatest gains at higher confidence thresholds. Phylotypes unclassified at the genus level represented a greater proportion of the total community variation than classified operational taxonomic units in mouse gut and anaerobic digester samples, underscoring the need for greater diversity in existing reference databases.
