Reference Citation Analysis

For: Pandey P, Bender MA, Johnson R, Patro R. deBGR: an efficient and near-exact representation of the weighted de Bruijn graph. Bioinformatics 2017;33:i133-i141. [PMID: 28881995 PMCID: PMC5870571 DOI: 10.1093/bioinformatics/btx261] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 12/11/2022]

Cited by Other Article(s)

Xie P, Guo Y, Teng Y, Zhou W, Yu Y. GeneMiner: A tool for extracting phylogenetic markers from next-generation sequencing data. Mol Ecol Resour 2024;24:e13924. [PMID: 38197287 DOI: 10.1111/1755-0998.13924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2023] [Revised: 12/14/2023] [Accepted: 12/21/2023] [Indexed: 01/11/2024]

Pibiri GE. On weighted k-mer dictionaries. Algorithms Mol Biol 2023;18:3. [PMID: 37328897 DOI: 10.1186/s13015-023-00226-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Accepted: 05/13/2023] [Indexed: 06/18/2023]

Lu Y, Ge C, Cai B, Xu Q, Kong R, Chang S. Antibody sequences assembly method based on weighted de Bruijn graph. Math Biosci Eng 2023;20:6174-6190. [PMID: 37161102 DOI: 10.3934/mbe.2023266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]

Zhang Z, Xie P, Guo Y, Zhou W, Liu E, Yu Y. Easy353: A Tool to Get Angiosperms353 Genes for Phylogenomic Research. Mol Biol Evol 2022;39:6862883. [PMID: 36458838 PMCID: PMC9757696 DOI: 10.1093/molbev/msac261] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/01/2022] [Revised: 10/28/2022] [Accepted: 11/29/2022] [Indexed: 12/04/2022]

Almodaresi F, Khan J, Madaminov S, Ferdman M, Johnson R, Pandey P, Patro R. An incrementally updatable and scalable system for large-scale sequence search using the Bentley-Saxe transformation. Bioinformatics 2022;38:3155-3163. [PMID: 35325039 PMCID: PMC9191210 DOI: 10.1093/bioinformatics/btac142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2021] [Revised: 01/10/2022] [Accepted: 03/22/2022] [Indexed: 11/14/2022]

Abstract

MOTIVATION

In the past few years, researchers have proposed numerous indexing schemes for searching large datasets of raw sequencing experiments. Most of these proposed indexes are approximate (i.e. with one-sided errors) in order to save space. Recently, researchers have published exact indexes (Mantis, VariMerge and Bifrost) that can serve as colored de Bruijn graph representations in addition to serving as k-mer indexes. This new type of index is promising because it has the potential to support more complex analyses than simple searches. However, to be useful as indexes for large and growing repositories of raw sequencing data, such indexes must scale to thousands of experiments and support efficient insertion of new data.

RESULTS

In this paper, we show how to build a scalable and updatable exact raw sequence-search index. Specifically, we extend Mantis using the Bentley-Saxe transformation to support efficient updates; we call the resulting index Dynamic Mantis. We demonstrate Dynamic Mantis's scalability by constructing an index of ≈40K samples from SRA by adding samples one at a time to an initial index of 10K samples. Compared to VariMerge and Bifrost, Dynamic Mantis is more efficient in terms of index-construction time and memory, query time and memory, and index size. In our benchmarks, VariMerge and Bifrost scaled to only 5K and 80 samples, respectively, while Dynamic Mantis scaled to more than 39K samples. Queries were over 24× faster in Mantis than in Bifrost (VariMerge does not immediately support the general search queries we require). Dynamic Mantis indexes were about 2.5× smaller than Bifrost's indexes and about half as big as VariMerge's indexes.
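For readers unfamiliar with the Bentley-Saxe transformation named above, the following is a minimal sketch of the general technique (the "logarithmic method"), not the Dynamic Mantis implementation. It assumes a hypothetical immutable structure, StaticIndex, that can only be built in bulk, and keeps a list of such structures whose sizes follow the binary representation of the number of insertions. An insertion merges and rebuilds the smallest occupied levels, much like a binary counter carrying. All class and method names here are illustrative assumptions, and single keys stand in for whole sequencing samples.

```python
from bisect import bisect_left


class StaticIndex:
    """Hypothetical immutable index: a sorted tuple of keys, built in bulk."""

    def __init__(self, keys):
        self.keys = tuple(sorted(set(keys)))

    def __contains__(self, key):
        # Binary search over the sorted keys.
        i = bisect_left(self.keys, key)
        return i < len(self.keys) and self.keys[i] == key


class BentleySaxeIndex:
    """Makes StaticIndex insertable via the Bentley-Saxe (logarithmic) method."""

    def __init__(self):
        # levels[i] is either None or a StaticIndex built from 2**i insertions.
        self.levels = []

    def insert(self, key):
        carry = [key]
        for i, level in enumerate(self.levels):
            if level is None:
                # First empty level: rebuild everything carried so far here.
                self.levels[i] = StaticIndex(carry)
                return
            # Occupied level: absorb its keys and empty it (binary carry).
            carry.extend(level.keys)
            self.levels[i] = None
        # Every existing level was occupied: open a new, larger level.
        self.levels.append(StaticIndex(carry))

    def __contains__(self, key):
        # A query must consult every occupied level.
        return any(lvl is not None and key in lvl for lvl in self.levels)

    def occupied_levels(self):
        return [i for i, lvl in enumerate(self.levels) if lvl is not None]


index = BentleySaxeIndex()
for kmer in ["ACGT", "CGTA", "GTAC", "TACG", "ACGG"]:
    index.insert(kmer)
```

After five insertions of distinct keys, levels 0 and 2 are occupied, matching the binary form of 5 (101). The design trades query cost for update cost: a lookup touches at most a logarithmic number of static structures, while each key is rebuilt into a larger structure only a logarithmic number of times.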

AVAILABILITY AND IMPLEMENTATION

The Dynamic Mantis implementation is available at https://github.com/splatlab/mantis/tree/mergeMSTs.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Dufault-Thompson K, Jiang X. Applications of de Bruijn graphs in microbiome research. iMeta 2022;1:e4. [PMID: 38867733 PMCID: PMC10989854 DOI: 10.1002/imt2.4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2021] [Revised: 01/24/2022] [Accepted: 01/24/2022] [Indexed: 06/14/2024]

Břinda K, Baym M, Kucherov G. Simplitigs as an efficient and scalable representation of de Bruijn graphs. Genome Biol 2021;22:96. [PMID: 33823902 PMCID: PMC8025321 DOI: 10.1186/s13059-021-02297-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 06/09/2020] [Accepted: 02/10/2021] [Indexed: 12/30/2022]

Rahman A, Medvedev P. Representation of k-Mer Sets Using Spectrum-Preserving String Sets. J Comput Biol 2021;28:381-394. [PMID: 33290137 PMCID: PMC8066325 DOI: 10.1089/cmb.2020.0431] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/12/2022]

Jiang P, Luo J, Wang Y, Deng P, Schmidt B, Tang X, Chen N, Wong L, Zhao L. kmcEx: memory-frugal and retrieval-efficient encoding of counted k-mers. Bioinformatics 2020;35:4871-4878. [PMID: 31038666 DOI: 10.1093/bioinformatics/btz299] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/29/2018] [Revised: 04/02/2019] [Accepted: 04/19/2019] [Indexed: 12/25/2022]

Almodaresi F, Pandey P, Ferdman M, Johnson R, Patro R. An Efficient, Scalable, and Exact Representation of High-Dimensional Color Information Enabled Using de Bruijn Graph Search. J Comput Biol 2020;27:485-499. [PMID: 32176522 PMCID: PMC7185321 DOI: 10.1089/cmb.2019.0322] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 11/13/2022]

Rowe WPM. When the levee breaks: a practical guide to sketching algorithms for processing the flood of genomic data. Genome Biol 2019;20:199. [PMID: 31519212 PMCID: PMC6744645 DOI: 10.1186/s13059-019-1809-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 04/12/2019] [Accepted: 09/02/2019] [Indexed: 01/21/2023]

Marçais G, Solomon B, Patro R, Kingsford C. Sketching and Sublinear Data Structures in Genomics. Annu Rev Biomed Data Sci 2019. [DOI: 10.1146/annurev-biodatasci-072018-021156] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Indexed: 11/09/2022]

Mustafa H, Schilken I, Karasikov M, Eickhoff C, Rätsch G, Kahles A. Dynamic compression schemes for graph coloring. Bioinformatics 2019;35:407-414. [PMID: 30020403 PMCID: PMC6530811 DOI: 10.1093/bioinformatics/bty632] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 03/19/2018] [Revised: 06/22/2018] [Accepted: 07/16/2018] [Indexed: 11/30/2022]

Pandey P, Almodaresi F, Bender MA, Ferdman M, Johnson R, Patro R. Mantis: A Fast, Small, and Exact Large-Scale Sequence-Search Index. Cell Syst 2018;7:201-207.e4. [PMID: 29936185 PMCID: PMC10964368 DOI: 10.1016/j.cels.2018.05.021] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Received: 03/01/2018] [Revised: 05/08/2018] [Accepted: 05/25/2018] [Indexed: 01/08/2023]