Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Muggli MD, Bowe A, Noyes NR, Morley PS, Belk KE, Raymond R, Gagie T, Puglisi SJ, Boucher C. Succinct colored de Bruijn graphs. Bioinformatics 2017;33:3181-3187. [PMID: 28200001 PMCID: PMC5872255 DOI: 10.1093/bioinformatics/btx067] [Citation(s) in RCA: 51] [Impact Index Per Article: 6.4] [Received: 09/12/2016] [Revised: 01/16/2017] [Accepted: 02/10/2017] [Indexed: 02/06/2023]

Cited by Other Article(s)

Song L, Langmead B. Centrifuger: lossless compression of microbial genomes for efficient and accurate metagenomic sequence classification. Genome Biol 2024;25:106. [PMID: 38664753 PMCID: PMC11046777 DOI: 10.1186/s13059-024-03244-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 11/14/2023] [Accepted: 04/10/2024] [Indexed: 04/28/2024]

Schulz T, Parmigiani L, Rempel A, Stoye J. Methods for Pangenomic Core Detection. Methods Mol Biol 2024;2802:73-106. [PMID: 38819557 DOI: 10.1007/978-1-0716-3838-5_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]

Depuydt L, Renders L, Abeel T, Fostier J. Pan-genome de Bruijn graph using the bidirectional FM-index. BMC Bioinformatics 2023;24:400. [PMID: 37884897 PMCID: PMC10605969 DOI: 10.1186/s12859-023-05531-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Accepted: 10/12/2023] [Indexed: 10/28/2023]

Abstract

BACKGROUND

Pan-genome graphs are gaining importance in the field of bioinformatics as data structures to represent and jointly analyze multiple genomes. Compacted de Bruijn graphs are inherently suited for this purpose, as their graph topology naturally reveals similarity and divergence within the pan-genome. Most state-of-the-art pan-genome graphs are represented explicitly in terms of nodes and edges. Recently, an alternative, implicit graph representation was proposed that builds directly upon the unidirectional FM-index. As such, a memory-efficient graph data structure is obtained that inherits the FM-index's backward search functionality. However, this representation suffers from a number of shortcomings in terms of functionality and algorithmic performance.

RESULTS

We present a data structure for a pan-genome, compacted de Bruijn graph that aims to address these shortcomings. It is built on the bidirectional FM-index, extending the ability of its unidirectional counterpart to navigate and search the graph in both directions. All basic graph navigation steps can be performed in constant time. Based on these features, we implement subgraph visualization as well as lossless approximate pattern matching to the graph using search schemes. We demonstrate that we can retrieve all occurrences corresponding to a read within a certain edit distance in a very efficient manner. Through a case study, we show the potential of exploiting the information embedded in the graph's topology through visualization and sequence alignment.

CONCLUSIONS

We propose a memory-efficient representation of the pan-genome graph that supports subgraph visualization and lossless approximate pattern matching of reads against the graph using search schemes. The C++ source code of our software, called Nexus, is available at https://github.com/biointec/nexus under the AGPL-3.0 license.
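
For readers unfamiliar with the backward search that the abstract above refers to, the following is a minimal Python sketch of counting pattern occurrences with a unidirectional FM-index over a single string. It is not the Nexus implementation: the function names `build_fm_index` and `count_occurrences` are hypothetical, the suffix array is built naively, the occurrence table is stored uncompressed, and the input is assumed not to contain the sentinel character `$`.

```python
def build_fm_index(text):
    """Build a toy FM-index (hypothetical helper, not from any library)."""
    text += "$"  # sentinel, assumed absent from the input and smaller than all letters
    # Naive suffix array: fine for short illustrative strings only.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c] = number of characters in the text strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return C, occ, len(bwt)


def count_occurrences(pattern, index):
    """Backward search: extend the match one character to the left per step."""
    C, occ, n = index
    lo, hi = 0, n  # half-open suffix-array interval of the current match
    for ch in reversed(pattern):
        if ch not in C:
            return 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


index = build_fm_index("ACGTACGA")
print(count_occurrences("ACG", index))
```

Each loop iteration narrows the suffix-array interval using only the `C` and `occ` tables, which is why the search proceeds from the last pattern character to the first. A bidirectional index, as described in the abstract, additionally allows extending a match to the right.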

Chao KH, Chen PW, Seshia SA, Langmead B. WGT: Tools and algorithms for recognizing, visualizing, and generating Wheeler graphs. iScience 2023;26:107402. [PMID: 37575187 PMCID: PMC10415921 DOI: 10.1016/j.isci.2023.107402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Revised: 06/29/2023] [Accepted: 07/12/2023] [Indexed: 08/15/2023]

Schmidt S, Alanko JN. Eulertigs: minimum plain text representation of k-mer sets without repetitions in linear time. Algorithms Mol Biol 2023;18:5. [PMID: 37403080 DOI: 10.1186/s13015-023-00227-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Accepted: 05/13/2023] [Indexed: 07/06/2023]

Marchet C, Limasset A. Scalable sequence database search using partitioned aggregated Bloom comb trees. Bioinformatics 2023;39:i252-i259. [PMID: 37387170 DOI: 10.1093/bioinformatics/btad225] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 07/01/2023]

Alanko JN, Vuohtoniemi J, Mäklin T, Puglisi SJ. Themisto: a scalable colored k-mer index for sensitive pseudoalignment against hundreds of thousands of bacterial genomes. Bioinformatics 2023;39:i260-i269. [PMID: 37387143 DOI: 10.1093/bioinformatics/btad233] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 07/01/2023]

Noll N, Molari M, Shaw LP, Neher RA. PanGraph: scalable bacterial pan-genome graph construction. Microb Genom 2023;9:mgen001034. [PMID: 37278719 PMCID: PMC10327495 DOI: 10.1099/mgen.0.001034] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/13/2022] [Accepted: 04/14/2023] [Indexed: 06/07/2023]

Hasegawa N, Shimizu K. Efficient Colored de Bruijn Graph for Indexing Reads. J Comput Biol 2023. [PMID: 37115583 DOI: 10.1089/cmb.2022.0259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/29/2023]

Lu TY, Smaruj PN, Fudenberg G, Mancuso N, Chaisson MJP. The motif composition of variable number tandem repeats impacts gene expression. Genome Res 2023;33:511-524. [PMID: 37037626 PMCID: PMC10234305 DOI: 10.1101/gr.276768.122] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 03/17/2022] [Accepted: 03/29/2023] [Indexed: 04/12/2023]

Schmidt S, Alanko JN. Eulertigs: minimum plain text representation of k-mer sets without repetitions in linear time. RESEARCH SQUARE 2023:rs.3.rs-2581995. [PMID: 36824947 PMCID: PMC9949180 DOI: 10.21203/rs.3.rs-2581995/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2023]

Chromosome-scale haplotype-resolved pangenomics. Trends Genet 2022;38:1103-1107. [PMID: 35817620 DOI: 10.1016/j.tig.2022.06.011] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/21/2022] [Revised: 06/14/2022] [Accepted: 06/16/2022] [Indexed: 01/24/2023]

Karasikov M, Mustafa H, Rätsch G, Kahles A. Lossless indexing with counting de Bruijn graphs. Genome Res 2022;32:1754-1764. [PMID: 35609994 PMCID: PMC9528980 DOI: 10.1101/gr.276607.122] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/17/2022] [Accepted: 05/05/2022] [Indexed: 11/25/2022]

Calabrese FM, Ameur H, Nikoloudaki O, Celano G, Vacca M, Junior WJFL, Manzari C, Vertè F, Di Cagno R, Pesole G, De Angelis M, Gobbetti M. Metabolic framework of spontaneous and synthetic sourdough metacommunities to reveal microbial players responsible for resilience and performance. MICROBIOME 2022;10:148. [PMID: 36104726 PMCID: PMC9472446 DOI: 10.1186/s40168-022-01301-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 04/01/2022] [Accepted: 05/29/2022] [Indexed: 06/15/2023]

Abstract

BACKGROUND

In nature, microbial communities undergo changes in composition that threaten their resiliency. Here, we interrogated sourdough, a natural cereal-fermenting metacommunity, as a dynamic ecosystem in which players are subjected to continuous environmental and spatiotemporal stimuli.

RESULTS

The inspection of spontaneous sourdough metagenomes and transcriptomes revealed dominant, subdominant and satellite players that are engaged in different functional pathways. The highest microbial richness was associated with the highest number of gene copies per pathway. Based on meta-omics data collected from 8 spontaneous sourdoughs and their identified microbiota, we de novo reconstructed a synthetic microbial community SDG. We also reconstructed SMC-SD43 from scratch using the microbial composition of its spontaneous sourdough equivalent for comparison. The KEGG number of dominant players in the SDG was not affected by depletion of a single player, whereas the subdominant and satellite species fluctuated, revealing unique contributions. Compared to SMC-SD43, SDG exhibited broader transcriptome redundancy. The invariant volatilome profile of SDG after in situ long-term back slopping revealed its stability. In contrast, SMC-SD43 lost many taxon members. Dominant, subdominant and satellite players together ensured gene and transcript redundancy.

CONCLUSIONS

Our study demonstrates how, by starting from spontaneous sourdoughs and reconstructing these communities synthetically, it was possible to unravel the metabolic contributions of individual players. For resilience and good performance, the sourdough metacommunity must include dominant, subdominant and satellite players, which together ensure gene and transcript redundancy. Overall, our study changes the paradigm and introduces theoretical foundations for directing food fermentations. Video Abstract.

Almodaresi F, Khan J, Madaminov S, Ferdman M, Johnson R, Pandey P, Patro R. An incrementally updatable and scalable system for large-scale sequence search using the Bentley-Saxe transformation. Bioinformatics 2022;38:3155-3163. [PMID: 35325039 PMCID: PMC9191210 DOI: 10.1093/bioinformatics/btac142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2021] [Revised: 01/10/2022] [Accepted: 03/22/2022] [Indexed: 11/14/2022]

Abstract

MOTIVATION

In the past few years, researchers have proposed numerous indexing schemes for searching large datasets of raw sequencing experiments. Most of these proposed indexes are approximate (i.e. with one-sided errors) in order to save space. Recently, researchers have published exact indexes (Mantis, VariMerge and Bifrost) that can serve as colored de Bruijn graph representations in addition to serving as k-mer indexes. This new type of index is promising because it has the potential to support more complex analyses than simple searches. However, in order to be useful as indexes for large and growing repositories of raw sequencing data, they must scale to thousands of experiments and support efficient insertion of new data.

RESULTS

In this paper, we show how to build a scalable and updatable exact raw sequence-search index. Specifically, we extend Mantis using the Bentley-Saxe transformation to support efficient updates and call the result Dynamic Mantis. We demonstrate Dynamic Mantis's scalability by constructing an index of ≈40K samples from SRA by adding samples one at a time to an initial index of 10K samples. Compared to VariMerge and Bifrost, Dynamic Mantis is more efficient in terms of index-construction time and memory, query time and memory and index size. In our benchmarks, VariMerge and Bifrost scaled to only 5K and 80 samples, respectively, while Dynamic Mantis scaled to more than 39K samples. Queries were over 24× faster in Mantis than in Bifrost (VariMerge does not immediately support the general search queries we require). Dynamic Mantis indexes were about 2.5× smaller than Bifrost's indexes and about half as big as VariMerge's indexes.

AVAILABILITY AND IMPLEMENTATION

Dynamic Mantis implementation is available at https://github.com/splatlab/mantis/tree/mergeMSTs.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
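
The Bentley-Saxe transformation named in the abstract above is a general technique for making a static index insertable by keeping a logarithmic number of static sub-indexes and rebuilding them in the manner of a binary counter. The Python sketch below illustrates only that general idea; it is not the Dynamic Mantis code. `StaticIndex` is a hypothetical stand-in (a sorted list) for any structure that can be built once and queried, and both class names are assumptions made for illustration.

```python
from bisect import bisect_left


class StaticIndex:
    """Hypothetical static structure: built once from a batch, never modified."""

    def __init__(self, items):
        self.items = sorted(items)

    def contains(self, key):
        i = bisect_left(self.items, key)
        return i < len(self.items) and self.items[i] == key


class BentleySaxeIndex:
    """Insert-only wrapper: levels[i] is either None or a StaticIndex of 2**i items."""

    def __init__(self):
        self.levels = []

    def insert(self, key):
        carry = [key]
        # Like adding 1 to a binary counter: merge full levels upward
        # until an empty level is found, then rebuild a static index there.
        for i in range(len(self.levels)):
            if self.levels[i] is None:
                self.levels[i] = StaticIndex(carry)
                return
            carry.extend(self.levels[i].items)
            self.levels[i] = None
        self.levels.append(StaticIndex(carry))

    def contains(self, key):
        # A query must consult every non-empty level.
        return any(lv is not None and lv.contains(key) for lv in self.levels)


idx = BentleySaxeIndex()
for kmer in ["ACG", "CGT", "GTA", "TAC", "ACC"]:
    idx.insert(kmer)
print([None if lv is None else len(lv.items) for lv in idx.levels])
```

Each item takes part in at most a logarithmic number of rebuilds, so the cost of insertion is amortized across many operations, at the price of querying several sub-indexes instead of one.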

Dufault‐Thompson K, Jiang X. Applications of de Bruijn graphs in microbiome research. IMETA 2022;1:e4. [PMID: 38867733 PMCID: PMC10989854 DOI: 10.1002/imt2.4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2021] [Revised: 01/24/2022] [Accepted: 01/24/2022] [Indexed: 06/14/2024]

Krannich T, White WTJ, Niehus S, Holley G, Halldórsson BV, Kehr B. Population-scale detection of non-reference sequence variants using colored de Bruijn graphs. Bioinformatics 2021;38:604-611. [PMID: 34726732 PMCID: PMC8756200 DOI: 10.1093/bioinformatics/btab749] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/12/2021] [Revised: 09/27/2021] [Accepted: 10/28/2021] [Indexed: 02/03/2023]

Marchet C, Kerbiriou M, Limasset A. BLight: efficient exact associative structure for k-mers. Bioinformatics 2021;37:2858-2865. [PMID: 33821954 DOI: 10.1093/bioinformatics/btab217] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 04/29/2020] [Revised: 02/18/2021] [Accepted: 04/01/2021] [Indexed: 02/02/2023]

Alipanahi B, Kuhnle A, Puglisi SJ, Salmela L, Boucher C. Succinct dynamic de Bruijn graphs. Bioinformatics 2021;37:1946-1952. [PMID: 32462192 DOI: 10.1093/bioinformatics/btaa546] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/29/2022]

Danciu D, Karasikov M, Mustafa H, Kahles A, Rätsch G. Topology-based sparsification of graph annotations. Bioinformatics 2021;37:i169-i176. [PMID: 34252940 PMCID: PMC8346655 DOI: 10.1093/bioinformatics/btab330] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Accepted: 05/03/2021] [Indexed: 01/03/2023]

Abstract

Motivation

Since the amount of published biological sequencing data is growing exponentially, efficient methods for storing and indexing this data are more needed than ever to truly benefit from this invaluable resource for biomedical research. Labeled de Bruijn graphs are a frequently-used approach for representing large sets of sequencing data. While significant progress has been made to succinctly represent the graph itself, efficient methods for storing labels on such graphs are still rapidly evolving.

Results

In this article, we present RowDiff, a new technique for compacting graph labels by leveraging expected similarities in annotations of vertices adjacent in the graph. RowDiff can be constructed in linear time relative to the number of vertices and labels in the graph, and in space proportional to the graph size. In addition, construction can be efficiently parallelized and distributed, making the technique applicable to graphs with trillions of nodes. RowDiff can be viewed as an intermediary sparsification step of the original annotation matrix and can thus naturally be combined with existing generic schemes for compressed binary matrices. Experiments on 10 000 RNA-seq datasets show that RowDiff combined with multi-BRWT results in a 30% reduction in annotation footprint over Mantis-MST, the previously known most compact annotation representation. Experiments on the sparser Fungi subset of the RefSeq collection show that applying RowDiff sparsification reduces the size of individual annotation columns stored as compressed bit vectors by an average factor of 42. When combining RowDiff with a multi-BRWT representation, the resulting annotation is 26 times smaller than Mantis-MST.

Availability and implementation

RowDiff is implemented in C++ within the MetaGraph framework. The source code and the data used in the experiments are publicly available at https://github.com/ratschlab/row_diff.
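
The abstract above describes compacting labels by exploiting the similarity between annotations of adjacent vertices. One simple way to picture this, offered here as a sketch under stated assumptions and not as the MetaGraph implementation, is to store for each node only the symmetric difference between its label set and that of one designated successor, keeping full label sets at a few "anchor" nodes. The function names, the `successor` map, and the `anchors` set are all hypothetical, and the sketch assumes every successor chain reaches an anchor.

```python
def row_diff_encode(rows, successor, anchors):
    """Replace each non-anchor row by its difference to the successor's row."""
    encoded = {}
    for node, row in rows.items():
        if node in anchors:
            encoded[node] = set(row)  # anchors keep their full label set
        else:
            # Symmetric difference: labels present in exactly one of the two rows.
            encoded[node] = set(row) ^ set(rows[successor[node]])
    return encoded


def row_diff_decode(node, encoded, successor, anchors):
    """Recover a node's labels by accumulating differences up to an anchor."""
    row = set()
    while True:
        row ^= encoded[node]
        if node in anchors:
            return row
        node = successor[node]


# Toy path a -> b -> c, where c is an anchor; labels are dataset identifiers.
rows = {"a": {1, 2}, "b": {1, 2}, "c": {1, 2, 3}}
successor = {"a": "b", "b": "c"}
anchors = {"c"}

encoded = row_diff_encode(rows, successor, anchors)
print(encoded["a"], encoded["b"])
```

When neighbouring nodes carry nearly identical labels, most encoded rows are empty or very small, which is what makes a subsequent compressed-matrix representation more effective. The trade-off is that decoding a row requires walking to the nearest anchor.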

Khan J, Patro R. Cuttlefish: fast, parallel and low-memory compaction of de Bruijn graphs from large-scale genome collections. Bioinformatics 2021;37:i177-i186. [PMID: 34252958 PMCID: PMC8275350 DOI: 10.1093/bioinformatics/btab309] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 01/04/2023]

Abstract

Motivation

The construction of the compacted de Bruijn graph from collections of reference genomes is a task of increasing interest in genomic analyses. These graphs are increasingly used as sequence indices for short- and long-read alignment. Also, as we sequence and assemble a greater diversity of genomes, the colored compacted de Bruijn graph is being used more and more as the basis for efficient methods to perform comparative genomic analyses on these genomes. Therefore, time- and memory-efficient construction of the graph from reference sequences is an important problem.

Results

We introduce a new algorithm, implemented in the tool Cuttlefish, to construct the (colored) compacted de Bruijn graph from a collection of one or more genome references. Cuttlefish introduces a novel approach of modeling de Bruijn graph vertices as finite-state automata, and constrains these automata’s state-space to enable tracking their transitioning states with very low memory usage. Cuttlefish is also fast and highly parallelizable. Experimental results demonstrate that it scales much better than existing approaches, especially as the number and the scale of the input references grow. On a typical shared-memory machine, Cuttlefish constructed the graph for 100 human genomes in under 9 h, using ∼29 GB of memory. On 11 diverse conifer plant genomes, the compacted graph was constructed by Cuttlefish in under 9 h, using ∼84 GB of memory. The only other tool completing these tasks on the hardware took over 23 h using ∼126 GB of memory, and over 16 h using ∼289 GB of memory, respectively.

Availability and implementation

Cuttlefish is implemented in C++14, and is available under an open source license at https://github.com/COMBINE-lab/cuttlefish.

Supplementary information

Supplementary data are available at Bioinformatics online.

Alanko J, Alipanahi B, Settle J, Boucher C, Gagie T. Buffering updates enables efficient dynamic de Bruijn graphs. Comput Struct Biotechnol J 2021;19:4067-4078. [PMID: 34377371 PMCID: PMC8326735 DOI: 10.1016/j.csbj.2021.06.047] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/15/2021] [Revised: 06/29/2021] [Accepted: 06/29/2021] [Indexed: 12/24/2022]

Almodaresi F, Zakeri M, Patro R. Puffaligner: A Fast, Efficient, and Accurate Aligner Based on the Pufferfish Index. Bioinformatics 2021;37:4048-4055. [PMID: 34117875 PMCID: PMC9502150 DOI: 10.1093/bioinformatics/btab408] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 10/08/2020] [Revised: 04/30/2021] [Accepted: 06/11/2021] [Indexed: 12/22/2022]

Guo J, Pang E, Song H, Lin K. A tri-tuple coordinate system derived for fast and accurate analysis of the colored de Bruijn graph-based pangenomes. BMC Bioinformatics 2021;22:282. [PMID: 34044757 PMCID: PMC8161984 DOI: 10.1186/s12859-021-04149-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2021] [Accepted: 04/25/2021] [Indexed: 11/25/2022]

Abstract

Background

With the rapid development of accurate sequencing and assembly technologies, an increasing number of high-quality chromosome-level and haplotype-resolved assemblies of genomic sequences have been derived, from which there will be great opportunities for computational pangenomics. Although genome graphs are among the most useful models for pangenome representation, their structural complexity makes it difficult to present genome information as intuitively as a linear reference genome does. Thus, efficiently and accurately analyzing the genome graph spatial structure and coordinating the information remains a substantial challenge.

Results

We developed a new method, a colored superbubble (cSupB), that can overcome the complexity of graphs and organize a set of species- or population-specific haplotype sequences of interest. Based on this model, we propose a tri-tuple coordinate system that combines an offset value, topological structure and sample information. Additionally, cSupB provides a novel method that utilizes complete topological information and efficiently detects small indels (< 50 bp) for highly similar samples, which can be validated by simulated datasets. Moreover, we demonstrated that cSupB can adapt to the complex cycle structure.

Conclusions

Although the solution is made suitable for increasingly complex genome graphs by relaxing the constraint, the directed acyclic graph, the motif cSupB and the cSupB method can be extended to any colored directed acyclic graph. We anticipate that our method will facilitate the analysis of individual haplotype variants and population genomic diversity. We have developed a C++ program for implementing our method that is available at https://github.com/eggleader/cSupB.

Supplementary information

The online version contains supplementary material available at 10.1186/s12859-021-04149-w.

Alipanahi B, Muggli MD, Jundi M, Noyes NR, Boucher C. Metagenome SNP calling via read-colored de Bruijn graphs. Bioinformatics 2021;36:5275-5281. [PMID: 32049324 DOI: 10.1093/bioinformatics/btaa081] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/16/2018] [Revised: 01/08/2020] [Accepted: 02/03/2020] [Indexed: 11/13/2022]

Abstract

MOTIVATION

Metagenomics refers to the study of complex samples containing the genetic content of multiple individual organisms and, thus, has been used to elucidate the microbiome and resistome of a complex sample. The microbiome refers to all microbial organisms in a sample, and the resistome refers to all of the antimicrobial resistance (AMR) genes in pathogenic and non-pathogenic bacteria. Single-nucleotide polymorphisms (SNPs) can be effectively used to 'fingerprint' specific organisms and genes within the microbiome and resistome and trace their movement across various samples. However, to effectively use these SNPs for this traceability, a scalable and accurate metagenomics SNP caller is needed. Moreover, such an SNP caller should not be reliant on reference genomes since 95% of microbial species are unculturable, making the determination of a reference genome extremely challenging. In this article, we address this need.

RESULTS

We present LueVari, a reference-free SNP caller based on the read-colored de Bruijn graph, an extension of the traditional de Bruijn graph that allows repeated regions longer than the k-mer length and shorter than the read length to be identified unambiguously. LueVari is able to identify SNPs in both AMR genes and chromosomal DNA from shotgun metagenomics data with reliable sensitivity (between 91% and 99%) and precision (between 71% and 99%), whereas the performance of competing methods varies widely. Furthermore, we show that LueVari constructs sequences containing the variation, which span up to 97.8% of genes in datasets, which can be helpful in detecting distinct AMR genes in large metagenomic datasets.

AVAILABILITY AND IMPLEMENTATION

Code and datasets are publicly available at https://github.com/baharpan/cosmo/tree/LueVari.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Břinda K, Baym M, Kucherov G. Simplitigs as an efficient and scalable representation of de Bruijn graphs. Genome Biol 2021;22:96. [PMID: 33823902 PMCID: PMC8025321 DOI: 10.1186/s13059-021-02297-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 06/09/2020] [Accepted: 02/10/2021] [Indexed: 12/30/2022]

Shokrof M, Brown CT, Mansour TA. MQF and buffered MQF: quotient filters for efficient storage of k-mers with their counts and metadata. BMC Bioinformatics 2021;22:71. [PMID: 33593271 PMCID: PMC7885209 DOI: 10.1186/s12859-021-03996-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/15/2020] [Accepted: 02/04/2021] [Indexed: 11/30/2022]

Schulz T, Wittler R, Rahmann S, Hach F, Stoye J. Detecting High Scoring Local Alignments in Pangenome Graphs. Bioinformatics 2021;37:2266-2274. [PMID: 33532821 PMCID: PMC8388040 DOI: 10.1093/bioinformatics/btab077] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/24/2020] [Revised: 12/02/2020] [Accepted: 01/29/2021] [Indexed: 11/23/2022]

Marchet C, Boucher C, Puglisi SJ, Medvedev P, Salson M, Chikhi R. Data structures based on k-mers for querying large collections of sequencing data sets. Genome Res 2021;31:1-12. [PMID: 33328168 PMCID: PMC7849385 DOI: 10.1101/gr.260604.119] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Received: 12/29/2019] [Accepted: 09/14/2020] [Indexed: 12/19/2022]

Holley G, Melsted P. Bifrost: highly parallel construction and indexing of colored and compacted de Bruijn graphs. Genome Biol 2020;21:249. [PMID: 32943081 PMCID: PMC7499882 DOI: 10.1186/s13059-020-02135-8] [Citation(s) in RCA: 95] [Impact Index Per Article: 19.0] [Received: 08/13/2019] [Accepted: 08/06/2020] [Indexed: 02/07/2023]

Garimella KV, Iqbal Z, Krause MA, Campino S, Kekre M, Drury E, Kwiatkowski D, Sá JM, Wellems TE, McVean G. Detection of simple and complex de novo mutations with multiple reference sequences. Genome Res 2020;30:1154-1169. [PMID: 32817236 PMCID: PMC7462078 DOI: 10.1101/gr.255505.119] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/02/2019] [Accepted: 07/17/2020] [Indexed: 12/25/2022]

Affiliation(s)

Kiran V Garimella: Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA; Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, Oxfordshire, OX3 7BN, United Kingdom; Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, Oxfordshire, OX3 7LF, United Kingdom
Zamin Iqbal: Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, Oxfordshire, OX3 7BN, United Kingdom; European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, United Kingdom
Michael A Krause: Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, Oxfordshire, OX3 7BN, United Kingdom; The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom; Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
Susana Campino: The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom
Mihir Kekre: The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom
Eleanor Drury: The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom
Dominic Kwiatkowski: Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, Oxfordshire, OX3 7LF, United Kingdom; The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom
Juliana M Sá: Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
Thomas E Wellems: Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
Gil McVean: Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, Oxfordshire, OX3 7BN, United Kingdom; Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, Oxfordshire, OX3 7LF, United Kingdom

Marchet C, Iqbal Z, Gautheret D, Salson M, Chikhi R. REINDEER: efficient indexing of k-mer presence and abundance in sequencing datasets. Bioinformatics 2020;36:i177-i185. [PMID: 32657392 PMCID: PMC7355249 DOI: 10.1093/bioinformatics/btaa487] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 11/24/2022]

Minkin I, Medvedev P. Scalable Pairwise Whole-Genome Homology Mapping of Long Genomes with BubbZ. iScience 2020;23:101224. [PMID: 32563153 PMCID: PMC7303978 DOI: 10.1016/j.isci.2020.101224] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2020] [Revised: 05/25/2020] [Accepted: 05/28/2020] [Indexed: 11/17/2022] Open

Wittler R. Alignment- and reference-free phylogenomics with colored de Bruijn graphs. Algorithms Mol Biol 2020;15:4. [PMID: 32280365 PMCID: PMC7137503 DOI: 10.1186/s13015-020-00164-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2019] [Accepted: 03/21/2020] [Indexed: 11/10/2022] Open

Almodaresi F, Pandey P, Ferdman M, Johnson R, Patro R. An Efficient, Scalable, and Exact Representation of High-Dimensional Color Information Enabled Using de Bruijn Graph Search. J Comput Biol 2020;27:485-499. [PMID: 32176522 PMCID: PMC7185321 DOI: 10.1089/cmb.2019.0322] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Karasikov M, Mustafa H, Joudaki A, Javadzadeh-no S, Rätsch G, Kahles A. Sparse Binary Relation Representations for Genome Graph Annotation. J Comput Biol 2020;27:626-639. [PMID: 31891531 PMCID: PMC7185347 DOI: 10.1089/cmb.2019.0324] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/18/2023] Open

Sherman RM, Salzberg SL. Pan-genomics in the human genome era. Nat Rev Genet 2020;21:243-254. [PMID: 32034321 PMCID: PMC7752153 DOI: 10.1038/s41576-020-0210-7] [Citation(s) in RCA: 175] [Impact Index Per Article: 35.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 01/02/2020] [Indexed: 12/25/2022]

San JE, Baichoo S, Kanzi A, Moosa Y, Lessells R, Fonseca V, Mogaka J, Power R, de Oliveira T. Current Affairs of Microbial Genome-Wide Association Studies: Approaches, Bottlenecks and Analytical Pitfalls. Front Microbiol 2020;10:3119. [PMID: 32082269 PMCID: PMC7002396 DOI: 10.3389/fmicb.2019.03119] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2019] [Accepted: 12/24/2019] [Indexed: 12/12/2022] Open

Overlap graphs and de Bruijn graphs: data structures for de novo genome assembly in the big data era. QUANTITATIVE BIOLOGY 2019. [DOI: 10.1007/s40484-019-0181-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]

Bansal V, Boucher C. Sequencing Technologies and Analyses: Where Have We Been and Where Are We Going? iScience 2019;18:37-41. [PMID: 31472161 PMCID: PMC6733309 DOI: 10.1016/j.isci.2019.06.035] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022] Open

Almodaresi F, Sarkar H, Srivastava A, Patro R. A space and time-efficient index for the compacted colored de Bruijn graph. Bioinformatics 2019;34:i169-i177. [PMID: 29949982 PMCID: PMC6022659 DOI: 10.1093/bioinformatics/bty292] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023] Open

Abstract

Motivation

Indexing reference sequences for search, both individual genomes and collections of genomes, is an important building block for many sequence analysis tasks. Much work has been dedicated to developing full-text indices for genomic sequences, based on data structures such as the suffix array, the BWT and the FM-index. However, the de Bruijn graph, commonly used for sequence assembly, has recently been gaining attention as an indexing data structure, due to its natural ability to represent multiple references using a graphical structure, and to collapse highly repetitive sequence regions. Yet, much less attention has been given to how best to index such a structure, such that queries can be performed efficiently and memory usage remains practical as the size and number of reference sequences being indexed grow large.

Results

We present a novel data structure for representing and indexing the compacted colored de Bruijn graph, which allows for efficient pattern matching and retrieval of the reference information associated with each k-mer. As the popularity of the de Bruijn graph as an index has increased over the past few years, so has the number of proposed representations of this structure. Existing structures typically fall into two categories: those that are hashing-based and provide very fast access to the underlying k-mer information, and those that are space-frugal and provide asymptotically efficient but practically slower pattern search. Our representation achieves a compromise between these two extremes. By building upon minimum perfect hashing and making use of succinct representations where applicable, our data structure provides practically fast lookup while greatly reducing the space compared to traditional hashing-based implementations. Further, we describe a sampling scheme for this index, which provides the ability to trade off query speed for a reduction in the index size. We believe this representation strikes a desirable balance between speed and space usage, and allows for fast search on large reference sequences.

Finally, we describe an application of this index to the taxonomic read assignment problem. We show that by adopting, essentially, the approach of Kraken, but replacing k-mer presence with coverage by chains of consistent unique maximal matches, we can improve the space, speed and accuracy of taxonomic read assignment.

Availability and implementation

pufferfish is written in C++11, is open source, and is available at https://github.com/COMBINE-lab/pufferfish.

Supplementary information

Supplementary data are available at Bioinformatics online.
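
As a rough illustration of the kind of query the abstract above describes (retrieving the reference information associated with each k-mer), the following Python sketch maps k-mers to the set of references that contain them. It is not pufferfish's data structure: a plain dict stands in for the minimum perfect hash and the succinct components, reverse complements are ignored, and the function names and toy sequences are hypothetical.

```python
from collections import defaultdict


def build_colored_kmer_index(references, k):
    """Map each k-mer to the set of reference names (colors) that contain it."""
    index = defaultdict(set)
    for name, seq in references.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return dict(index)


def query_colors(index, pattern, k):
    """Return the colors shared by every k-mer of the pattern.

    An empty set is returned if the pattern is shorter than k or if
    any of its k-mers is absent from the index.
    """
    if len(pattern) < k:
        return set()
    shared = None
    for i in range(len(pattern) - k + 1):
        colors = index.get(pattern[i:i + k])
        if colors is None:
            return set()
        shared = set(colors) if shared is None else shared & colors
    return shared


# Toy example (hypothetical sequences, k = 3).
refs = {"ref1": "ACGTT", "ref2": "CGTTA"}
idx = build_colored_kmer_index(refs, 3)
print(query_colors(idx, "CGTT", 3))  # k-mers CGT and GTT occur in both references
print(query_colors(idx, "ACGT", 3))  # ACG occurs only in ref1
```

A real index would replace the dict with a far more compact structure; the sketch only fixes what a lookup is expected to return.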

Harris RS, Medvedev P. Improved representation of sequence bloom trees. Bioinformatics 2019;36:721-727. [PMID: 31504157 PMCID: PMC8215923 DOI: 10.1093/bioinformatics/btz662] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2019] [Revised: 08/15/2019] [Accepted: 08/20/2019] [Indexed: 01/31/2023] Open

Muggli MD, Alipanahi B, Boucher C. Building large updatable colored de Bruijn graphs via merging. Bioinformatics 2019;35:i51-i60. [PMID: 31510647 PMCID: PMC6612864 DOI: 10.1093/bioinformatics/btz350] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Abstract

MOTIVATION

There exist several large genomic and metagenomic data collection efforts, including GenomeTrakr and MetaSub, which are routinely updated with new data. To analyze such datasets, memory-efficient methods to construct and store the colored de Bruijn graph were developed. Yet, a problem that has not been considered is constructing the colored de Bruijn graph in a scalable manner that allows new data to be added without reconstruction. This problem is important for large public datasets, where both scalability and the ability to update the construction are needed.

RESULTS

We create a method for constructing the colored de Bruijn graph for large datasets that is based on partitioning the data into smaller datasets, building the colored de Bruijn graph using an FM-index-based representation, and succinctly merging these representations to build a single graph. The last step, merging succinctly, is the algorithmic challenge which we solve in this article. We refer to the resulting method as VariMerge. This construction method also allows the graph to be updated with new data. We validate our approach and show it produces a three-fold reduction in working space when constructing a colored de Bruijn graph for 8000 strains. Lastly, we compare VariMerge to other competing methods (including Vari, Rainbowfish, Mantis, Bloom Filter Trie, the method of Almodaresi et al. and Multi-BRWT) and illustrate that VariMerge is the only method that is capable of building the colored de Bruijn graph for 16 000 strains in a manner that allows it to be updated. Competing methods either did not scale to a dataset this large or do not allow for additions without reconstruction.

AVAILABILITY AND IMPLEMENTATION

VariMerge is available at https://github.com/cosmo-team/cosmo/tree/VARI-merge under GPLv3 license.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
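
To make the merge step in the abstract above concrete, the following Python sketch shows only the logical result of merging two colored de Bruijn graphs: the union of their k-mers, with the colors of the second graph renumbered after those of the first. It does not reproduce VariMerge's succinct merging of FM-index-based representations; the dict-of-sets layout and the function name are assumptions made for illustration.

```python
def merge_colored_graphs(graph_a, num_colors_a, graph_b):
    """Union two k-mer -> color-id-set maps.

    Color ids in graph_b are shifted by num_colors_a so that the
    samples of the two inputs stay distinguishable after merging.
    """
    merged = {kmer: set(colors) for kmer, colors in graph_a.items()}
    for kmer, colors in graph_b.items():
        shifted = {c + num_colors_a for c in colors}
        merged.setdefault(kmer, set()).update(shifted)
    return merged


# Toy example: graph_a holds two samples (colors 0 and 1), graph_b holds one.
graph_a = {"ACG": {0}, "CGT": {0, 1}}
graph_b = {"CGT": {0}, "GTT": {0}}
print(merge_colored_graphs(graph_a, 2, graph_b))
```

Adding new data then amounts to building a small graph for the new samples and merging it into the existing one, which is the update pattern the abstract describes.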

Egidi L, Louza FA, Manzini G, Telles GP. External memory BWT and LCP computation for sequence collections with applications. Algorithms Mol Biol 2019;14:6. [PMID: 30899322 PMCID: PMC6408864 DOI: 10.1186/s13015-019-0140-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2018] [Accepted: 02/23/2019] [Indexed: 11/10/2022] Open

Abstract

Background

Sequencing technologies produce larger and larger collections of biosequences that have to be stored in compressed indices supporting fast search operations. Many compressed indices are based on the Burrows–Wheeler Transform (BWT) and the longest common prefix (LCP) array. Because of the sheer size of the input, it is important to build these data structures in external memory, making the best possible use of the available RAM.

Results

We propose a space-efficient algorithm to compute the BWT and LCP array for a collection of sequences in the external or semi-external memory setting. Our algorithm splits the input collection into subcollections sufficiently small that it can compute their BWT in RAM using an optimal linear time algorithm. Next, it merges the partial BWTs in external or semi-external memory and in the process it also computes the LCP values. Our algorithm can be modified to output two additional arrays that, combined with the BWT and LCP array, provide simple, scan-based, external memory algorithms for three well known problems in bioinformatics: the computation of maximal repeats, the all pairs suffix–prefix overlaps, and the construction of succinct de Bruijn graphs.

Conclusions

We prove that our algorithm performs $\mathcal{O}(n\,\mathsf{maxlcp})$ sequential I/Os, where n is the total length of the collection and $\mathsf{maxlcp}$ is the maximum LCP value. The experimental results show that our algorithm is only slightly slower than the state of the art for short sequences but it is up to 40 times faster for longer sequences or when the available RAM is at least equal to the size of the input.
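
For readers unfamiliar with the two arrays named in the abstract above, the following Python sketch computes the BWT and the LCP array of a single short string in memory. It is a naive, quadratic illustration of the definitions only, not the external-memory merging algorithm of the paper; the `$` terminator and the function names are assumptions made for illustration.

```python
def suffix_array(text):
    """Return the start positions of all suffixes of text in sorted order (naive)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt_and_lcp(text, terminator="$"):
    """Return (BWT string, LCP list) of text followed by a unique terminator."""
    if terminator in text:
        raise ValueError("terminator must not occur in text")
    s = text + terminator
    sa = suffix_array(s)
    # BWT: the character preceding each sorted suffix (wraps around for suffix 0).
    bwt = "".join(s[i - 1] for i in sa)
    # LCP[r]: length of the common prefix of the suffixes ranked r-1 and r.
    lcp = [0] * len(sa)
    for r in range(1, len(sa)):
        a, b = sa[r - 1], sa[r]
        h = 0
        while a + h < len(s) and b + h < len(s) and s[a + h] == s[b + h]:
            h += 1
        lcp[r] = h
    return bwt, lcp


bwt, lcp = bwt_and_lcp("banana")
print(bwt)       # annb$aa
print(lcp)       # [0, 0, 1, 3, 0, 0, 2]
print(max(lcp))  # 3, the quantity called maxlcp in the I/O bound
```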

Ultrafast search of all deposited bacterial and viral genomic data. Nat Biotechnol 2019;37:152-159. [PMID: 30718882 PMCID: PMC6420049 DOI: 10.1038/s41587-018-0010-1] [Citation(s) in RCA: 71] [Impact Index Per Article: 11.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/26/2017] [Accepted: 12/20/2018] [Indexed: 02/07/2023]

Mustafa H, Schilken I, Karasikov M, Eickhoff C, Rätsch G, Kahles A. Dynamic compression schemes for graph coloring. Bioinformatics 2019;35:407-414. [PMID: 30020403 PMCID: PMC6530811 DOI: 10.1093/bioinformatics/bty632] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2018] [Revised: 06/22/2018] [Accepted: 07/16/2018] [Indexed: 11/30/2022] Open

Bolger AM, Poorter H, Dumschott K, Bolger ME, Arend D, Osorio S, Gundlach H, Mayer KFX, Lange M, Scholz U, Usadel B. Computational aspects underlying genome to phenome analysis in plants. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2019;97:182-198. [PMID: 30500991 PMCID: PMC6849790 DOI: 10.1111/tpj.14179] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/24/2018] [Revised: 11/06/2018] [Accepted: 11/16/2018] [Indexed: 05/18/2023]

Pandey P, Almodaresi F, Bender MA, Ferdman M, Johnson R, Patro R. Mantis: A Fast, Small, and Exact Large-Scale Sequence-Search Index. Cell Syst 2018;7:201-207.e4. [PMID: 29936185 PMCID: PMC10964368 DOI: 10.1016/j.cels.2018.05.021] [Citation(s) in RCA: 69] [Impact Index Per Article: 9.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2018] [Revised: 05/08/2018] [Accepted: 05/25/2018] [Indexed: 01/08/2023]

Turner I, Garimella KV, Iqbal Z, McVean G. Integrating long-range connectivity information into de Bruijn graphs. Bioinformatics 2018;34:2556-2565. [PMID: 29554215 PMCID: PMC6061703 DOI: 10.1093/bioinformatics/bty157] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2017] [Revised: 11/25/2017] [Accepted: 03/14/2018] [Indexed: 12/27/2022] Open

Farruggia A, Gagie T, Navarro G, Puglisi SJ, Sirén J. Relative Suffix Trees. THE COMPUTER JOURNAL 2018;61:773-788. [PMID: 29795706 PMCID: PMC5956352 DOI: 10.1093/comjnl/bxx108] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 05/12/2017] [Revised: 09/01/2017] [Indexed: 06/08/2023]