Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Chu J, Mohamadi H, Erhan E, Tse J, Chiu R, Yeo S, Birol I. Mismatch-tolerant, alignment-free sequence classification using multiple spaced seeds and multiindex Bloom filters. Proc Natl Acad Sci U S A 2020;117:16961-8. [PMID: 32641514 DOI: 10.1073/pnas.1903436117] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

For:	Chu J, Mohamadi H, Erhan E, Tse J, Chiu R, Yeo S, Birol I. Mismatch-tolerant, alignment-free sequence classification using multiple spaced seeds and multiindex Bloom filters. Proc Natl Acad Sci U S A 2020;117:16961-8. [PMID: 32641514 DOI: 10.1073/pnas.1903436117] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Number

Cited by Other Article(s)

Zhang E, Coombe L, Wong J, Warren RL, Birol I. GoldPolish-target: targeted long-read genome assembly polishing. BMC Bioinformatics 2025;26:78. [PMID: 40055584 PMCID: PMC11887200 DOI: 10.1186/s12859-025-06091-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2024] [Accepted: 02/19/2025] [Indexed: 03/12/2025] Open

Chen Y, Wang G, Zhang T. Utilizing Deep Neural Networks to Fill Gaps in Small Genomes. Int J Mol Sci 2024;25:8502. [PMID: 39126071 PMCID: PMC11313336 DOI: 10.3390/ijms25158502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2024] [Revised: 07/24/2024] [Accepted: 08/02/2024] [Indexed: 08/12/2024] Open

Abstract

With the widespread adoption of next-generation sequencing technologies, the speed and convenience of genome sequencing have significantly improved, and many biological genomes have been sequenced. However, during the assembly of small genomes, we still face a series of challenges, including repetitive fragments, inverted repeats, low sequencing coverage, and the limitations of sequencing technologies. These challenges lead to unknown gaps in small genomes, hindering complete genome assembly. Although there are many existing assembly software options, they do not fully utilize the potential of artificial intelligence technologies, resulting in limited improvement in gap filling. Here, we propose a novel method, DLGapCloser, based on deep learning, aimed at assisting traditional tools in further filling gaps in small genomes. Firstly, we created four datasets based on the original genomes of Saccharomyces cerevisiae, Schizosaccharomyces pombe, Neurospora crassa, and Micromonas pusilla. To further extract effective information from the gene sequences, we also added homologous genomes to enrich the datasets. Secondly, we proposed the DGCNet model, which effectively extracts features and learns context from sequences flanking gaps. Addressing issues with early pruning and high memory usage in the Beam Search algorithm, we developed a new prediction algorithm, Wave-Beam Search. This algorithm alternates between expansion and contraction phases, enhancing efficiency and accuracy. Experimental results showed that the Wave-Beam Search algorithm improved the gap-filling performance of assembly tools by 7.35%, 28.57%, 42.85%, and 8.33% on the original results. Finally, we established new gap-filling standards and created and implemented a novel evaluation method. Validation on the genomes of Saccharomyces cerevisiae, Schizosaccharomyces pombe, Neurospora crassa, and Micromonas pusilla showed that DLGapCloser increased the number of filled gaps by 8.05%, 15.3%, 1.4%, and 7% compared to traditional assembly tools.

Collapse

Ponsero AJ, Miller M, Hurwitz BL. Comparison of k-mer-based de novo comparative metagenomic tools and approaches. MICROBIOME RESEARCH REPORTS 2023;2:27. [PMID: 38058765 PMCID: PMC10696585 DOI: 10.20517/mrr.2023.26] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 04/04/2023] [Revised: 06/28/2023] [Accepted: 07/12/2023] [Indexed: 12/08/2023]

Abstract

Aim: Comparative metagenomic analysis requires measuring a pairwise similarity between metagenomes in the dataset. Reference-based methods that compute a beta-diversity distance between two metagenomes are highly dependent on the quality and completeness of the reference database, and their application on less studied microbiota can be challenging. On the other hand, de-novo comparative metagenomic methods only rely on the sequence composition of metagenomes to compare datasets. While each one of these approaches has its strengths and limitations, their comparison is currently limited. Methods: We developed sets of simulated short-reads metagenomes to (1) compare k-mer-based and taxonomy-based distances and evaluate the impact of technical and biological variables on these metrics and (2) evaluate the effect of k-mer sketching and filtering. We used a real-world metagenomic dataset to provide an overview of the currently available tools for de novo metagenomic comparative analysis. Results: Using simulated metagenomes of known composition and controlled error rate, we showed that k-mer-based distance metrics were well correlated to the taxonomic distance metric for quantitative Beta-diversity metrics, but the correlation was low for presence/absence distances. The community complexity in terms of taxa richness and the sequencing depth significantly affected the quality of the k-mer-based distances, while the impact of low amounts of sequence contamination and sequencing error was limited. Finally, we benchmarked currently available de-novo comparative metagenomic tools and compared their output on two datasets of fecal metagenomes and showed that most k-mer-based tools were able to recapitulate the data structure observed using taxonomic approaches. Conclusion: This study expands our understanding of the strength and limitations of k-mer-based de novo comparative metagenomic approaches and aims to provide concrete guidelines for researchers interested in applying these approaches to their metagenomic datasets.

Collapse

Wong J, Coombe L, Nikolić V, Zhang E, Nip KM, Sidhu P, Warren RL, Birol I. Linear time complexity de novo long read genome assembly with GoldRush. Nat Commun 2023;14:2906. [PMID: 37217507 PMCID: PMC10202940 DOI: 10.1038/s41467-023-38716-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2022] [Accepted: 05/11/2023] [Indexed: 05/24/2023] Open

Kazemi P, Wong J, Nikolić V, Mohamadi H, Warren RL, Birol I. ntHash2: recursive spaced seed hashing for nucleotide sequences. Bioinformatics 2022;38:4812-4813. [PMID: 36000872 PMCID: PMC9563681 DOI: 10.1093/bioinformatics/btac564] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2022] [Revised: 07/21/2022] [Indexed: 11/29/2022] Open

Chen E, Chu J, Zhang J, Warren RL, Birol I. GapPredict - A Language Model for Resolving Gaps in Draft Genome Assemblies. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021;18:2802-2808. [PMID: 34478378 PMCID: PMC8772386 DOI: 10.1109/tcbb.2021.3109557] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]