Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Salmela L, Walve R, Rivals E, Ukkonen E. Accurate self-correction of errors in long reads using de Bruijn graphs. Bioinformatics 2017;33:799-806. [PMID: 27273673 PMCID: PMC5351550 DOI: 10.1093/bioinformatics/btw321] [Citation(s) in RCA: 53] [Impact Index Per Article: 6.6] [Received: 03/19/2016] [Revised: 05/03/2016] [Accepted: 05/16/2016] [Indexed: 12/04/2022]



Cited by Other Article(s)

Pandey P, Bender MA, Johnson R, Patro R. deBGR: an efficient and near-exact representation of the weighted de Bruijn graph. Bioinformatics 2017;33:i133-i141. [PMID: 28881995 PMCID: PMC5870571 DOI: 10.1093/bioinformatics/btx261] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 12/11/2022]

Bao E, Lan L. HALC: High throughput algorithm for long read error correction. BMC Bioinformatics 2017;18:204. [PMID: 28381259 PMCID: PMC5382505 DOI: 10.1186/s12859-017-1610-3] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 10/27/2016] [Accepted: 03/24/2017] [Indexed: 11/24/2022]

Abstract

BACKGROUND

Third-generation PacBio SMRT long reads effectively address the read length limitation of second-generation sequencing technology, but they contain approximately 15% sequencing errors. Several error correction algorithms have been designed to efficiently reduce the error rate to 1%, but they discard large amounts of uncorrected bases and thus have low throughput. This loss of bases can limit the completeness of downstream assemblies and the accuracy of downstream analysis.

RESULTS

Here, we introduce HALC, a high throughput algorithm for long read error correction. HALC aligns the long reads to short read contigs from the same species with a relatively low identity requirement, so that each long read region can be aligned to at least one contig region; the candidates include repeats of its true genome region that are sufficiently similar to it (similar repeat based alignment approach). It then constructs a contig graph and, for each long read, uses the alignments of the other long reads to find the most accurate alignment and corrects the read with the aligned contig regions (long read support based validation approach). Even though some long read regions whose true genome regions are absent from the contigs are corrected with their repeats, this approach makes it possible to further refine these regions with the initial insufficient short reads and to correct the uncorrected regions in between. In our performance tests on E. coli, A. thaliana and Maylandia zebra data sets, HALC obtained 6.7-41.1% higher throughput than the existing algorithms while maintaining comparable accuracy. The HALC corrected long reads can thus result in 11.4-60.7% longer assembled contigs than those of the existing algorithms.
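
The idea behind the long read support based validation approach can be illustrated with a toy sketch. This is not HALC's implementation: the data layout (each long read mapped to candidate paths of contig IDs), the function names `edge_support` and `pick_path`, and the scoring rule (sum of per-edge support from other reads) are all assumptions made for illustration only.

```python
from collections import Counter


def edge_support(candidates):
    """Count, per contig-graph edge, how many long reads have a candidate path using it.

    `candidates` maps a read ID to a list of candidate alignment paths,
    each path being a tuple of contig IDs (hypothetical layout).
    """
    support = Counter()
    for paths in candidates.values():
        edges = set()
        for path in paths:
            edges.update(zip(path, path[1:]))  # consecutive contig pairs
        support.update(edges)  # each read counts once per edge
    return support


def pick_path(read_id, candidates, support):
    """Pick the candidate path whose edges are best supported by the other reads."""
    def score(path):
        # Subtract 1 per edge so the read does not count as support for itself.
        return sum(support[edge] - 1 for edge in zip(path, path[1:]))
    return max(candidates[read_id], key=score)


# Toy input: read r1 has two candidate paths; r2 and r3 each support one
# edge of the first path and none of the second.
candidates = {
    "r1": [("c1", "c2", "c3"), ("c1", "c5", "c3")],
    "r2": [("c1", "c2")],
    "r3": [("c2", "c3")],
}
support = edge_support(candidates)
best = pick_path("r1", candidates, support)
print(best)
```

In this toy input, the path through `c2` wins for `r1` because both of its edges are also traversed by other reads, whereas the path through `c5` (for example, an alignment to a similar repeat) has no outside support.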

CONCLUSIONS

The HALC software can be downloaded for free from https://github.com/lanl001/halc.


Hrebien S, O’Leary B, Beaney M, Schiavon G, Fribbens C, Bhambra A, Johnson R, Garcia-Murillas I, Turner N. Reproducibility of Digital PCR Assays for Circulating Tumor DNA Analysis in Advanced Breast Cancer. PLoS One 2016;11:e0165023. [PMID: 27760227 PMCID: PMC5070760 DOI: 10.1371/journal.pone.0165023] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 05/13/2016] [Accepted: 10/05/2016] [Indexed: 02/05/2023]