Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Salmela L. Correction of sequencing errors in a mixed set of reads. Bioinformatics 2010;26:1284-90. [PMID: 20378555 DOI: 10.1093/bioinformatics/btq151] [Citation(s) in RCA: 80] [Impact Index Per Article: 5.3] [Indexed: 11/12/2022]

Cited by Other Article(s)

Bae H, Min S, Choi HS, Yoon S. DNA Privacy: Analyzing Malicious DNA Sequences Using Deep Neural Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022;19:888-898. [PMID: 32809941 DOI: 10.1109/tcbb.2020.3017191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]

Tahir M, Sardaraz M, Mehmood Z, Khan MS. ESREEM: Efficient Short Reads Error Estimation Computational Model for Next-generation Genome Sequencing. Curr Bioinform 2021. [DOI: 10.2174/1574893615999200614171832] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022]

Abstract Aims: To assess the error profile in NGS data generated from high-throughput sequencing machines. Background: Short-read sequencing data from Next Generation Sequencing (NGS) are currently being generated by a number of research projects. Depicting the errors produced by NGS platforms and expressing accurate genetic variation from reads are two inter-dependent phases. It has high significance in various analyses, such as genome sequence assembly, SNP calling, evolutionary studies, and haplotype inference. The systematic and random errors show an incidence profile for each of the sequencing platforms, i.e., Illumina sequencing, Pacific Biosciences, 454 pyrosequencing, Complete Genomics DNA nanoball sequencing, Ion Torrent sequencing, and Oxford Nanopore sequencing. Advances in NGS deliver galactic data with the addition of errors. Some ratio of these errors may emulate genuine biological signals, i.e., mutations, and may subsequently negate the results. Various independent applications have been proposed to correct the sequencing errors. Systematic analysis of these algorithms shows that state-of-the-art models are missing. Objective: In this paper, an efficient error estimation computational model called ESREEM is proposed to assess the error rates in NGS data. Methods: The proposed model rests on the premise that there exists a true linear regression association between the number of reads containing errors and the number of reads sequenced. The model is based on a probabilistic error model integrated with the Hidden Markov Model (HMM). Result: The proposed model is evaluated on several benchmark datasets and the results obtained are compared with state-of-the-art algorithms. Conclusions: Analysis of the experimental results shows that the proposed model efficiently estimates errors and runs in less time than the others.

Zhao L, Xie J, Bai L, Chen W, Wang M, Zhang Z, Wang Y, Zhao Z, Li J. Mining statistically-solid k-mers for accurate NGS error correction. BMC Genomics 2018;19:912. [PMID: 30598110 PMCID: PMC6311904 DOI: 10.1186/s12864-018-5272-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 11/10/2022]

Huang YT, Huang YW. An efficient error correction algorithm using FM-index. BMC Bioinformatics 2017;18:524. [PMID: 29179672 PMCID: PMC5704532 DOI: 10.1186/s12859-017-1940-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/09/2017] [Accepted: 11/14/2017] [Indexed: 11/10/2022]

Savel D, LaFramboise T, Grama A, Koyuturk M. Pluribus-Exploring the Limits of Error Correction Using a Suffix Tree. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2017;14:1378-1388. [PMID: 27362987 PMCID: PMC5754272 DOI: 10.1109/tcbb.2016.2586060] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/06/2023]

Lee B, Moon T, Yoon S, Weissman T. DUDE-Seq: Fast, flexible, and robust denoising for targeted amplicon sequencing. PLoS One 2017;12:e0181463. [PMID: 28749987 PMCID: PMC5531809 DOI: 10.1371/journal.pone.0181463] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 03/20/2017] [Accepted: 06/30/2017] [Indexed: 11/29/2022]

Ahola V, Wahlberg N, Frilander MJ. Butterfly Genomics: Insights from the Genome of Melitaea cinxia. Ann Zool Fenn 2017. [DOI: 10.5735/086.054.0123] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/19/2022]

Garrido-Cardenas JA, Garcia-Maroto F, Alvarez-Bermejo JA, Manzano-Agugliaro F. DNA Sequencing Sensors: An Overview. Sensors 2017;17:s17030588. [PMID: 28335417 PMCID: PMC5375874 DOI: 10.3390/s17030588] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 01/28/2017] [Revised: 03/09/2017] [Accepted: 03/11/2017] [Indexed: 12/23/2022]

Zhao L, Chen Q, Li W, Jiang P, Wong L, Li J. MapReduce for accurate error correction of next-generation sequencing data. Bioinformatics 2017;33:3844-3851. [PMID: 28205674 DOI: 10.1093/bioinformatics/btx089] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 09/01/2016] [Accepted: 02/14/2017] [Indexed: 11/14/2022]

From next-generation resequencing reads to a high-quality variant data set. Heredity (Edinb) 2016;118:111-124. [PMID: 27759079 DOI: 10.1038/hdy.2016.102] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Received: 04/24/2016] [Revised: 09/03/2016] [Accepted: 09/06/2016] [Indexed: 12/11/2022]

Akogwu I, Wang N, Zhang C, Gong P. A comparative study of k-spectrum-based error correction methods for next-generation sequencing data analysis. Hum Genomics 2016;10 Suppl 2:20. [PMID: 27461106 PMCID: PMC4965716 DOI: 10.1186/s40246-016-0068-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 01/17/2023]

Abstract

BACKGROUND

Innumerable opportunities for new genomic research have been stimulated by advances in high-throughput next-generation sequencing (NGS). However, a pitfall of abundant NGS data is the difficulty of distinguishing true biological variants from sequencing errors during downstream analysis. Many error correction methods have been developed to correct erroneous NGS reads before further analysis, but independent evaluation of the impact of such dataset features as read length, genome size, and coverage depth on their performance is lacking. This comparative study aims to investigate the strengths, weaknesses, and limitations of some of the newest k-spectrum-based methods and to provide recommendations for users in selecting suitable methods with respect to specific NGS datasets.

METHODS

Six k-spectrum-based methods, i.e., Reptile, Musket, Bless, Bloocoo, Lighter, and Trowel, were compared using six simulated sets of paired-end Illumina sequencing data. These NGS datasets varied in coverage depth (10× to 120×), read length (36 to 100 bp), and genome size (4.6 to 143 MB). Error Correction Evaluation Toolkit (ECET) was employed to derive a suite of metrics (i.e., true positives, false positive, false negative, recall, precision, gain, and F-score) for assessing the correction quality of each method.
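The metrics named above can be illustrated with a minimal sketch. It assumes the definitions commonly used in the read error correction literature (recall = TP/(TP+FN), precision = TP/(TP+FP), gain = (TP-FP)/(TP+FN), F-score as the harmonic mean of precision and recall); it is not ECET's own code, and the function name and counts are hypothetical.

```python
def correction_metrics(tp: int, fp: int, fn: int) -> dict:
    """Summarise error-correction quality from base-level counts.

    tp: erroneous bases that were corrected properly
    fp: correct bases that were wrongly changed
    fn: erroneous bases that were left uncorrected
    (These interpretations are the usual ones, stated here as an assumption.)
    """
    if tp + fn == 0 or tp + fp == 0:
        raise ValueError("need at least one true error and one attempted correction")
    recall = tp / (tp + fn)        # share of true errors that were fixed
    precision = tp / (tp + fp)     # share of changes that were right
    gain = (tp - fp) / (tp + fn)   # net errors removed, relative to all errors
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom > 0 else 0.0
    return {"recall": recall, "precision": precision,
            "gain": gain, "f_score": f_score}


# Hypothetical counts, not taken from the study.
metrics = correction_metrics(tp=80, fp=10, fn=20)
print(metrics)
```

Gain can be negative when a tool introduces more errors than it removes, which is why it is often reported alongside the F-score.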

RESULTS

Results from computational experiments indicate that Musket had the best overall performance across the spectra of examined variants reflected in the six datasets. The lowest accuracy of Musket (F-score = 0.81) occurred on a dataset with a medium read length (56 bp), a medium coverage (50×), and a small-sized genome (5.4 MB). The other five methods underperformed (F-score < 0.80) and/or failed to process one or more datasets.

CONCLUSIONS

This study demonstrates that various factors such as coverage depth, read length, and genome size may influence the performance of individual k-spectrum-based error correction methods. Thus, care must be taken in choosing appropriate methods for error correction of specific NGS datasets. Based on our comparative study, we recommend Musket as the top choice because of its consistently superior performance across all six testing datasets. Further extensive studies are warranted to assess these methods using experimental datasets generated by NGS platforms (e.g., 454, SOLiD, and Ion Torrent) under more diversified parameter settings (k-mer values and edit distances) and to compare them against other non-k-spectrum-based classes of error correction methods.

Zhu X, Wang J, Peng B, Shete S. Empirical estimation of sequencing error rates using smoothing splines. BMC Bioinformatics 2016;17:177. [PMID: 27102907 PMCID: PMC4840868 DOI: 10.1186/s12859-016-1052-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/18/2015] [Accepted: 04/14/2016] [Indexed: 01/24/2023]

Abstract

Background

Next-generation sequencing has been used by investigators to address a diverse range of biological problems through, for example, polymorphism and mutation discovery and microRNA profiling. However, compared to conventional sequencing, the error rates for next-generation sequencing are often higher, which impacts the downstream genomic analysis. Recently, Wang et al. (BMC Bioinformatics 13:185, 2012) proposed a shadow regression approach to estimate the error rates for next-generation sequencing data based on the assumption of a linear relationship between the number of reads sequenced and the number of reads containing errors (denoted as shadows). However, this linear read-shadow relationship may not be appropriate for all types of sequence data. Therefore, it is necessary to estimate the error rates in a more reliable way without assuming linearity. We proposed an empirical error rate estimation approach that employs cubic and robust smoothing splines to model the relationship between the number of reads sequenced and the number of shadows.
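The linear read-shadow assumption described above can be sketched as an ordinary least-squares fit of shadow counts against read counts. This is only an illustration of the linearity assumption: the counts are made up, the helper name is hypothetical, and the sketch implements neither the regression procedure of Wang et al. nor the spline models proposed here.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: reads observed per sequence and their shadow counts.
read_counts = [100, 200, 400, 800]
shadow_counts = [2, 4, 8, 16]

slope, intercept = fit_line(read_counts, shadow_counts)
print(f"shadows per read (slope): {slope:.4f}, intercept: {intercept:.4f}")
```

When the points do not fall near a straight line, a single slope summarises them poorly, which is the situation the spline-based approach is meant to handle.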

Results

We performed simulation studies using a frequency-based approach to generate the read and shadow counts directly, which can mimic the real sequence counts data structure. Using simulation, we investigated the performance of the proposed approach and compared it to that of shadow linear regression. The proposed approach provided more accurate error rate estimations than the shadow linear regression approach for all the scenarios tested. We also applied the proposed approach to assess the error rates for the sequence data from the MicroArray Quality Control project, a mutation screening study, the Encyclopedia of DNA Elements project, and bacteriophage PhiX DNA samples.

Conclusions

The proposed empirical error rate estimation approach does not assume a linear relationship between the error-free read and shadow counts and provides more accurate estimations of error rates for next-generation, short-read sequencing data.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-016-1052-3) contains supplementary material, which is available to authorized users.

Feng S, Lo CC, Li PE, Chain PSG. ADEPT, a dynamic next generation sequencing data error-detection program with trimming. BMC Bioinformatics 2016;17:109. [PMID: 26928302 PMCID: PMC4772517 DOI: 10.1186/s12859-016-0967-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/10/2015] [Accepted: 02/22/2016] [Indexed: 01/16/2023]

Alic AS, Tomas A, Medina I, Blanquer I. MuffinEc: Error correction for de Novo assembly via greedy partitioning and sequence alignment. Inf Sci (N Y) 2016. [DOI: 10.1016/j.ins.2015.09.012] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/25/2022]

Alic AS, Ruzafa D, Dopazo J, Blanquer I. Objective review of de novo stand-alone error correction methods for NGS data. Wiley Interdisciplinary Reviews: Computational Molecular Science 2016. [DOI: 10.1002/wcms.1239] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Indexed: 12/14/2022]

Pathogen Discovery. Mol Microbiol 2016. [DOI: 10.1128/9781555819071.ch7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]

Laehnemann D, Borkhardt A, McHardy AC. Denoising DNA deep sequencing data-high-throughput sequencing errors and their correction. Brief Bioinform 2016;17:154-79. [PMID: 26026159 PMCID: PMC4719071 DOI: 10.1093/bib/bbv029] [Citation(s) in RCA: 190] [Impact Index Per Article: 21.1] [Received: 03/03/2015] [Revised: 04/09/2015] [Indexed: 12/23/2022]

Pal S, Aluru S. In search of perfect reads. BMC Bioinformatics 2015;16 Suppl 17:S7. [PMID: 26679555 PMCID: PMC4674851 DOI: 10.1186/1471-2105-16-s17-s7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 01/06/2023]

Abstract

BACKGROUND

Continued advances in next generation short-read sequencing technologies are increasing throughput and read lengths, while driving down error rates. Taking advantage of the high coverage sampling used in many applications, several error correction algorithms have been developed to improve data quality further. However, correcting errors in high coverage sequence data requires significant computing resources.

METHODS

We propose a different approach to handle erroneous sequence data. Presently, error rates of high-throughput platforms such as the Illumina HiSeq are within 1%. Moreover, the errors are not uniformly distributed in all reads, and a large percentage of reads are indeed error-free. Ability to predict such perfect reads can significantly impact the run-time complexity of applications. We present a simple and fast k-spectrum analysis based method to identify error-free reads. The filtration process to identify and weed out erroneous reads can be customized at several levels of stringency depending upon the downstream application need.
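A generic k-spectrum filter of the kind described above can be sketched as follows: count every k-mer across the dataset, then keep a read only if all of its k-mers are seen often enough. This is a simplified illustration, not the authors' algorithm; the function names, `k`, and `min_count` are assumptions chosen for the toy data.

```python
from collections import Counter


def kmers(read: str, k: int) -> list:
    """Return all overlapping substrings of length k in a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def predict_perfect_reads(reads, k=4, min_count=2):
    """Keep reads whose k-mers all occur at least min_count times overall.

    Rare k-mers are treated as likely sequencing errors, so a read
    containing any of them is filtered out. Raising min_count makes
    the filter more stringent.
    """
    counts = Counter(km for read in reads for km in kmers(read, k))
    return [
        read for read in reads
        if len(read) >= k
        and all(counts[km] >= min_count for km in kmers(read, k))
    ]


# Toy dataset: three identical reads and one read with a substitution.
reads = ["ACGTACGT", "ACGTACGT", "ACGTACGT", "ACGTTCGT"]
kept = predict_perfect_reads(reads, k=4, min_count=2)
print(kept)
```

In this toy input the substituted read contributes k-mers that occur only once, so it is the only read removed.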

RESULTS

Our experiments show that if around 80% of the reads in a dataset are perfect, then our method retains almost 99.9% of them with a precision of more than 90%. Though filtering out reads identified as erroneous by our method reduces the average coverage by about 7%, we found the remaining reads provide as uniform a coverage as the original dataset. We demonstrate the effectiveness of our approach on an example downstream application: we show that when an error correction algorithm, Reptile, which relies on collectively analyzing the reads in a dataset to identify and correct erroneous bases, instead uses reads predicted to be perfect by our method to correct the other reads, the overall accuracy improves further by up to 10%.

CONCLUSIONS

Thanks to continuous technological improvements, the coverage and accuracy of reads from dominant sequencing platforms have now reached an extent where we can envision just filtering out reads with errors, thus making error correction less important. Our algorithm is a first attempt to propose and demonstrate this new paradigm. Moreover, our demonstration is applicable to any error correction algorithm as a downstream application, which in turn gives a new class of error correcting algorithms as a by-product.

Saha S, Rajasekaran S. EC: an efficient error correction algorithm for short reads. BMC Bioinformatics 2015;16 Suppl 17:S2. [PMID: 26678663 PMCID: PMC4674864 DOI: 10.1186/1471-2105-16-s17-s2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 08/30/2023]

Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads. Gigascience 2015;4:48. [PMID: 26500767 PMCID: PMC4615873 DOI: 10.1186/s13742-015-0089-y] [Citation(s) in RCA: 329] [Impact Index Per Article: 32.9] [Received: 06/01/2015] [Accepted: 10/09/2015] [Indexed: 11/10/2022]

Kowalski T, Grabowski S, Deorowicz S. Indexing Arbitrary-Length k-Mers in Sequencing Reads. PLoS One 2015;10:e0133198. [PMID: 26182400 PMCID: PMC4504488 DOI: 10.1371/journal.pone.0133198] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 02/13/2015] [Accepted: 06/24/2015] [Indexed: 11/25/2022]

Allam A, Kalnis P, Solovyev V. Karect: accurate correction of substitution, insertion and deletion errors for next-generation sequencing data. Bioinformatics 2015;31:3421-8. [DOI: 10.1093/bioinformatics/btv415] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.9] [Received: 11/11/2014] [Accepted: 07/08/2015] [Indexed: 11/12/2022]

Sheikhizadeh S, de Ridder D. ACE: accurate correction of errors using K-mer tries. Bioinformatics 2015;31:3216-8. [DOI: 10.1093/bioinformatics/btv332] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 11/23/2014] [Accepted: 05/22/2015] [Indexed: 11/13/2022]

Salehi F, Baronio R, Idrogo-Lam R, Vu H, Hall LV, Kaiser P, Lathrop RH. CHOPER filters enable rare mutation detection in complex mutagenesis populations by next-generation sequencing. PLoS One 2015;10:e0116877. [PMID: 25692681 PMCID: PMC4333345 DOI: 10.1371/journal.pone.0116877] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/04/2014] [Accepted: 12/08/2014] [Indexed: 01/12/2023]

Schulz MH, Weese D, Holtgrewe M, Dimitrova V, Niu S, Reinert K, Richard H. Fiona: a parallel and automatic strategy for read error correction. Bioinformatics 2015;30:i356-63. [PMID: 25161220 PMCID: PMC4147893 DOI: 10.1093/bioinformatics/btu440] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Indexed: 11/18/2022]

Affiliation(s)

Marcel H Schulz, David Weese, Manuel Holtgrewe, Viktoria Dimitrova, Sijia Niu, Knut Reinert, Hugues Richard

'Multimodal Computing and Interaction', Saarland University & Department for Computational Biology and Applied Computing, Max Planck Institute for Informatics, Saarbrücken, 66123 Saarland, Germany; Ray and Stephanie Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh, 15206 PA, USA; Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany; Université Pierre et Marie Curie, UMR7238, CNRS-UPMC, Paris, France; and CNRS, UMR7238, Laboratory of Computational and Quantitative Biology, Paris, France

Rare biosphere exploration using high-throughput sequencing: research progress and perspectives. Conserv Genet 2014. [DOI: 10.1007/s10592-014-0678-9] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Indexed: 11/27/2022]

Ahola V, Lehtonen R, Somervuo P, Salmela L, Koskinen P, Rastas P, Välimäki N, Paulin L, Kvist J, Wahlberg N, Tanskanen J, Hornett EA, Ferguson LC, Luo S, Cao Z, de Jong MA, Duplouy A, Smolander OP, Vogel H, McCoy RC, Qian K, Chong WS, Zhang Q, Ahmad F, Haukka JK, Joshi A, Salojärvi J, Wheat CW, Grosse-Wilde E, Hughes D, Katainen R, Pitkänen E, Ylinen J, Waterhouse RM, Turunen M, Vähärautio A, Ojanen SP, Schulman AH, Taipale M, Lawson D, Ukkonen E, Mäkinen V, Goldsmith MR, Holm L, Auvinen P, Frilander MJ, Hanski I. The Glanville fritillary genome retains an ancient karyotype and reveals selective chromosomal fusions in Lepidoptera. Nat Commun 2014;5:4737. [PMID: 25189940 PMCID: PMC4164777 DOI: 10.1038/ncomms5737] [Citation(s) in RCA: 158] [Impact Index Per Article: 14.4] [Received: 04/11/2014] [Accepted: 07/17/2014] [Indexed: 12/30/2022]

Affiliation(s)

Virpi Ahola [1] Department of Biosciences, University of Helsinki, FI-00014 Helsinki, Finland [2]
Rainer Lehtonen [1] Department of Biosciences, University of Helsinki, FI-00014 Helsinki, Finland [2] Genome-Scale Biology Research Program, University of Helsinki, FI-00014 Helsinki, Finland [3] Institute of Biomedicine, University of Helsinki, FI-00014 Helsinki, Finland [4] Center of Excellence in Cancer Genetics, University of Helsinki, FI-00014 Helsinki, Finland [5] [6]
Panu Somervuo [1] Department of Biosciences, University of Helsinki, FI-00014 Helsinki, Finland [2] Institute of Biotechnology, University of Helsinki, FI-00014 Helsinki, Finland [3]
Leena Salmela Department of Computer Science & Helsinki Institute for Information Technology HIIT, University of Helsinki, FI-00014 Helsinki, Finland
Patrik Koskinen [1] Department of Biosciences, University of Helsinki, FI-00014 Helsinki, Finland [2] Institute of Biotechnology, University of Helsinki, FI-00014 Helsinki, Finland
Pasi Rastas Department of Biosciences, University of Helsinki, FI-00014 Helsinki, Finland
Niko Välimäki [1] Genome-Scale Biology Research Program, University of Helsinki, FI-00014 Helsinki, Finland [2] Institute of Biomedicine, University of Helsinki, FI-00014 Helsinki, Finland
Lars Paulin Institute of Biotechnology, University of Helsinki, FI-00014 Helsinki, Finland
Jouni Kvist Institute of Biotechnology, University of Helsinki, FI-00014 Helsinki, Finland
Niklas Wahlberg Department of Biology, University of Turku, FI-20014 Turku, Finland
Jaakko Tanskanen [1] Institute of Biotechnology, University of Helsinki, FI-00014 Helsinki, Finland [2] Biotechnology and Food Research, MTT Agrifood Research Finland, FI-31600 Jokioinen, Finland
Emily A Hornett [1] Department of Zoology, University of Cambridge, Cambridge CB2 3EJ, UK [2] Department of Biology, Pennsylvania State University, Pennsylvania 16802, USA
Laura C Ferguson Department of Zoology, University of Oxford, Oxford OX1 3PS, UK
Shiqi Luo College of Life Sciences, Peking University, Beijing 100871, P.R. China
Zijuan Cao College of Life Sciences, Peking University, Beijing 100871, P.R. China
Maaike A de Jong [1] Department of Biosciences, University of Helsinki, FI-00014 Helsinki, Finland [2] School of Biological Sciences, University of Bristol, Bristol BS8 1UG, UK
Anne Duplouy Department of Biosciences, University of Helsinki, FI-00014 Helsinki, Finland
Olli-Pekka Smolander Institute of Biotechnology, University of Helsinki, FI-00014 Helsinki, Finland
Heiko Vogel Department of Entomology, Max Planck Institute for Chemical Ecology, D-07745 Jena, Germany
Rajiv C McCoy Department of Biology, Stanford University, Stanford, California 94305, USA
Kui Qian Institute of Biotechnology, University of Helsinki, FI-00014 Helsinki, Finland
Wong Swee Chong Department of Biosciences, University of Helsinki, FI-00014 Helsinki, Finland
Qin Zhang BioMediTech, University of Tampere, FI-33520 Tampere, Finland
Freed Ahmad Department of Information Technology, University of Turku, FI-20014 Turku, Finland
Jani K Haukka BioMediTech, University of Tampere, FI-33520 Tampere, Finland
Aruj Joshi BioMediTech, University of Tampere, FI-33520 Tampere, Finland
Jarkko Salojärvi Department of Biosciences, University of Helsinki, FI-00014 Helsinki, Finland
Christopher W Wheat Department of Zoology, Stockholm University, SE-10691 Stockholm, Sweden
Ewald Grosse-Wilde Department of Evolutionary Neuroethology, Max Planck Institute for Chemical Ecology, D-07745 Jena, Germany
Daniel Hughes [1] European Bioinformatics Institute, Hinxton CB10 1SD, UK [2] Baylor College of Medicine, Human Genome Sequencing Center, Houston, Texas 77030-3411, USA
Riku Katainen [1] Genome-Scale Biology Research Program, University of Helsinki, FI-00014 Helsinki, Finland [2] Institute of Biomedicine, University of Helsinki, FI-00014 Helsinki, Finland
Esa Pitkänen [1] Genome-Scale Biology Research Program, University of Helsinki, FI-00014 Helsinki, Finland [2] Institute of Biomedicine, University of Helsinki, FI-00014 Helsinki, Finland
Johannes Ylinen Department of Computer Science & Helsinki Institute for Information Technology HIIT, University of Helsinki, FI-00014 Helsinki, Finland
Robert M Waterhouse [1] Department of Genetic Medicine and Development, University of Geneva Medical School & Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland [2] Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA [3] The Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
Mikko Turunen Genome-Scale Biology Research Program, University of Helsinki, FI-00014 Helsinki, Finland
Anna Vähärautio [1] Genome-Scale Biology Research Program, University of Helsinki, FI-00014 Helsinki, Finland [2] Department of Pathology, University of Helsinki, FI-00014 Helsinki, Finland [3] Science for Life Laboratory, Department of Biosciences and Nutrition, Karolinska Institutet, SE-14183 Stockholm, Sweden
Sami P Ojanen Department of Biosciences, University of Helsinki, FI-00014 Helsinki, Finland
Alan H Schulman [1] Institute of Biotechnology, University of Helsinki, FI-00014 Helsinki, Finland [2] Biotechnology and Food Research, MTT Agrifood Research Finland, FI-31600 Jokioinen, Finland
Minna Taipale [1] Genome-Scale Biology Research Program, University of Helsinki, FI-00014 Helsinki, Finland [2] Science for Life Laboratory, Department of Biosciences and Nutrition, Karolinska Institutet, SE-14183 Stockholm, Sweden
Daniel Lawson European Bioinformatics Institute, Hinxton CB10 1SD, UK
Esko Ukkonen Department of Computer Science & Helsinki Institute for Information Technology HIIT, University of Helsinki, FI-00014 Helsinki, Finland
Veli Mäkinen Department of Computer Science & Helsinki Institute for Information Technology HIIT, University of Helsinki, FI-00014 Helsinki, Finland
Marian R Goldsmith Department of Biological Sciences, University of Rhode Island, Kingston, Rhode Island 02881-0816, USA
Liisa Holm [1] Department of Biosciences, University of Helsinki, FI-00014 Helsinki, Finland [2] Institute of Biotechnology, University of Helsinki, FI-00014 Helsinki, Finland [3]
Petri Auvinen [1] Institute of Biotechnology, University of Helsinki, FI-00014 Helsinki, Finland [2]
Mikko J Frilander [1] Institute of Biotechnology, University of Helsinki, FI-00014 Helsinki, Finland [2]
Ilkka Hanski Department of Biosciences, University of Helsinki, FI-00014 Helsinki, Finland

Molnar M, Ilie L. Correcting Illumina data. Brief Bioinform 2014;16:588-99. [PMID: 25183248 DOI: 10.1093/bib/bbu029] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2014] [Accepted: 08/02/2014] [Indexed: 11/12/2022] Open

Salmela L, Rivals E. LoRDEC: accurate and efficient long read error correction. Bioinformatics 2014;30:3506-14. [PMID: 25165095 PMCID: PMC4253826 DOI: 10.1093/bioinformatics/btu538] [Citation(s) in RCA: 476] [Impact Index Per Article: 43.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]

Lim EC, Müller J, Hagmann J, Henz SR, Kim ST, Weigel D. Trowel: a fast and accurate error correction module for Illumina sequencing reads. Bioinformatics 2014;30:3264-5. [PMID: 25075116 DOI: 10.1093/bioinformatics/btu513] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Greenfield P, Duesing K, Papanicolaou A, Bauer DC. Blue: correcting sequencing errors using consensus and context. Bioinformatics 2014;30:2723-32. [PMID: 24919879 DOI: 10.1093/bioinformatics/btu368] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

Abstract

MOTIVATION

Bioinformatics tools, such as assemblers and aligners, are expected to produce more accurate results when given better quality sequence data as their starting point. This expectation has led to the development of stand-alone tools whose sole purpose is to detect and remove sequencing errors. A good error-correcting tool would be a transparent component in a bioinformatics pipeline, simply taking sequence data in any of the standard formats and producing a higher quality version of the same data containing far fewer errors. It should not only be able to correct all of the types of errors found in real sequence data (substitutions, insertions, deletions and uncalled bases), but it has to be both fast enough and scalable enough to be usable on the large datasets being produced by current sequencing technologies, and work on data derived from both haploid and diploid organisms.

RESULTS

This article presents Blue, an error-correction algorithm based on k-mer consensus and context. Blue can correct substitution, deletion and insertion errors, as well as uncalled bases. It accepts both FASTQ and FASTA formats, and corrects quality scores for corrected bases. Blue also maintains the pairing of reads, both within a file and between pairs of files, making it compatible with downstream tools that depend on read pairing. Blue is memory efficient, scalable and faster than other published tools, and usable on large sequencing datasets. On the tests undertaken, Blue also proved to be generally more accurate than other published algorithms, resulting in more accurately aligned reads and the assembly of longer contigs containing fewer errors. One significant feature of Blue is that its k-mer consensus table does not have to be derived from the set of reads being corrected. This decoupling makes it possible to correct one dataset, such as a small set of 454 mate-pair reads, with the consensus derived from another dataset, such as Illumina reads derived from the same DNA sample. Such cross-correction can greatly improve the quality of small (and expensive) sets of long reads, leading to even better assemblies and higher quality finished genomes.
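The decoupling described in this abstract, a k-mer table built from one read set and used to correct another, can be illustrated with a minimal Python sketch. This is not Blue's algorithm or code: the function names, the `min_count` threshold and the greedy single-substitution search are all assumptions made for illustration, and the sketch handles only substitutions and uncalled bases (N), not insertions or deletions.

```python
from collections import Counter


def build_kmer_table(reads, k):
    """Count every k-mer in a read set used as the consensus source."""
    table = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            table[read[i:i + k]] += 1
    return table


def correct_read(read, table, k, min_count=2):
    """Greedy, substitution-only correction against a k-mer table.

    Hypothetical helper: a k-mer seen fewer than `min_count` times is
    treated as suspect, and the single-base substitution that yields the
    most frequent known k-mer is applied (first candidate wins on ties).
    """
    bases = list(read)
    for i in range(len(bases) - k + 1):
        kmer = "".join(bases[i:i + k])
        if table[kmer] >= min_count:
            continue  # k-mer is well supported, nothing to do
        best_pos, best_base, best_count = None, None, 0
        for pos in range(k):
            for base in "ACGT":
                if base == kmer[pos]:
                    continue
                candidate = kmer[:pos] + base + kmer[pos + 1:]
                count = table[candidate]
                if count >= min_count and count > best_count:
                    best_pos, best_base, best_count = pos, base, count
        if best_pos is not None:
            bases[i + best_pos] = best_base
    return "".join(bases)


# Cross-correction: the table comes from one dataset (here, three
# identical error-free toy reads), the corrected read from another.
trusted_reads = ["ACGTTGCAAGGCTA"] * 3
table = build_kmer_table(trusted_reads, k=5)
print(correct_read("ACGTTGCTAGGCTA", table, k=5))  # one substitution error
print(correct_read("ACGTTGCNAGGCTA", table, k=5))  # one uncalled base
```

Because the table is an ordinary argument to `correct_read`, nothing ties it to the reads being corrected, which is the property the abstract highlights. A real corrector would also need to weigh quality scores and surrounding context.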

AVAILABILITY AND IMPLEMENTATION

The code for Blue and its related tools is available from http://www.bioinformatics.csiro.au/Blue. These programs are written in C# and run natively under Windows and under Mono on Linux.

Knief C. Analysis of plant microbe interactions in the era of next generation sequencing technologies. FRONTIERS IN PLANT SCIENCE 2014;5:216. [PMID: 24904612 PMCID: PMC4033234 DOI: 10.3389/fpls.2014.00216] [Citation(s) in RCA: 120] [Impact Index Per Article: 10.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/15/2014] [Accepted: 04/30/2014] [Indexed: 05/18/2023]

Wirawan A, Harris RS, Liu Y, Schmidt B, Schröder J. HECTOR: a parallel multistage homopolymer spectrum based error corrector for 454 sequencing data. BMC Bioinformatics 2014;15:131. [PMID: 24885381 PMCID: PMC4023493 DOI: 10.1186/1471-2105-15-131] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2013] [Accepted: 04/24/2014] [Indexed: 01/29/2023] Open

Wang C, Grohme MA, Mali B, Schill RO, Frohme M. Towards decrypting cryptobiosis--analyzing anhydrobiosis in the tardigrade Milnesium tardigradum using transcriptome sequencing. PLoS One 2014;9:e92663. [PMID: 24651535 PMCID: PMC3961413 DOI: 10.1371/journal.pone.0092663] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2013] [Accepted: 02/25/2014] [Indexed: 11/18/2022] Open

Abstract

Background

Many tardigrade species are capable of anhydrobiosis; however, mechanisms underlying their extreme desiccation resistance remain elusive. This study attempts to quantify the anhydrobiotic transcriptome of the limno-terrestrial tardigrade Milnesium tardigradum.

Results

A prerequisite for differential gene expression analysis was the generation of a reference hybrid transcriptome atlas by assembly of Sanger, 454 and Illumina sequence data. The final assembly yielded 79,064 contigs (>100 bp) after removal of ribosomal RNAs. Around 50% of them could be annotated by SwissProt and NCBI non-redundant protein sequences. Analysis using CEGMA predicted 232 (93.5%) out of the 248 highly conserved eukaryotic genes in the assembly. We used this reference transcriptome for mapping and quantifying the expression of transcripts regulated under anhydrobiosis in a time-series during dehydration and rehydration. 834 of the transcripts were found to be differentially expressed in a single stage (dehydration/inactive tun/rehydration) and 184 were overlapping in two stages while 74 were differentially expressed in all three stages. We have found interesting patterns of differentially expressed transcripts that are in concordance with a common hypothesis of metabolic shutdown during anhydrobiosis. This included down-regulation of several proteins of the DNA replication and translational machinery and protein degradation. Among others, heat shock proteins Hsp27 and Hsp30c were up-regulated in response to dehydration and rehydration. In addition, we observed up-regulation of polyubiquitin-B upon rehydration together with a higher expression level of several DNA repair proteins during rehydration than in the dehydration stage.

Conclusions

Most of the transcripts identified to be differentially expressed had distinct cellular function. Our data suggest a concerted molecular adaptation in M. tardigradum that permits extreme forms of ametabolic states such as anhydrobiosis. It is tempting to surmise that the desiccation tolerance of tardigrades can be achieved by a constitutive cellular protection system, probably in conjunction with other mechanisms such as rehydration-induced cellular repair.

Heo Y, Wu XL, Chen D, Ma J, Hwu WM. BLESS: bloom filter-based error correction solution for high-throughput sequencing reads. Bioinformatics 2014;30:1354-62. [PMID: 24451628 DOI: 10.1093/bioinformatics/btu030] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]

McElroy K, Thomas T, Luciani F. Deep sequencing of evolving pathogen populations: applications, errors, and bioinformatic solutions. MICROBIAL INFORMATICS AND EXPERIMENTATION 2014;4:1. [PMID: 24428920 PMCID: PMC3902414 DOI: 10.1186/2042-5783-4-1] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/14/2013] [Accepted: 01/07/2014] [Indexed: 12/15/2022]

From Indexing Data Structures to de Bruijn Graphs. COMBINATORIAL PATTERN MATCHING 2014. [DOI: 10.1007/978-3-319-07566-2_10] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/01/2022]

El-Metwally S, Ouda OM, Helmy M. Approaches and Challenges of Next-Generation Sequence Assembly Stages. NEXT GENERATION SEQUENCING TECHNOLOGIES AND CHALLENGES IN SEQUENCE ASSEMBLY 2014. [DOI: 10.1007/978-1-4939-0715-1_9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/12/2023]

El-Metwally S, Hamza T, Zakaria M, Helmy M. Next-generation sequence assembly: four stages of data processing and computational challenges. PLoS Comput Biol 2013;9:e1003345. [PMID: 24348224 PMCID: PMC3861042 DOI: 10.1371/journal.pcbi.1003345] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/19/2023] Open

Dorn C, Grunert M, Sperling SR. Application of high-throughput sequencing for studying genomic variations in congenital heart disease. Brief Funct Genomics 2013;13:51-65. [PMID: 24095982 DOI: 10.1093/bfgp/elt040] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022] Open

Farrer RA, Henk DA, MacLean D, Studholme DJ, Fisher MC. Using false discovery rates to benchmark SNP-callers in next-generation sequencing projects. Sci Rep 2013;3:1512. [PMID: 23518929 PMCID: PMC3604800 DOI: 10.1038/srep01512] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2012] [Accepted: 02/25/2013] [Indexed: 12/16/2022] Open

Ilie L, Molnar M. RACER: Rapid and accurate correction of errors in reads. Bioinformatics 2013;29:2490-3. [PMID: 23853064 DOI: 10.1093/bioinformatics/btt407] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Eren AM, Morrison HG, Huse SM, Sogin ML. DRISEE overestimates errors in metagenomic sequencing data. Brief Bioinform 2013;15:783-7. [PMID: 23698723 PMCID: PMC4171678 DOI: 10.1093/bib/bbt010] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open

Liu Y, Schröder J, Schmidt B. Musket: a multistage k-mer spectrum-based error corrector for Illumina sequence data. Bioinformatics 2012. [PMID: 23202746 DOI: 10.1093/bioinformatics/bts690] [Citation(s) in RCA: 180] [Impact Index Per Article: 13.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/30/2023]

Carneiro AR, Ramos RTJ, Barbosa HPM, Schneider MPC, Barh D, Azevedo V, Silva A. Quality of prokaryote genome assembly: Indispensable issues of factors affecting prokaryote genome assembly quality. Gene 2012;505:365-7. [DOI: 10.1016/j.gene.2012.06.016] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2012] [Revised: 06/09/2012] [Accepted: 06/11/2012] [Indexed: 12/21/2022]

Wang XV, Blades N, Ding J, Sultana R, Parmigiani G. Estimation of sequencing error rates in short reads. BMC Bioinformatics 2012;13:185. [PMID: 22846331 PMCID: PMC3495688 DOI: 10.1186/1471-2105-13-185] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2011] [Accepted: 07/13/2012] [Indexed: 11/10/2022] Open

Yang X, Chockalingam SP, Aluru S. A survey of error-correction methods for next-generation sequencing. Brief Bioinform 2012;14:56-66. [DOI: 10.1093/bib/bbs015] [Citation(s) in RCA: 177] [Impact Index Per Article: 13.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Burriesci MS, Lehnert EM, Pringle JR. Fulcrum: condensing redundant reads from high-throughput sequencing studies. Bioinformatics 2012;28:1324-7. [PMID: 22419786 DOI: 10.1093/bioinformatics/bts123] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

Pickrell WO, Rees MI, Chung SK. Next Generation Sequencing Methodologies - An Overview. CHALLENGES AND OPPORTUNITIES OF NEXT-GENERATION SEQUENCING FOR BIOMEDICAL RESEARCH 2012;89:1-26. [DOI: 10.1016/b978-0-12-394287-6.00001-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]

Wijaya E, Frith MC, Asai K, Horton P. RecountDB: a database of mapped and count corrected transcribed sequences. Nucleic Acids Res 2011;40:D1089-92. [PMID: 22139942 PMCID: PMC3245132 DOI: 10.1093/nar/gkr1172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/02/2022] Open