Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Whiteford N, Haslam N, Weber G, Prügel-Bennett A, Essex JW, Roach PL, Bradley M, Neylon C. An analysis of the feasibility of short read sequencing. Nucleic Acids Res 2005;33:e171. [PMID: 16275781 PMCID: PMC1278949 DOI: 10.1093/nar/gni170] [Citation(s) in RCA: 93] [Impact Index Per Article: 4.9] [Indexed: 11/13/2022] Open

Cited by Other Article(s)

Ruohan W, Yuwei Z, Mengbo W, Xikang F, Jianping W, Shuai Cheng L. Resolving single-cell copy number profiling for large datasets. Brief Bioinform 2022;23:6633647. [PMID: 35801503 DOI: 10.1093/bib/bbac264] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/16/2022] [Revised: 05/29/2022] [Accepted: 06/06/2022] [Indexed: 11/14/2022] Open

Baratta AM, Brandner AJ, Plasil SL, Rice RC, Farris SP. Advancements in Genomic and Behavioral Neuroscience Analysis for the Study of Normal and Pathological Brain Function. Front Mol Neurosci 2022;15:905328. [PMID: 35813067 PMCID: PMC9259865 DOI: 10.3389/fnmol.2022.905328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2022] [Accepted: 06/06/2022] [Indexed: 11/16/2022] Open

Waters NR, Abram F, Brennan F, Holmes A, Pritchard L. riboSeed: leveraging prokaryotic genomic architecture to assemble across ribosomal regions. Nucleic Acids Res 2019;46:e68. [PMID: 29608703 PMCID: PMC6009695 DOI: 10.1093/nar/gky212] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 11/24/2017] [Accepted: 03/12/2018] [Indexed: 11/12/2022] Open

Goldstein S, Beka L, Graf J, Klassen JL. Evaluation of strategies for the assembly of diverse bacterial genomes using MinION long-read sequencing. BMC Genomics 2019;20:23. [PMID: 30626323 PMCID: PMC6325685 DOI: 10.1186/s12864-018-5381-7] [Citation(s) in RCA: 86] [Impact Index Per Article: 17.2] [Received: 07/12/2018] [Accepted: 12/16/2018] [Indexed: 11/23/2022] Open

Abstract

Background

Short-read sequencing technologies have made microbial genome sequencing cheap and accessible. However, closing genomes is often costly and assembling short reads from genomes that are repetitive and/or have extreme %GC content remains challenging. Long-read, single-molecule sequencing technologies such as the Oxford Nanopore MinION have the potential to overcome these difficulties, although the best approach for harnessing their potential remains poorly evaluated.

Results

We sequenced nine bacterial genomes spanning a wide range of GC contents using Illumina MiSeq and Oxford Nanopore MinION sequencing technologies to determine the advantages of each approach, both individually and combined. Assemblies using only MiSeq reads were highly accurate but lacked contiguity, a deficiency that was partially overcome by adding MinION reads to these assemblies. Even more contiguous genome assemblies were generated by using MinION reads for initial assembly, but these assemblies were more error-prone and required further polishing. This was especially pronounced when Illumina libraries were biased, as was the case for our strains with both high and low GC content. Increased genome contiguity dramatically improved the annotation of insertion sequences and secondary metabolite biosynthetic gene clusters, likely because long-reads can disambiguate these highly repetitive but biologically important genomic regions.

Conclusions

Genome assembly using short-reads is challenged by repetitive sequences and extreme GC contents. Our results indicate that these difficulties can be largely overcome by using single-molecule, long-read sequencing technologies such as the Oxford Nanopore MinION. Using MinION reads for assembly followed by polishing with Illumina reads generated the most contiguous genomes with sufficient accuracy to enable the accurate annotation of important but difficult to sequence genomic features such as insertion sequences and secondary metabolite biosynthetic gene clusters. The combination of Oxford Nanopore and Illumina sequencing can therefore cost-effectively advance studies of microbial evolution and genome-driven drug discovery.

Electronic supplementary material

The online version of this article (10.1186/s12864-018-5381-7) contains supplementary material, which is available to authorized users.

Khan M, Fadaie Z, Cornelis SS, Cremers FPM, Roosing S. Identification and Analysis of Genes Associated with Inherited Retinal Diseases. Methods Mol Biol 2019;1834:3-27. [PMID: 30324433 DOI: 10.1007/978-1-4939-8669-9_1] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 12/14/2022]

Pu D, Xiao P. A real-time decoding sequencing technology—new possibility for high throughput sequencing. RSC Adv 2017. [DOI: 10.1039/c7ra06202h] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/30/2022] Open

Feng W, Zhao S, Xue D, Song F, Li Z, Chen D, He B, Hao Y, Wang Y, Liu Y. Improving alignment accuracy on homopolymer regions for semiconductor-based sequencing technologies. BMC Genomics 2016;17 Suppl 7:521. [PMID: 27556417 PMCID: PMC5001236 DOI: 10.1186/s12864-016-2894-9] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 12/03/2022] Open

SNP Mining in Functional Genes from Nonmodel Species by Next-Generation Sequencing: A Case of Flowering, Pre-Harvest Sprouting, and Dehydration Resistant Genes in Wheat. Biomed Res Int 2016;2016:3524908. [PMID: 27051662 PMCID: PMC4808660 DOI: 10.1155/2016/3524908] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/30/2015] [Accepted: 02/18/2016] [Indexed: 11/29/2022]

Chen TW, Gan RC, Chang YF, Liao WC, Wu TH, Lee CC, Huang PJ, Lee CY, Chen YYM, Chiu CH, Tang P. Is the whole greater than the sum of its parts? De novo assembly strategies for bacterial genomes based on paired-end sequencing. BMC Genomics 2015;16:648. [PMID: 26315384 PMCID: PMC4552406 DOI: 10.1186/s12864-015-1859-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 03/04/2015] [Accepted: 08/18/2015] [Indexed: 01/16/2023] Open

Nimmy SF, Kamal MS. Next generation sequencing under de novo genome assembly. Int J Biomath 2015. [DOI: 10.1142/s1793524515300018] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 01/04/2023]

Feng W, Sang P, Lian D, Dong Y, Song F, Li M, He B, Cao F, Liu Y. ResSeq: Enhancing Short-Read Sequencing Alignment By Rescuing Error-Containing Reads. IEEE/ACM Trans Comput Biol Bioinform 2015;12:795-798. [PMID: 26357318 DOI: 10.1109/tcbb.2014.2366103] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/05/2023]

Bragg L, Tyson GW. Metagenomics using next-generation sequencing. Methods Mol Biol 2014;1096:183-201. [PMID: 24515370 DOI: 10.1007/978-1-62703-712-9_15] [Citation(s) in RCA: 62] [Impact Index Per Article: 6.2] [Indexed: 12/21/2022]

Pu D, Qi Y, Cui L, Xiao P, Lu Z. A real-time decoding sequencing based on dual mononucleotide addition for cyclic synthesis. Anal Chim Acta 2014;852:274-83. [PMID: 25441908 DOI: 10.1016/j.aca.2014.09.009] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 06/02/2014] [Revised: 08/28/2014] [Accepted: 09/08/2014] [Indexed: 11/19/2022]

Liu B, Morrison CD, Johnson CS, Trump DL, Qin M, Conroy JC, Wang J, Liu S. Computational methods for detecting copy number variations in cancer genome using next generation sequencing: principles and challenges. Oncotarget 2014;4:1868-81. [PMID: 24240121 PMCID: PMC3875755 DOI: 10.18632/oncotarget.1537] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.6] [Indexed: 12/27/2022] Open

The eukaryotic genome, its reads, and the unfinished assembly. FEBS Lett 2013;587:2090-3. [PMID: 23727201 DOI: 10.1016/j.febslet.2013.05.048] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/09/2013] [Revised: 05/09/2013] [Accepted: 05/20/2013] [Indexed: 11/21/2022]

Forde BM, O'Toole PW. Next-generation sequencing technologies and their impact on microbial genomics. Brief Funct Genomics 2013;12:440-53. [PMID: 23314033 DOI: 10.1093/bfgp/els062] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Indexed: 12/27/2022] Open

Wang Z, Willard HF. Evidence for sequence biases associated with patterns of histone methylation. BMC Genomics 2012;13:367. [PMID: 22857523 PMCID: PMC3532361 DOI: 10.1186/1471-2164-13-367] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 06/09/2011] [Accepted: 07/18/2012] [Indexed: 11/19/2022] Open

Pellin D, Miotto P, Ambrosi A, Cirillo DM, Di Serio C. A genome-wide identification analysis of small regulatory RNAs in Mycobacterium tuberculosis by RNA-Seq and conservation analysis. PLoS One 2012;7:e32723. [PMID: 22470422 PMCID: PMC3314655 DOI: 10.1371/journal.pone.0032723] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Received: 09/02/2011] [Accepted: 02/03/2012] [Indexed: 12/29/2022] Open

Schulz MH, Zerbino DR, Vingron M, Birney E. Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics 2012;28:1086-92. [PMID: 22368243 PMCID: PMC3324515 DOI: 10.1093/bioinformatics/bts094] [Citation(s) in RCA: 1009] [Impact Index Per Article: 84.1] [Indexed: 12/22/2022] Open

Lassen KS, Schultz H, Heegaard NHH, He M. A novel DNAseq program for enhanced analysis of Illumina GAII data: a case study on antibody complementarity-determining regions. N Biotechnol 2012;29:271-8. [PMID: 22155428 DOI: 10.1016/j.nbt.2011.11.014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/24/2011] [Revised: 11/09/2011] [Accepted: 11/25/2011] [Indexed: 11/16/2022]

Derrien T, Estellé J, Marco Sola S, Knowles DG, Raineri E, Guigó R, Ribeca P. Fast computation and applications of genome mappability. PLoS One 2012;7:e30377. [PMID: 22276185 PMCID: PMC3261895 DOI: 10.1371/journal.pone.0030377] [Citation(s) in RCA: 327] [Impact Index Per Article: 27.3] [Received: 01/28/2011] [Accepted: 12/19/2011] [Indexed: 01/17/2023] Open

Gene fragmentation in bacterial draft genomes: extent, consequences and mitigation. BMC Genomics 2012;13:14. [PMID: 22233127 PMCID: PMC3322347 DOI: 10.1186/1471-2164-13-14] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Received: 10/03/2011] [Accepted: 01/10/2012] [Indexed: 12/11/2022] Open

Abstract

BACKGROUND

Ongoing technological advances in genome sequencing are allowing bacterial genomes to be sequenced at ever-lower cost. However, nearly all of these new techniques concomitantly decrease genome quality, primarily due to the inability of their relatively short read lengths to bridge certain genomic regions, e.g., those containing repeats. Fragmentation of predicted open reading frames (ORFs) is one possible consequence of this decreased quality. In this study we quantify ORF fragmentation in draft microbial genomes and its effect on annotation efficacy, and we propose a solution to ameliorate this problem.

RESULTS

A survey of draft-quality genomes in GenBank revealed that fragmented ORFs comprised > 80% of the predicted ORFs in some genomes, and that increased fragmentation correlated with decreased genome assembly quality. In a more thorough analysis of 25 Streptomyces genomes, fragmentation was especially enriched in some protein classes with repeating, multi-modular structures such as polyketide synthases, non-ribosomal peptide synthetases and serine/threonine kinases. Overall, increased genome fragmentation correlated with increased false-negative Pfam and COG annotation rates and increased false-positive KEGG annotation rates. The false-positive KEGG annotation rate could be ameliorated by linking fragmented ORFs using their orthologs in related genomes. Whereas this strategy successfully linked up to 46% of the total ORF fragments in some genomes, its sensitivity appeared to depend heavily on the depth of sampling of a particular taxon's variable genome.

CONCLUSIONS

Draft microbial genomes contain many ORF fragments. Where these correspond to the same gene they have particular potential to confound comparative gene content analyses. Given our findings, and the rapid increase in the number of microbial draft quality genomes, we suggest that accounting for gene fragmentation and its associated biases is important when designing comparative genomic projects.

Ji Y, Shi Y, Ding G, Li Y. A new strategy for better genome assembly from very short reads. BMC Bioinformatics 2011;12:493. [PMID: 22208765 PMCID: PMC3268122 DOI: 10.1186/1471-2105-12-493] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 08/25/2011] [Accepted: 12/30/2011] [Indexed: 11/29/2022] Open

Nunes MCS, Wanner EF, Weber G. Origin of multiple periodicities in the Fourier power spectra of the Plasmodium falciparum genome. BMC Genomics 2011;12 Suppl 4:S4. [PMID: 22369134 PMCID: PMC3287587 DOI: 10.1186/1471-2164-12-s4-s4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/17/2023] Open

Hampton M, Melvin RG, Kendall AH, Kirkpatrick BR, Peterson N, Andrews MT. Deep sequencing the transcriptome reveals seasonal adaptive mechanisms in a hibernating mammal. PLoS One 2011;6:e27021. [PMID: 22046435 PMCID: PMC3203946 DOI: 10.1371/journal.pone.0027021] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.7] [Received: 07/24/2011] [Accepted: 10/07/2011] [Indexed: 11/19/2022] Open

Abstract

Mammalian hibernation is a complex phenotype involving metabolic rate reduction, bradycardia, profound hypothermia, and a reliance on stored fat that allows the animal to survive for months without food in a state of suspended animation. To determine the genes responsible for this phenotype in the thirteen-lined ground squirrel (Ictidomys tridecemlineatus) we used the Roche 454 platform to sequence mRNA isolated at six points throughout the year from three key tissues: heart, skeletal muscle, and white adipose tissue (WAT). Deep sequencing generated approximately 3.7 million cDNA reads from 18 samples (6 time points × 3 tissues) with a mean read length of 335 bases. Of these, 3,125,337 reads were assembled into 140,703 contigs. Approximately 90% of all sequences were matched to proteins in the human UniProt database. The total number of distinct human proteins matched by ground squirrel transcripts was 13,637 for heart, 12,496 for skeletal muscle, and 14,351 for WAT. Extensive mitochondrial RNA sequences enabled a novel approach of using the transcriptome to construct the complete mitochondrial genome for I. tridecemlineatus. Seasonal and activity-specific changes in mRNA levels that met our stringent false discovery rate cutoff (1.0 × 10⁻¹¹) were used to identify patterns of gene expression involving various aspects of the hibernation phenotype. Among these patterns are differentially expressed genes encoding heart proteins AT1A1, NAC1 and RYR2 controlling ion transport required for contraction and relaxation at low body temperatures. Abundant RNAs in skeletal muscle coding ubiquitin pathway proteins ASB2, UBC and DDB1 peak in October, suggesting an increase in muscle proteolysis. Finally, genes in WAT that encode proteins involved in lipogenesis (ACOD, FABP4) are highly expressed in August, but gradually decline in expression during the seasonal transition to lipolysis.

Charuvaka A, Rangwala H. Evaluation of short read metagenomic assembly. BMC Genomics 2011;12 Suppl 2:S8. [PMID: 21989307 PMCID: PMC3194239 DOI: 10.1186/1471-2164-12-s2-s8] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Indexed: 11/18/2022] Open

Straub SCK, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 2011;12:211. [PMID: 21542930 PMCID: PMC3116503 DOI: 10.1186/1471-2164-12-211] [Citation(s) in RCA: 92] [Impact Index Per Article: 7.1] [Received: 02/26/2011] [Accepted: 05/04/2011] [Indexed: 01/05/2023] Open

Abstract

Background

Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution.

Results

A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed.

Conclusions

The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives. This study represents a first step in the development of a community resource for further study of plant-insect co-evolution, anti-herbivore defense, floral developmental genetics, reproductive biology, chemical evolution, population genetics, and comparative genomics using milkweeds, and A. syriaca in particular, as ecological and evolutionary models.

van Oeveren J, de Ruiter M, Jesse T, van der Poel H, Tang J, Yalcin F, Janssen A, Volpin H, Stormo KE, Bogden R, van Eijk MJT, Prins M. Sequence-based physical mapping of complex genomes by whole genome profiling. Genome Res 2011;21:618-25. [PMID: 21324881 DOI: 10.1101/gr.112094.110] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.5] [Indexed: 12/14/2022]

Koehler R, Issac H, Cloonan N, Grimmond SM. The uniqueome: a mappability resource for short-tag sequencing. Bioinformatics 2010;27:272-4. [PMID: 21075741 PMCID: PMC3018812 DOI: 10.1093/bioinformatics/btq640] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Indexed: 01/22/2023]

Paszkiewicz K, Studholme DJ. De novo assembly of short sequence reads. Brief Bioinform 2010;11:457-72. [DOI: 10.1093/bib/bbq020] [Citation(s) in RCA: 134] [Impact Index Per Article: 9.6] [Indexed: 11/14/2022] Open

Simulation of ChIP-Seq based on extra-sonication of IPed DNA fragments. Chin Sci Bull 2010. [DOI: 10.1007/s11434-010-3013-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]

Read length and repeat resolution: exploring prokaryote genomes using next-generation sequencing technologies. PLoS One 2010;5:e11518. [PMID: 20634954 PMCID: PMC2902515 DOI: 10.1371/journal.pone.0011518] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Received: 04/21/2010] [Accepted: 05/31/2010] [Indexed: 11/19/2022] Open

De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis. PLoS Genet 2010;6:e1000891. [PMID: 20386741 PMCID: PMC2851567 DOI: 10.1371/journal.pgen.1000891] [Citation(s) in RCA: 140] [Impact Index Per Article: 10.0] [Received: 12/24/2009] [Accepted: 03/02/2010] [Indexed: 01/09/2023] Open

Abstract

Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30-90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in approximately 4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. 
Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for comparative studies to address basic questions of fungal biology.

Miller JR, Koren S, Sutton G. Assembly algorithms for next-generation sequencing data. Genomics 2010;95:315-27. [PMID: 20211242 DOI: 10.1016/j.ygeno.2010.03.001] [Citation(s) in RCA: 621] [Impact Index Per Article: 44.4] [Received: 08/04/2009] [Revised: 02/26/2010] [Accepted: 03/02/2010] [Indexed: 01/08/2023]

Webb KM, Rosenthal BM. Deep resequencing of Trichinella spiralis reveals previously un-described single nucleotide polymorphisms and intra-isolate variation within the mitochondrial genome. Infect Genet Evol 2010;10:304-10. [PMID: 20083232 DOI: 10.1016/j.meegid.2010.01.003] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 10/28/2009] [Revised: 12/23/2009] [Accepted: 01/11/2010] [Indexed: 11/25/2022]

Kingsford C, Schatz MC, Pop M. Assembly complexity of prokaryotic genomes using short reads. BMC Bioinformatics 2010;11:21. [PMID: 20064276 PMCID: PMC2821320 DOI: 10.1186/1471-2105-11-21] [Citation(s) in RCA: 85] [Impact Index Per Article: 6.1] [Received: 09/03/2009] [Accepted: 01/12/2010] [Indexed: 01/08/2023] Open

Zerbino DR, McEwen GK, Margulies EH, Birney E. Pebble and rock band: heuristic resolution of repeats and scaffolding in the velvet short-read de novo assembler. PLoS One 2009;4:e8407. [PMID: 20027311 PMCID: PMC2793427 DOI: 10.1371/journal.pone.0008407] [Citation(s) in RCA: 150] [Impact Index Per Article: 10.0] [Received: 09/04/2009] [Accepted: 10/21/2009] [Indexed: 11/22/2022] Open

Wendl MC, Wilson RK. The theory of discovering rare variants via DNA sequencing. BMC Genomics 2009;10:485. [PMID: 19843339 PMCID: PMC2778663 DOI: 10.1186/1471-2164-10-485] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 03/20/2009] [Accepted: 10/20/2009] [Indexed: 11/18/2022] Open

Chikhi R, Lavenier D. Paired-end read length lower bounds for genome re-sequencing. BMC Bioinformatics 2009. [PMCID: PMC2764126 DOI: 10.1186/1471-2105-10-s13-o2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022] Open

ChIP-seq: advantages and challenges of a maturing technology. Nat Rev Genet 2009;10:669-80. [PMID: 19736561 DOI: 10.1038/nrg2641] [Citation(s) in RCA: 1263] [Impact Index Per Article: 84.2] [Indexed: 02/06/2023]

Gibbons JG, Janson EM, Hittinger CT, Johnston M, Abbot P, Rokas A. Benchmarking next-generation transcriptome sequencing for functional and evolutionary genomics. Mol Biol Evol 2009;26:2731-44. [PMID: 19706727 DOI: 10.1093/molbev/msp188] [Citation(s) in RCA: 129] [Impact Index Per Article: 8.6] [Indexed: 12/25/2022] Open

Amaral AJ, Megens HJ, Kerstens HHD, Heuven HCM, Dibbits B, Crooijmans RPMA, den Dunnen JT, Groenen MAM. Application of massive parallel sequencing to whole genome SNP discovery in the porcine genome. BMC Genomics 2009;10:374. [PMID: 19674453 PMCID: PMC2739861 DOI: 10.1186/1471-2164-10-374] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Received: 02/23/2009] [Accepted: 08/12/2009] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Although the Illumina 1G Genome Analyzer generates billions of base pairs of sequence data, challenges arise in sequence selection due to the varying sequence quality. Therefore, in the framework of the International Porcine SNP Chip Consortium, this pilot study aimed to evaluate the impact of the quality level of the sequenced bases on mapping quality and identification of true SNPs on a large scale.

RESULTS

DNA pooled from five animals from a commercial boar line was digested with DraI; 150-250-bp fragments were isolated and end-sequenced using the Illumina 1 G Genome Analyzer, yielding 70,348,064 sequences 36-bp long. Rules were developed to select sequences, which were then aligned to unique positions in a reference genome. Sequences were selected based on quality, and three thresholds of sequence quality (SQ) were compared. The highest threshold of SQ allowed identification of a larger number of SNPs (17,489), distributed widely across the pig genome. In total, 3,142 SNPs were validated with a success rate of 96%. The correlation between estimated minor allele frequency (MAF) and genotyped MAF was moderate, and SNPs were highly polymorphic in other pig breeds. Lowering the SQ threshold and maintaining the same criteria for SNP identification resulted in the discovery of fewer SNPs (16,768), of which 259 were not identified using higher SQ levels. Validation of SNPs found exclusively in the lower SQ threshold had a success rate of 94% and a low correlation between estimated MAF and genotyped MAF. Base change analysis suggested that the rate of transitions in the pig genome is likely to be similar to that observed in humans. Chromosome X showed reduced nucleotide diversity relative to autosomes, as observed for other species.

CONCLUSION

Large numbers of SNPs can be identified reliably by creating strict rules for sequence selection, which simultaneously decreases sequence ambiguity. Selection of sequences using a higher SQ threshold leads to more reliable identification of SNPs. Lower SQ thresholds can be used to guarantee sufficient sequence coverage, resulting in a high success rate but less reliable MAF estimation. Nucleotide diversity varies between porcine chromosomes, with the X chromosome showing less variation, as observed in other species.


Su Y, Lin L, Tian G, Chen C, Liu T, Xu X, Qi X, Zhang X, Yang H. Preparing a re-sequencing DNA library of 2 cancer candidate genes using the ligation-by-amplification protocol by two PCR reactions. Sci China C Life Sci 2009;52:483-91. [PMID: 19471873 DOI: 10.1007/s11427-009-0066-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2008] [Accepted: 11/18/2008] [Indexed: 01/03/2023]

Abstract

To meet the needs of large-scale genomic/genetic studies, the next-generation massively parallelized sequencing technologies provide high-throughput, low-cost and low-labor sequencing, with bioinformatic software and laboratory methods subsequently developed to expand their applications in various types of research. PCR-based genomic/genetic studies, which are widely used in association studies such as cancer research, have not benefited much from these next-generation sequencing technologies, because the shotgun re-sequencing strategy used by sequencing machines such as the Illumina/Solexa Genome Analyzer may not be applicable to direct re-sequencing of short target regions like those in PCR-based genomic/genetic studies. Although several methods have been proposed to solve this problem, including microarray-based genomic selection and selector-based technologies, they require advanced equipment and procedures, which limits their application in many laboratories. By contrast, we overcame these drawbacks by using a ligation by amplification (LBA) protocol, a method that uses a pair of Universal Adapters to randomly ligate target regions in a two-step PCR procedure and whose Long LBA products are easily fragmented and sequenced on the next-generation sequencing machine. In this proof-of-concept study, we chose the consensus coding sequences of two human cancer genes, BRCA1 and BRCA2, as target regions and specifically designed LBA primer pairs to amplify and randomly ligate them. Seventy target sequences were successfully amplified and ligated into Long LBA products, which were then fragmented to construct DNA libraries for sequencing on both a conventional Sanger sequencer, the ABI 3730xl DNA Analyzer, and the next-generation 'sequencing by synthesis' Illumina/Solexa Genome Analyzer. Bioinformatic analysis demonstrated the utility and efficiency (including the coverage and depth of each target sequence and the effectiveness of SNP detection) of using the LBA protocol to facilitate PCR-based re-sequencing and genetic-variant-detection studies on the next-generation sequencing machine, raising the prospect of various PCR-based genomic/genetic studies using this strategy.


Identification of EMS-induced mutations in Drosophila melanogaster by whole-genome sequencing. Genetics 2009;182:25-32. [PMID: 19307605 DOI: 10.1534/genetics.109.101998] [Citation(s) in RCA: 105] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Voelkerding KV, Dames SA, Durtschi JD. Next-generation sequencing: from basic research to diagnostics. Clin Chem 2009;55:641-58. [PMID: 19246620 DOI: 10.1373/clinchem.2008.112789] [Citation(s) in RCA: 544] [Impact Index Per Article: 36.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]

Rokas A, Abbot P. Harnessing genomics for evolutionary insights. Trends Ecol Evol 2009;24:192-200. [PMID: 19201503 DOI: 10.1016/j.tree.2008.11.004] [Citation(s) in RCA: 116] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2008] [Revised: 11/07/2008] [Accepted: 11/10/2008] [Indexed: 11/25/2022]

Rozowsky J, Euskirchen G, Auerbach RK, Zhang ZD, Gibson T, Bjornson R, Carriero N, Snyder M, Gerstein MB. PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls. Nat Biotechnol 2009;27:66-75. [PMID: 19122651 DOI: 10.1038/nbt.1518] [Citation(s) in RCA: 466] [Impact Index Per Article: 31.1] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2008] [Accepted: 12/03/2008] [Indexed: 01/23/2023]

Imelfort M. Sequence Comparison Tools. Bioinformatics 2009. [DOI: 10.1007/978-0-387-92738-1_2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022] Open