Number

Cited by Other Article(s)

451

Rider SD, Morgan MS, Arlian LG. Draft genome of the scabies mite. Parasit Vectors 2015;8:585. [PMID: 26555130 PMCID: PMC4641413 DOI: 10.1186/s13071-015-1198-2] [Citation(s) in RCA: 76] [Impact Index Per Article: 7.6] [Received: 08/20/2015] [Accepted: 11/05/2015] [Indexed: 12/11/2022] Open

Abstract

Background

The disease scabies, caused by the ectoparasitic mite, Sarcoptes scabiei, causes significant morbidity in humans and other mammals worldwide. However, there is limited data available regarding the molecular basis of host specificity and host-parasite interactions. Therefore, we sought to produce a draft genome for S. scabiei and use this to identify molecular markers that will be useful for phylogenetic population studies and to identify candidate protein-coding genes that are critical to the unique biology of the parasite.

Methods

S. scabiei var. canis DNA was isolated from living mites and sequenced to ultra-deep coverage using paired-end technology. Sequence reads were assembled into gapped contigs using de Bruijn graph-based algorithms. The assembled genome was examined for repetitive elements, and gene annotation was performed using ab initio and homology-based methods.

Results

The draft genome assembly was about 56.2 Mb and included a mitochondrial genome contig. The predicted proteome contained 10,644 proteins, ~67 % of which appear to have clear orthologs in other species. The genome also contained more than 140,000 simple sequence repeat loci that may be useful for population-level studies. The mitochondrial genome contained 13 protein coding loci and 20 transfer RNAs. Hundreds of candidate salivary gland protein genes were identified by comparing the scabies mite predicted proteome with sialoproteins and transcripts identified in ticks and other hematophagous arthropods. These include serpins, ferritins, reprolysins, apyrases and new members of the macrophage migration inhibitory factor (MIF) gene family. Numerous other genes coding for salivary proteins, metabolic enzymes, structural proteins, proteins that are potentially immune modulating, and vaccine candidates were identified. The genes encoding cysteine and serine protease paralogs as well as mu-type glutathione S-transferases are represented by gene clusters. S. scabiei possessed homologs for most of the 33 dust mite allergens.

Conclusion

The draft genome is useful for advancing our understanding of the host-parasite interaction, the biology of the mite and its phylogenetic relationship to other Acari. The identification of antigen-producing genes, candidate immune modulating proteins and pathways, and genes responsible for acaricide resistance offers opportunities for developing new methods for diagnosing, treating and preventing this disease.

Electronic supplementary material

The online version of this article (doi:10.1186/s13071-015-1198-2) contains supplementary material, which is available to authorized users.

452

Henson MW, Santo Domingo JW, Kourtev PS, Jensen RV, Dunn JA, Learman DR. Metabolic and genomic analysis elucidates strain-level variation in Microbacterium spp. isolated from chromate contaminated sediment. PeerJ 2015;3:e1395. [PMID: 26587353 PMCID: PMC4647564 DOI: 10.7717/peerj.1395] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 07/22/2015] [Accepted: 10/19/2015] [Indexed: 01/04/2023] Open

453

Greshake B, Zehr S, Dal Grande F, Meiser A, Schmitt I, Ebersberger I. Potential and pitfalls of eukaryotic metagenome skimming: a test case for lichens. Mol Ecol Resour 2015;16:511-23. [PMID: 26345272 DOI: 10.1111/1755-0998.12463] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 04/17/2015] [Revised: 07/28/2015] [Accepted: 08/22/2015] [Indexed: 11/30/2022]

454

Wang D, Xu J, Yu J. KGCAK: a K-mer based database for genome-wide phylogeny and complexity evaluation. Biol Direct 2015;10:53. [PMID: 26376976 PMCID: PMC4573299 DOI: 10.1186/s13062-015-0083-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 06/05/2015] [Accepted: 09/11/2015] [Indexed: 11/28/2022] Open

455

Benoit G, Lemaitre C, Lavenier D, Drezen E, Dayris T, Uricaru R, Rizk G. Reference-free compression of high throughput sequencing data with a probabilistic de Bruijn graph. BMC Bioinformatics 2015;16:288. [PMID: 26370285 PMCID: PMC4570262 DOI: 10.1186/s12859-015-0709-7] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 04/16/2015] [Accepted: 08/17/2015] [Indexed: 01/09/2023] Open

456

Song L, Florea L, Langmead B. Lighter: fast and memory-efficient sequencing error correction without counting. Genome Biol 2015;15:509. [PMID: 25398208 PMCID: PMC4248469 DOI: 10.1186/s13059-014-0509-9] [Citation(s) in RCA: 150] [Impact Index Per Article: 15.0] [Received: 05/27/2014] [Indexed: 02/02/2023] Open

457

Olson ND, Lund SP, Colman RE, Foster JT, Sahl JW, Schupp JM, Keim P, Morrow JB, Salit ML, Zook JM. Best practices for evaluating single nucleotide variant calling methods for microbial genomics. Front Genet 2015. [PMID: 26217378 PMCID: PMC4493402 DOI: 10.3389/fgene.2015.00235] [Citation(s) in RCA: 114] [Impact Index Per Article: 11.4] [Indexed: 12/30/2022] Open

458

Pelin A, Selman M, Aris-Brosou S, Farinelli L, Corradi N. Genome analyses suggest the presence of polyploidy and recent human-driven expansions in eight global populations of the honeybee pathogen Nosema ceranae. Environ Microbiol 2015;17:4443-58. [PMID: 25914091 DOI: 10.1111/1462-2920.12883] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Received: 03/03/2015] [Revised: 04/13/2015] [Accepted: 04/15/2015] [Indexed: 12/23/2022]

459

De Novo Assembly of Bitter Gourd Transcriptomes: Gene Expression and Sequence Variations in Gynoecious and Monoecious Lines. PLoS One 2015;10:e0128331. [PMID: 26047102 PMCID: PMC4457790 DOI: 10.1371/journal.pone.0128331] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 10/30/2014] [Accepted: 04/26/2015] [Indexed: 11/19/2022] Open

460

Mans BJ, de Klerk D, Pienaar R, de Castro MH, Latif AA. Next-generation sequencing as means to retrieve tick systematic markers, with the focus on Nuttalliella namaqua (Ixodoidea: Nuttalliellidae). Ticks Tick Borne Dis 2015;6:450-62. [DOI: 10.1016/j.ttbdis.2015.03.013] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 11/21/2014] [Revised: 03/06/2015] [Accepted: 03/08/2015] [Indexed: 10/23/2022]

461

Draft Genome Sequence of Lachancea lanzarotensis CBS 12615T, an Ascomycetous Yeast Isolated from Grapes. Genome Announc 2015;3:e00292-15. [PMID: 25883293 PMCID: PMC4400436 DOI: 10.1128/genomea.00292-15] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 11/20/2022]

462

Szövényi P, Frangedakis E, Ricca M, Quandt D, Wicke S, Langdale JA. Establishment of Anthoceros agrestis as a model species for studying the biology of hornworts. BMC Plant Biol 2015;15:98. [PMID: 25886741 PMCID: PMC4393856 DOI: 10.1186/s12870-015-0481-x] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 01/28/2015] [Accepted: 03/24/2015] [Indexed: 05/18/2023]

Abstract

Background

Plants colonized terrestrial environments approximately 480 million years ago and have contributed significantly to the diversification of life on Earth. Phylogenetic analyses position a subset of charophyte algae as the sister group to land plants, and distinguish two land plant groups that diverged around 450 million years ago: the bryophytes and the vascular plants. Relationships between liverworts, mosses, hornworts and vascular plants have proven difficult to resolve, and as such it is not clear which bryophyte lineage is the sister group to all other land plants and which is the sister to vascular plants. The lack of comparative molecular studies in representatives of all three lineages exacerbates this uncertainty. Such comparisons can be made between mosses and liverworts because representative model organisms are well established in these two bryophyte lineages. To date, however, a model hornwort species has not been available.

Results

Here we report the establishment of Anthoceros agrestis as a model hornwort species for laboratory experiments. Axenic culture conditions for maintenance and vegetative propagation have been determined, and treatments for the induction of sexual reproduction and sporophyte development have been established. In addition, protocols have been developed for the extraction of DNA and RNA that is of a quality suitable for molecular analyses. Analysis of haploid-derived genome sequence data of two A. agrestis isolates revealed single nucleotide polymorphisms at multiple loci, and thus these two strains are suitable starting material for classical genetic and mapping experiments.

Conclusions

Methods and resources have been developed to enable A. agrestis to be used as a model species for developmental, molecular, genomic, and genetic studies. This advance provides an unprecedented opportunity to investigate the biology of hornworts.

463

Deng X, Naccache SN, Ng T, Federman S, Li L, Chiu CY, Delwart EL. An ensemble strategy that significantly improves de novo assembly of microbial genomes from metagenomic next-generation sequencing data. Nucleic Acids Res 2015;43:e46. [PMID: 25586223 PMCID: PMC4402509 DOI: 10.1093/nar/gkv002] [Citation(s) in RCA: 206] [Impact Index Per Article: 20.6] [Received: 10/14/2014] [Accepted: 01/04/2015] [Indexed: 11/12/2022] Open

464

Campana MG, Robles García NM, Tuross N. America's red gold: multiple lineages of cultivated cochineal in Mexico. Ecol Evol 2015;5:607-17. [PMID: 25691985 PMCID: PMC4328766 DOI: 10.1002/ece3.1398] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 12/14/2014] [Revised: 12/15/2014] [Accepted: 12/18/2014] [Indexed: 01/31/2023] Open

465

Bloom Filter Trie – A Data Structure for Pan-Genome Storage. Lecture Notes in Computer Science 2015. [DOI: 10.1007/978-3-662-48221-6_16] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 12/17/2022]

466

Shariat B, Movahedi NS, Chitsaz H, Boucher C. HyDA-Vista: towards optimal guided selection of k-mer size for sequence assembly. BMC Genomics 2014;15 Suppl 10:S9. [PMID: 25558875 PMCID: PMC4304221 DOI: 10.1186/1471-2164-15-s10-s9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022] Open

Abstract

Motivation

Intimately tied to assembly quality is the complexity of the de Bruijn graph built by the assembler. Thus, there have been many paradigms developed to decrease the complexity of the de Bruijn graph. One obvious combinatorial paradigm for this is to allow the value of k to vary; having a larger value of k where the graph is more complex and a smaller value of k where the graph would likely contain fewer spurious edges and vertices. One open problem that affects the practicality of this method is how to predict the value of k prior to building the de Bruijn graph. We show that optimal values of k can be predicted prior to assembly by using the information contained in a phylogenetically-close genome and therefore, help make the use of multiple values of k practical for genome assembly.

Results

We present HyDA-Vista, which is a genome assembler that uses homology information to choose a value of k for each read prior to the de Bruijn graph construction. The chosen k is optimal if there are no sequencing errors and the coverage is sufficient. Fundamental to our method is the construction of the maximal sequence landscape, which is a data structure that stores for each position in the input string, the largest repeated substring containing that position. In particular, we show the maximal sequence landscape can be constructed in O(n + n log n)-time and O(n)-space. HyDA-Vista first constructs the maximal sequence landscape for a homologous genome. The reads are then aligned to this reference genome, and values of k are assigned to each read using the maximal sequence landscape and the alignments. Eventually, all the reads are assembled by an iterative de Bruijn graph construction method. Our results and comparison to other assemblers demonstrate that HyDA-Vista achieves the best assembly of E. coli before repeat resolution or scaffolding.
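The maximal sequence landscape described in this abstract can be illustrated with a small sketch. This is not the authors' O(n log n) construction: it is a naive, roughly cubic-time version intended only for short strings, and it assumes that overlapping occurrences of a substring count as repeats. The function name `maximal_sequence_landscape` is hypothetical.

```python
from collections import defaultdict


def maximal_sequence_landscape(s):
    """Naive illustration (assumed semantics, not the published algorithm).

    For each position i of s, return the length of the longest substring
    that occurs at least twice in s and covers position i.
    Overlapping occurrences are counted as repeats (an assumption).
    """
    n = len(s)
    landscape = [0] * n
    for length in range(1, n):
        # Group the start positions of every substring of this length.
        starts_by_substring = defaultdict(list)
        for start in range(n - length + 1):
            starts_by_substring[s[start:start + length]].append(start)

        found_repeat = False
        for starts in starts_by_substring.values():
            if len(starts) >= 2:
                found_repeat = True
                for start in starts:
                    for i in range(start, start + length):
                        # Lengths are visited in increasing order, so a
                        # later assignment is never smaller than an earlier one.
                        landscape[i] = length

        # If no substring of this length repeats, no longer one can either,
        # because every repeated substring has a repeated prefix.
        if not found_repeat:
            break
    return landscape


print(maximal_sequence_landscape("ABCABCX"))  # [3, 3, 3, 3, 3, 3, 0]
```

In the toy input, "ABC" occurs twice, so every position it covers gets the value 3, while the unique final character gets 0. How HyDA-Vista turns these values and the read alignments into a per-read k is described only at a high level in the abstract and is not reproduced here.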

Availability

HyDA-Vista is freely available [1]. The code for constructing the maximal sequence landscape and choosing the optimal value of k for each read is also separately available on the website and could be incorporated into any genome assembler.

467

The complex task of choosing a de novo assembly: Lessons from fungal genomes. Comput Biol Chem 2014;53 Pt A:97-107. [DOI: 10.1016/j.compbiolchem.2014.08.014] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Accepted: 07/11/2014] [Indexed: 12/21/2022]

468

Draft Genome Sequence of Taylorella equigenitalis Strain MCE529, Isolated from a Belgian Warmblood Horse. Genome Announc 2014;2:e01214-14. [PMID: 25428969 PMCID: PMC4246161 DOI: 10.1128/genomea.01214-14] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/20/2022]

469

Melsted P, Halldórsson BV. KmerStream: streaming algorithms for k-mer abundance estimation. Bioinformatics 2014;30:3541-7. [PMID: 25355787 DOI: 10.1093/bioinformatics/btu713] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Indexed: 12/30/2022]

470

Jünemann S, Prior K, Albersmeier A, Albaum S, Kalinowski J, Goesmann A, Stoye J, Harmsen D. GABenchToB: a genome assembly benchmark tuned on bacteria and benchtop sequencers. PLoS One 2014;9:e107014. [PMID: 25198770 PMCID: PMC4157817 DOI: 10.1371/journal.pone.0107014] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 04/30/2014] [Accepted: 08/07/2014] [Indexed: 12/28/2022] Open

Abstract

De novo genome assembly is the process of reconstructing a complete genomic sequence from countless small sequencing reads. Due to the complexity of this task, numerous genome assemblers have been developed to cope with different requirements and the different kinds of data provided by sequencers within the fast evolving field of next-generation sequencing technologies. In particular, the recently introduced generation of benchtop sequencers, like Illumina's MiSeq and Ion Torrent's Personal Genome Machine (PGM), popularized the easy, fast, and cheap sequencing of bacterial organisms to a broad range of academic and clinical institutions. With a strong pragmatic focus, here, we give a novel insight into the line of assembly evaluation surveys as we benchmark popular de novo genome assemblers based on bacterial data generated by benchtop sequencers. Therefore, single-library assemblies were generated, assembled, and compared to each other by metrics describing assembly contiguity and accuracy, and also by practice-oriented criteria as for instance computing time. In addition, we extensively analyzed the effect of the depth of coverage on the genome assemblies within reasonable ranges and the k-mer optimization problem of de Bruijn Graph assemblers. Our results show that, although both MiSeq and PGM allow for good genome assemblies, they require different approaches. They not only pair with different assembler types, but also affect assemblies differently regarding the depth of coverage where oversampling can become problematic. Assemblies vary greatly with respect to contiguity and accuracy but also by the requirement on the computing power. Consequently, no assembler can be rated best for all preconditions. Instead, the given kind of data, the demands on assembly quality, and the available computing infrastructure determines which assembler suits best. 
The data sets, scripts and all additional information needed to replicate our results are freely available at ftp://ftp.cebitec.uni-bielefeld.de/pub/GABenchToB.

471

Zhang Q, Pell J, Canino-Koning R, Howe AC, Brown CT. These are not the k-mers you are looking for: efficient online k-mer counting using a probabilistic data structure. PLoS One 2014;9:e101271. [PMID: 25062443 PMCID: PMC4111482 DOI: 10.1371/journal.pone.0101271] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 10/14/2013] [Accepted: 06/04/2014] [Indexed: 11/19/2022] Open

472

Utturkar SM, Klingeman DM, Land ML, Schadt CW, Doktycz MJ, Pelletier DA, Brown SD. Evaluation and validation of de novo and hybrid assembly techniques to derive high-quality genome sequences. Bioinformatics 2014;30:2709-16. [PMID: 24930142 PMCID: PMC4173024 DOI: 10.1093/bioinformatics/btu391] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.8] [Indexed: 11/14/2022]

Affiliation(s)

Sagar M Utturkar Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
Dawn M Klingeman Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
Miriam L Land Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
Christopher W Schadt Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
Mitchel J Doktycz Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
Dale A Pelletier Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
Steven D Brown Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA

473

Garmendia J, Viadas C, Calatayud L, Mell JC, Martí-Lliteras P, Euba B, Llobet E, Gil C, Bengoechea JA, Redfield RJ, Liñares J. Characterization of nontypable Haemophilus influenzae isolates recovered from adult patients with underlying chronic lung disease reveals genotypic and phenotypic traits associated with persistent infection. PLoS One 2014;9:e97020. [PMID: 24824990 PMCID: PMC4019658 DOI: 10.1371/journal.pone.0097020] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 01/17/2014] [Accepted: 04/14/2014] [Indexed: 01/09/2023] Open

Abstract

Nontypable Haemophilus influenzae (NTHi) has emerged as an important opportunistic pathogen causing infection in adults suffering obstructive lung diseases. Existing evidence associates chronic infection by NTHi to the progression of the chronic respiratory disease, but specific features of NTHi associated with persistence have not been comprehensively addressed. To provide clues about adaptive strategies adopted by NTHi during persistent infection, we compared sequential persistent isolates with newly acquired isolates in sputa from six patients with chronic obstructive lung disease. Pulse field gel electrophoresis (PFGE) identified three patients with consecutive persistent strains and three with new strains. Phenotypic characterisation included infection of respiratory epithelial cells, bacterial self-aggregation, biofilm formation and resistance to antimicrobial peptides (AMP). Persistent isolates differed from new strains in showing low epithelial adhesion and inability to form biofilms when grown under continuous-flow culture conditions in microfermenters. Self-aggregation clustered the strains by patient, not by persistence. Increasing resistance to AMPs was observed for each series of persistent isolates; this was not associated with lipooligosaccharide decoration with phosphorylcholine or with lipid A acylation. Variation was further analyzed for the series of three persistent isolates recovered from patient 1. These isolates displayed comparable growth rate, natural transformation frequency and murine pulmonary infection. Genome sequencing of these three isolates revealed sequential acquisition of single-nucleotide variants in the AMP permease sapC, the heme acquisition systems hgpB, hgpC, hup and hxuC, the 3-deoxy-D-manno-octulosonic acid kinase kdkA, the long-chain fatty acid transporter ompP1, and the phosphoribosylamine glycine ligase purD. 
Collectively, we frame a range of pathogenic traits and a repertoire of genetic variants in the context of persistent infection by NTHi.

Affiliation(s)

Junkal Garmendia Instituto de Agrobiotecnología, CSIC-Universidad Pública Navarra-Gobierno Navarra, Mutilva, Spain Centro de Investigación Biomédica en Red de Enfermedades Respiratorias (CIBERES), Madrid, Spain Laboratory Microbial Pathogenesis, Fundación Investigación Sanitaria Illes Balears, Bunyola, Spain
Cristina Viadas Instituto de Agrobiotecnología, CSIC-Universidad Pública Navarra-Gobierno Navarra, Mutilva, Spain
Laura Calatayud Centro de Investigación Biomédica en Red de Enfermedades Respiratorias (CIBERES), Madrid, Spain Microbiology Department, University Hospital Bellvitge, IDIBELL, University of Barcelona, Barcelona, Spain
Joshua Chang Mell Department of Zoology, University British Columbia, Vancouver, British Columbia, Canada Department of Microbiology and Immunology, Drexel University College of Medicine, Philadelphia, Pennsylvania, United States of America
Pau Martí-Lliteras Centro de Investigación Biomédica en Red de Enfermedades Respiratorias (CIBERES), Madrid, Spain Laboratory Microbial Pathogenesis, Fundación Investigación Sanitaria Illes Balears, Bunyola, Spain
Begoña Euba Instituto de Agrobiotecnología, CSIC-Universidad Pública Navarra-Gobierno Navarra, Mutilva, Spain Centro de Investigación Biomédica en Red de Enfermedades Respiratorias (CIBERES), Madrid, Spain
Enrique Llobet Centro de Investigación Biomédica en Red de Enfermedades Respiratorias (CIBERES), Madrid, Spain Laboratory Microbial Pathogenesis, Fundación Investigación Sanitaria Illes Balears, Bunyola, Spain
Carmen Gil Instituto de Agrobiotecnología, CSIC-Universidad Pública Navarra-Gobierno Navarra, Mutilva, Spain
José Antonio Bengoechea Centro de Investigación Biomédica en Red de Enfermedades Respiratorias (CIBERES), Madrid, Spain Laboratory Microbial Pathogenesis, Fundación Investigación Sanitaria Illes Balears, Bunyola, Spain Consejo Superior de Investigaciones Científicas (CSIC), Madrid, Spain
Rosemary J. Redfield Department of Zoology, University British Columbia, Vancouver, British Columbia, Canada
Josefina Liñares Centro de Investigación Biomédica en Red de Enfermedades Respiratorias (CIBERES), Madrid, Spain Microbiology Department, University Hospital Bellvitge, IDIBELL, University of Barcelona, Barcelona, Spain

474

Trivedi UH, Cézard T, Bridgett S, Montazam A, Nichols J, Blaxter M, Gharbi K. Quality control of next-generation sequencing data without a reference. Front Genet 2014;5:111. [PMID: 24834071 PMCID: PMC4018527 DOI: 10.3389/fgene.2014.00111] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Received: 01/15/2014] [Accepted: 04/14/2014] [Indexed: 01/07/2023] Open

475

Koren S, Treangen TJ, Hill CM, Pop M, Phillippy AM. Automated ensemble assembly and validation of microbial genomes. BMC Bioinformatics 2014;15:126. [PMID: 24884846 PMCID: PMC4030574 DOI: 10.1186/1471-2105-15-126] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.9] [Received: 02/07/2014] [Accepted: 04/24/2014] [Indexed: 11/12/2022] Open

Abstract

Background

The continued democratization of DNA sequencing has sparked a new wave of development of genome assembly and assembly validation methods. As individual research labs, rather than centralized centers, begin to sequence the majority of new genomes, it is important to establish best practices for genome assembly. However, recent evaluations such as GAGE and the Assemblathon have concluded that there is no single best approach to genome assembly. Instead, it is preferable to generate multiple assemblies and validate them to determine which is most useful for the desired analysis; this is a labor-intensive process that is often impossible or unfeasible.

Results

To encourage best practices supported by the community, we present iMetAMOS, an automated ensemble assembly pipeline; iMetAMOS encapsulates the process of running, validating, and selecting a single assembly from multiple assemblies. iMetAMOS packages several leading open-source tools into a single binary that automates parameter selection and execution of multiple assemblers, scores the resulting assemblies based on multiple validation metrics, and annotates the assemblies for genes and contaminants. We demonstrate the utility of the ensemble process on 225 previously unassembled Mycobacterium tuberculosis genomes as well as a Rhodobacter sphaeroides benchmark dataset. On these real data, iMetAMOS reliably produces validated assemblies and identifies potential contamination without user intervention. In addition, intelligent parameter selection produces assemblies of R. sphaeroides comparable to or exceeding the quality of those from the GAGE-B evaluation, affecting the relative ranking of some assemblers.

Conclusions

Ensemble assembly with iMetAMOS provides users with multiple, validated assemblies for each genome. Although computationally limited to small or mid-sized genomes, this approach is the most effective and reproducible means for generating high-quality assemblies and enables users to select an assembly best tailored to their specific needs.

476

Heo Y, Wu XL, Chen D, Ma J, Hwu WM. BLESS: bloom filter-based error correction solution for high-throughput sequencing reads. Bioinformatics 2014;30:1354-62. [PMID: 24451628 DOI: 10.1093/bioinformatics/btu030] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Indexed: 11/12/2022]

477

Simpson JT. Exploring genome characteristics and sequence quality without a reference. Bioinformatics 2014;30:1228-35. [PMID: 24443382 PMCID: PMC3998141 DOI: 10.1093/bioinformatics/btu023] [Citation(s) in RCA: 101] [Impact Index Per Article: 9.2] [Indexed: 12/14/2022]

478

Molecular mechanism of the carotenoid biosynthesis activation in the producer Streptomyces globisporus 1912. Biotechnologia Acta 2014. [DOI: 10.15407/biotech7.06.069] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/17/2022] Open

479

Anvar SY, Khachatryan L, Vermaat M, van Galen M, Pulyakhina I, Ariyurek Y, Kraaijeveld K, den Dunnen JT, de Knijff P, ’t Hoen PAC, Laros JFJ. Determining the quality and complexity of next-generation sequencing data without a reference genome. Genome Biol 2014;15:555. [PMID: 25514851 PMCID: PMC4298064 DOI: 10.1186/s13059-014-0555-3] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 10/06/2014] [Accepted: 11/27/2014] [Indexed: 01/22/2023] Open