1
|
Jünemann S, Prior K, Albersmeier A, Albaum S, Kalinowski J, Goesmann A, Stoye J, Harmsen D. Correction: GABenchToB: A Genome Assembly Benchmark Tuned on Bacteria and Benchtop Sequencers. PLoS One 2024; 19:e0299269. [PMID: 38359070 PMCID: PMC10868731 DOI: 10.1371/journal.pone.0299269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2024] Open
Abstract
[This corrects the article DOI: 10.1371/journal.pone.0107014.].
|
2
|
Rádai Z, Váradi A, Takács P, Nagy NA, Schmitt N, Prépost E, Kardos G, Laczkó L. An overlooked phenomenon: complex interactions of potential error sources on the quality of bacterial de novo genome assemblies. BMC Genomics 2024; 25:45. [PMID: 38195441 PMCID: PMC10777565 DOI: 10.1186/s12864-023-09910-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 12/15/2023] [Indexed: 01/11/2024] Open
Abstract
BACKGROUND Parameters adversely affecting the contiguity and accuracy of the assemblies from Illumina next-generation sequencing (NGS) are well described. However, past studies generally focused on their additive effects, overlooking their potential interactions possibly exacerbating one another's effects in a multiplicative manner. To investigate whether or not they act interactively on de novo genome assembly quality, we simulated sequencing data for 13 bacterial reference genomes, with varying levels of error rate, sequencing depth, PCR and optical duplicate ratios. RESULTS We assessed the quality of assemblies from the simulated sequencing data with a number of contiguity and accuracy metrics, which we used to quantify both additive and multiplicative effects of the four parameters. We found that the tested parameters are engaged in complex interactions, exerting multiplicative, rather than additive, effects on assembly quality. Also, the ratio of non-repeated regions and GC% of the original genomes can shape how the four parameters affect assembly quality. CONCLUSIONS We provide a framework for consideration in future studies using de novo genome assembly of bacterial genomes, e.g. in choosing the optimal sequencing depth, balancing between its positive effect on contiguity and negative effect on accuracy due to its interaction with error rate. Furthermore, the properties of the genomes to be sequenced also should be taken into account, as they might influence the effects of error sources themselves.
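The distinction between additive and multiplicative (interactive) effects drawn in this abstract can be written as two regression models. This is a minimal sketch for two of the four parameters only; the symbols $Q$ (an assembly quality metric), $e$ (error rate), $d$ (sequencing depth) and the coefficients $\beta$ are assumptions introduced for illustration, not the authors' notation or their fitted model.

```latex
\begin{align}
\text{additive:}    \quad & Q = \beta_0 + \beta_1 e + \beta_2 d + \varepsilon \\
\text{interactive:} \quad & Q = \beta_0 + \beta_1 e + \beta_2 d + \beta_{12}\, e\, d + \varepsilon
\end{align}
```

In the interactive model the marginal effect of depth, $\partial Q / \partial d = \beta_2 + \beta_{12} e$, depends on the error rate. That is the sense in which one error source can exacerbate or dampen the effect of another.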
Affiliation(s)
- Zoltán Rádai
  - Institute of Metagenomics, University of Debrecen, Debrecen, Hungary.
  - Department of Dermatology, University Hospital Düsseldorf, Heinrich-Heine-University, Düsseldorf, Germany.
- Alex Váradi
  - Institute of Metagenomics, University of Debrecen, Debrecen, Hungary
  - Department of Laboratory Medicine, Medical School, University of Pécs, Pécs, Hungary
- Péter Takács
  - Institute of Metagenomics, University of Debrecen, Debrecen, Hungary
  - Department of Health Informatics, Institute of Health Sciences, Faculty of Health, University of Debrecen, Debrecen, Hungary
- Nikoletta Andrea Nagy
  - Institute of Metagenomics, University of Debrecen, Debrecen, Hungary
  - Department of Evolutionary Zoology, ELKH-DE Behavioural Ecology Research Group, University of Debrecen, Debrecen, Hungary
  - Department of Evolutionary Zoology and Human Biology, University of Debrecen, Debrecen, Hungary
- Nicholas Schmitt
  - Department of Dermatology, University Hospital Düsseldorf, Heinrich-Heine-University, Düsseldorf, Germany
- Eszter Prépost
  - Department of Health Industry, University of Debrecen, Debrecen, Hungary
- Gábor Kardos
  - Institute of Metagenomics, University of Debrecen, Debrecen, Hungary
  - Department of Gerontology, Faculty of Health Sciences, University of Debrecen, Debrecen, Hungary
- Levente Laczkó
  - Institute of Metagenomics, University of Debrecen, Debrecen, Hungary
  - ELKH-DE Conservation Biology Research Group, Debrecen, Hungary
|
3
|
Mierke F, Brink DP, Norbeck J, Siewers V, Andlid T. Functional genome annotation and transcriptome analysis of Pseudozyma hubeiensis BOT-O, an oleaginous yeast that utilizes glucose and xylose at equal rates. Fungal Genet Biol 2023; 166:103783. [PMID: 36870442 DOI: 10.1016/j.fgb.2023.103783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2022] [Revised: 02/10/2023] [Accepted: 02/27/2023] [Indexed: 03/06/2023]
Abstract
Pseudozyma hubeiensis is a basidiomycete yeast that has the highly desirable traits for lignocellulose valorisation of being equally efficient at utilization of glucose and xylose, and capable of their co-utilization. The species has previously mainly been studied for its capacity to produce secreted biosurfactants in the form of mannosylerythritol lipids, but it is also an oleaginous species capable of accumulating high levels of triacylglycerol storage lipids during nutrient starvation. In this study, we aimed to further characterize the oleaginous nature of P. hubeiensis by evaluating metabolism and gene expression responses during storage lipid formation conditions with glucose or xylose as a carbon source. The genome of the recently isolated P. hubeiensis BOT-O strain was sequenced using MinION long-read sequencing and resulted in the most contiguous P. hubeiensis assembly to date with 18.95 Mb in 31 contigs. Using transcriptome data as experimental support, we generated the first mRNA-supported P. hubeiensis genome annotation and identified 6540 genes. 80% of the predicted genes were assigned functional annotations based on protein homology to other yeasts. Based on the annotation, key metabolic pathways in BOT-O were reconstructed, including pathways for storage lipids, mannosylerythritol lipids and xylose assimilation. BOT-O was confirmed to consume glucose and xylose at equal rates, but during mixed glucose-xylose cultivation glucose was found to be taken up faster. Differential expression analysis revealed that only a total of 122 genes were significantly differentially expressed at a cut-off of |log2 fold change| ≥ 2 when comparing cultivation on xylose with glucose, during exponential growth and during nitrogen-starvation. Of these 122 genes, a core-set of 24 genes was identified that were differentially expressed at all time points. 
Nitrogen-starvation resulted in a larger transcriptional effect, with a total of 1179 genes with significant expression changes at the designated fold change cut-off compared with exponential growth on either glucose or xylose.
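The fold-change cut-off used in this abstract, |log2 fold change| ≥ 2, corresponds to at least a fourfold change in either direction. The sketch below shows only that threshold; the function names and the expression values in the usage note are hypothetical, and the significance testing that the abstract also mentions is not reproduced here.

```python
import math


def log2_fold_change(treated: float, control: float) -> float:
    """Return log2(treated / control) for two positive expression values."""
    if treated <= 0 or control <= 0:
        raise ValueError("expression values must be positive")
    return math.log2(treated / control)


def passes_fold_change_cutoff(treated: float, control: float,
                              cutoff: float = 2.0) -> bool:
    """True if |log2 fold change| >= cutoff (the abstract uses a cut-off of 2)."""
    return abs(log2_fold_change(treated, control)) >= cutoff
```

For example, a hypothetical gene with expression 8 on xylose and 2 on glucose has a log2 fold change of 2 and passes the cut-off, while a gene with values 3 and 2 does not.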
Affiliation(s)
- Friederike Mierke
  - Food and Nutrition Science, Department of Life Sciences, Chalmers University of Technology, Gothenburg, Sweden; Systems and Synthetic Biology, Department of Life Sciences, Chalmers University of Technology, Gothenburg, Sweden
- Daniel P Brink
  - Systems and Synthetic Biology, Department of Life Sciences, Chalmers University of Technology, Gothenburg, Sweden; Applied Microbiology, Department of Chemistry, Lund University, Lund, Sweden
- Joakim Norbeck
  - Systems and Synthetic Biology, Department of Life Sciences, Chalmers University of Technology, Gothenburg, Sweden
- Verena Siewers
  - Systems and Synthetic Biology, Department of Life Sciences, Chalmers University of Technology, Gothenburg, Sweden.
- Thomas Andlid
  - Food and Nutrition Science, Department of Life Sciences, Chalmers University of Technology, Gothenburg, Sweden
|
4
|
Ikegami H, Noguchi S, Fukuda K, Akata K, Yamasaki K, Kawanami T, Mukae H, Yatera K. Refinement of microbiota analysis of specimens from patients with respiratory infections using next-generation sequencing. Sci Rep 2021; 11:19534. [PMID: 34599245 PMCID: PMC8486753 DOI: 10.1038/s41598-021-98985-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/24/2021] [Accepted: 09/16/2021] [Indexed: 12/12/2022] Open
Abstract
Next-generation sequencing (NGS) technologies have been applied in bacterial flora analysis. However, there is no standardized protocol, and the optimal clustering threshold for estimating bacterial species in respiratory infection specimens is unknown. This study was conducted to investigate the optimal threshold for clustering 16S ribosomal RNA gene sequences into operational taxonomic units (OTUs) by comparing the results of NGS technology with those of the Sanger method, which has a higher accuracy of sequence per single read than NGS technology. This study included 45 patients with pneumonia with aspiration risks and 35 patients with lung abscess. Compared to the Sanger method, the concordance rates of NGS technology (clustered at 100%, 99%, and 97% homology) with the predominant phylotype were 78.8%, 71.3%, and 65.0%, respectively. With respect to the specimens dominated by the Streptococcus mitis group, containing several important causative agents of pneumonia, Bray-Curtis dissimilarity revealed that the OTUs obtained at 100% clustering threshold (versus those obtained at 99% and 97% thresholds; medians of 0.35, 0.69, and 0.71, respectively) were more similar to those obtained by the Sanger method, with statistical significance (p < 0.05). Clustering with 100% sequence identity is necessary when analyzing the microbiota of respiratory infections using NGS technology.
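The Bray-Curtis dissimilarity reported in this abstract compares two abundance profiles and ranges from 0 (identical) to 1 (no shared taxa). The sketch below uses the standard definition, the sum of absolute differences divided by the sum of all counts; the function name and the count vectors in the usage note are hypothetical and are not data from the study.

```python
from typing import Sequence


def bray_curtis(u: Sequence[float], v: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two non-negative abundance vectors.

    Both vectors must list the same taxa in the same order.
    """
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    total = sum(u) + sum(v)
    if total == 0:
        raise ValueError("at least one abundance must be non-zero")
    # Sum of per-taxon absolute differences, normalized by total abundance.
    return sum(abs(a - b) for a, b in zip(u, v)) / total
```

For instance, hypothetical profiles `[6, 4]` and `[4, 6]` differ by 4 counts out of 20 in total, giving a dissimilarity of 0.2, whereas profiles with no taxa in common give 1.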
Affiliation(s)
- Hiroaki Ikegami
  - Department of Respiratory Medicine, University of Occupational and Environmental Health, Japan, 1-1 Iseigaoka, Yahatanishi-ku, Kitakyushu-city, Fukuoka, 807-8555, Japan
- Shingo Noguchi
  - Department of Respiratory Medicine, University of Occupational and Environmental Health, Japan, 1-1 Iseigaoka, Yahatanishi-ku, Kitakyushu-city, Fukuoka, 807-8555, Japan
- Kazumasa Fukuda
  - Department of Microbiology, University of Occupational and Environmental Health, Japan, Kitakyushu, Japan
- Kentaro Akata
  - Department of Respiratory Medicine, University of Occupational and Environmental Health, Japan, 1-1 Iseigaoka, Yahatanishi-ku, Kitakyushu-city, Fukuoka, 807-8555, Japan
- Kei Yamasaki
  - Department of Respiratory Medicine, University of Occupational and Environmental Health, Japan, 1-1 Iseigaoka, Yahatanishi-ku, Kitakyushu-city, Fukuoka, 807-8555, Japan
- Toshinori Kawanami
  - Department of Respiratory Medicine, University of Occupational and Environmental Health, Japan, 1-1 Iseigaoka, Yahatanishi-ku, Kitakyushu-city, Fukuoka, 807-8555, Japan
- Hiroshi Mukae
  - Department of Respiratory Medicine, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Kazuhiro Yatera
  - Department of Respiratory Medicine, University of Occupational and Environmental Health, Japan, 1-1 Iseigaoka, Yahatanishi-ku, Kitakyushu-city, Fukuoka, 807-8555, Japan.
|
5
|
Vila Nova M, Durimel K, La K, Felten A, Bessières P, Mistou MY, Mariadassou M, Radomski N. Genetic and metabolic signatures of Salmonella enterica subsp. enterica associated with animal sources at the pangenomic scale. BMC Genomics 2019; 20:814. [PMID: 31694533 PMCID: PMC6836353 DOI: 10.1186/s12864-019-6188-x] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 06/24/2019] [Accepted: 10/15/2019] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Salmonella enterica subsp. enterica is a public health issue related to food safety, and its adaptation to animal sources remains poorly described at the pangenome scale. Firstly, serovars presenting potential mono- and multi-animal sources were selected from a curated and synthesized subset of Enterobase. The corresponding sequencing reads were downloaded from the European Nucleotide Archive (ENA) providing a balanced dataset of 440 Salmonella genomes in terms of serovars and sources (i). Secondly, the coregenome variants and accessory genes were detected (ii). Thirdly, single nucleotide polymorphisms and small insertions/deletions from the coregenome, as well as the accessory genes, were associated with animal sources based on a microbial Genome Wide Association Study (GWAS) integrating an advanced correction of the population structure (iii). Lastly, a Gene Ontology Enrichment Analysis (GOEA) was applied to emphasize metabolic pathways mainly impacted by the pangenomic mutations associated with animal sources (iv). RESULTS Based on a genome dataset including Salmonella serovars from mono- and multi-animal sources (i), 19,130 accessory genes and 178,351 coregenome variants were identified (ii). Among these pangenomic mutations, 52 genomic signatures (iii) and 9 over-enriched metabolic signatures (iv) were associated with avian, bovine, swine and fish sources by GWAS and GOEA, respectively. CONCLUSIONS Our results suggest that the genetic and metabolic determinants of Salmonella adaptation to animal sources may have been driven by the natural feeding environment of the animal, distinct livestock diets modified by humans, environmental stimuli, physiological properties of the animal itself, and work habits for health protection of livestock.
Affiliation(s)
- Meryl Vila Nova
  - French Agency for Food, Environmental and Occupational Health and Safety (Anses), Laboratory for Food Safety (LSAL), Paris-Est University, Maisons-Alfort, France
  - Applied Mathematics and Computer Science, from Genomes to the Environment (MaIAGE), French National Institute for Agricultural Research (INRA), Université Paris-Saclay, 78350, Jouy-en-Josas, France
- Kévin Durimel
  - French Agency for Food, Environmental and Occupational Health and Safety (Anses), Laboratory for Food Safety (LSAL), Paris-Est University, Maisons-Alfort, France
- Kévin La
  - French Agency for Food, Environmental and Occupational Health and Safety (Anses), Laboratory for Food Safety (LSAL), Paris-Est University, Maisons-Alfort, France
- Arnaud Felten
  - French Agency for Food, Environmental and Occupational Health and Safety (Anses), Laboratory for Food Safety (LSAL), Paris-Est University, Maisons-Alfort, France
- Philippe Bessières
  - Applied Mathematics and Computer Science, from Genomes to the Environment (MaIAGE), French National Institute for Agricultural Research (INRA), Université Paris-Saclay, 78350, Jouy-en-Josas, France
- Michel-Yves Mistou
  - French Agency for Food, Environmental and Occupational Health and Safety (Anses), Laboratory for Food Safety (LSAL), Paris-Est University, Maisons-Alfort, France
- Mahendra Mariadassou
  - Applied Mathematics and Computer Science, from Genomes to the Environment (MaIAGE), French National Institute for Agricultural Research (INRA), Université Paris-Saclay, 78350, Jouy-en-Josas, France
- Nicolas Radomski
  - French Agency for Food, Environmental and Occupational Health and Safety (Anses), Laboratory for Food Safety (LSAL), Paris-Est University, Maisons-Alfort, France.
|
6
|
Low AJ, Koziol AG, Manninger PA, Blais B, Carrillo CD. ConFindr: rapid detection of intraspecies and cross-species contamination in bacterial whole-genome sequence data. PeerJ 2019; 7:e6995. [PMID: 31183253 PMCID: PMC6546082 DOI: 10.7717/peerj.6995] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 01/22/2019] [Accepted: 04/20/2019] [Indexed: 12/16/2022] Open
Abstract
Whole-genome sequencing (WGS) of bacterial pathogens is currently widely used to support public-health investigations. The ability to assess WGS data quality is critical to underpin the reliability of downstream analyses. Sequence contamination is a quality issue that could potentially impact WGS-based findings; however, existing tools do not readily identify contamination from closely-related organisms. To address this gap, we have developed a computational pipeline, ConFindr, for detection of intraspecies contamination. ConFindr determines the presence of contaminating sequences based on the identification of multiple alleles of core, single-copy, ribosomal-protein genes in raw sequencing reads. The performance of this tool was assessed using simulated and lab-generated Illumina short-read WGS data with varying levels of contamination (0-20% of reads) and varying genetic distance between the designated target and contaminant strains. Intraspecies and cross-species contamination was reliably detected in datasets containing 5% or more reads from a second, unrelated strain. ConFindr detected intraspecies contamination with higher sensitivity than existing tools, while also being able to automatically detect cross-species contamination with similar sensitivity. The implementation of ConFindr in quality-control pipelines will help to improve the reliability of WGS databases as well as the accuracy of downstream analyses. ConFindr is written in Python, and is freely available under the MIT License at github.com/OLC-Bioinformatics/ConFindr.
Affiliation(s)
- Andrew J Low
  - Ottawa Laboratory (Carling), Canadian Food Inspection Agency, Ottawa, Ontario, Canada
- Adam G Koziol
  - Ottawa Laboratory (Carling), Canadian Food Inspection Agency, Ottawa, Ontario, Canada
- Paul A Manninger
  - Ottawa Laboratory (Carling), Canadian Food Inspection Agency, Ottawa, Ontario, Canada
- Burton Blais
  - Ottawa Laboratory (Carling), Canadian Food Inspection Agency, Ottawa, Ontario, Canada
- Catherine D Carrillo
  - Ottawa Laboratory (Carling), Canadian Food Inspection Agency, Ottawa, Ontario, Canada
|
7
|
Abstract
In recent years, there have been numerous technological advances in the field of molecular biology; these include next- and third-generation sequencing of DNA genomes and mRNA transcripts and mass spectrometry of proteins. Perhaps, however, it is genome sequencing that impacts a virologist the most. In 2017, more than 480 complete genome sequences of poxviruses have been generated, and are constantly used in many different ways by almost all molecular virologists. Matching this growth in data acquisition is an explosion of the relatively new field of bioinformatics, providing databases to store and organize this valuable/expensive data and algorithms to analyze it. For the bench virologist, access to intuitive, easy-to-use software is often critical for performing bioinformatics-based experiments. Three common hurdles for the researcher are (1) selection, retrieval, and reformatting of genomics data from large databases; (2) use of tools to compare/analyze the genomics data; and (3) display and interpretation of complex sets of results. This chapter is directed at the bench virologist and describes the software that helps overcome these obstacles, with a focus on the comparison and analysis of poxvirus genomes. Although poxvirus genomes are stored in public databases such as GenBank, this resource can be cumbersome and tedious to use if large amounts of data must be collected. Therefore, we also highlight our Viral Orthologous Clusters database system and integrated tools that we developed specifically for the management and analysis of complete viral genomes.
Affiliation(s)
- Shin-Lin Tu
  - Department of Biochemistry and Microbiology, University of Victoria, Victoria, BC, Canada
- Chris Upton
  - Department of Biochemistry and Microbiology, University of Victoria, Victoria, BC, Canada.
|
8
|
Motro Y, Carriço JA, Friedrich AW, Rossen JW, Moran-Gilad J. ESCMID postgraduate education course: regional capacity building for integration of next-generation sequencing in the clinical microlab. Microbes Infect 2018; 20:275-280. [DOI: 10.1016/j.micinf.2018.02.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/27/2018] [Revised: 02/15/2018] [Accepted: 02/17/2018] [Indexed: 01/16/2023]
|
9
|
Abstract
Illumina technology is widely used for bacterial whole-genome sequencing because of its simplicity, low cost, reliability, and the abundance of software for handling raw data. It belongs to the second generation of whole-genome sequencing technologies, which yield a large number of short reads covering genome regions. Genomic DNA is fragmented into short pieces. The DNA fragments are amplified to increase the signal and are read using sequencing-by-synthesis. Millions of short reads of up to 100-300 bp in length are assembled into continuous sequences. Mate-pair technology allows long repeats to be resolved. Here, we describe the principles of standard and mate-pair library preparation from DNA samples, library quality control, sequencing with the MiSeq instrument, and the subsequent bioinformatic treatment of the data. Software for genome assembly and completion is listed, covering tools that assemble, map, annotate, visualize, edit, and otherwise manipulate genomic sequences. Whole-genome sequencing of steroid-producing Actinobacteria using these protocols is given as an example.
|
10
|
Machado MP, Ribeiro-Gonçalves B, Silva M, Ramirez M, Carriço JA. Epidemiological Surveillance and Typing Methods to Track Antibiotic Resistant Strains Using High Throughput Sequencing. Methods Mol Biol 2017; 1520:331-56. [PMID: 27873262 DOI: 10.1007/978-1-4939-6634-9_20] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 04/12/2023]
Abstract
High-Throughput Sequencing (HTS) technologies transformed the microbial typing and molecular epidemiology field by providing the cost-effective ability for researchers to probe draft genomes, not only for epidemiological markers but also for antibiotic resistance and virulence determinants. In this chapter, we provide protocols for the analysis of HTS data for the determination of multilocus sequence typing (MLST) information and for determining presence or absence of antibiotic resistance genes.
|
11
|
Carriço JA, Rossi M, Moran-Gilad J, Van Domselaar G, Ramirez M. A primer on microbial bioinformatics for nonbioinformaticians. Clin Microbiol Infect 2018; 24:342-349. [PMID: 29309933 DOI: 10.1016/j.cmi.2017.12.015] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 06/15/2017] [Revised: 11/13/2017] [Accepted: 12/22/2017] [Indexed: 01/19/2023]
Abstract
BACKGROUND Presently, the bottleneck in the deployment of high-throughput sequencing technology is the ability to analyse the increasing amount of data produced in a fit-for-purpose manner. The field of microbial bioinformatics is thriving and quickly adapting to technological changes, which creates difficulties for nonbioinformaticians in following the complexity and increasingly obscure jargon of this field. AIMS This review is directed towards nonbioinformaticians who wish to gain understanding of the overall microbial bioinformatic processes, from raw data obtained from sequencers to final outputs. SOURCES The software and analytical strategies reviewed are based on the personal experience of the authors. CONTENT The bioinformatic processes of transforming raw reads to actionable information in a clinical and epidemiologic context are explained. We review the advantages and limitations of two major strategies currently applied: read mapping, which is the comparison with a predefined reference genome, and de novo assembly, which is the unguided assembly of the raw data. Finally, we discuss the main analytical methodologies and the most frequently used freely available software and its application in the context of bacterial infectious disease management. IMPLICATIONS High-throughput sequencing technologies are overhauling outbreak investigation and epidemiologic surveillance while creating new challenges due to the amount and complexity of data generated. The continuously evolving field of microbial bioinformatics is required for stakeholders to fully harness the power of these new technologies.
Affiliation(s)
- J A Carriço
  - Instituto de Microbiologia, Instituto de Medicina Molecular, Faculdade de Medicina, Universidade de Lisboa, Lisboa, Portugal.
- M Rossi
  - Department of Food Hygiene and Environmental Health, Faculty of Veterinary Medicine, University of Helsinki, Helsinki, Finland
- J Moran-Gilad
  - Department of Health Systems Management, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer-Sheva, Israel; Public Health Services, Ministry of Health, Jerusalem, Israel; ESCMID Study Group for Genomic and Molecular Diagnostics (ESGMD), Basel, Switzerland
- G Van Domselaar
  - National Microbiology Laboratory, Public Health Agency of Canada, 1015 Arlington St, Winnipeg, MB, R3E 3R2, Canada; Department of Medical Microbiology and Infectious Diseases, University of Manitoba, 745 Bannatyne Avenue, Winnipeg, MB, R3E 0J9, Canada
- M Ramirez
  - Instituto de Microbiologia, Instituto de Medicina Molecular, Faculdade de Medicina, Universidade de Lisboa, Lisboa, Portugal
|
12
|
Jünemann S, Kleinbölting N, Jaenicke S, Henke C, Hassa J, Nelkner J, Stolze Y, Albaum SP, Schlüter A, Goesmann A, Sczyrba A, Stoye J. Bioinformatics for NGS-based metagenomics and the application to biogas research. J Biotechnol 2017; 261:10-23. [PMID: 28823476 DOI: 10.1016/j.jbiotec.2017.08.012] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 03/20/2017] [Revised: 08/08/2017] [Accepted: 08/09/2017] [Indexed: 12/19/2022]
Abstract
Metagenomics has proven to be one of the most important research fields for microbial ecology during the last decade. Starting from 16S rRNA marker gene analysis for the characterization of community compositions to whole metagenome shotgun sequencing which additionally allows for functional analysis, metagenomics has been applied in a wide spectrum of research areas. The cost reduction paired with the increase in the amount of data due to the advent of next-generation sequencing led to a rapidly growing demand for bioinformatic software in metagenomics. By now, a large number of tools that can be used to analyze metagenomic datasets have been developed. The Bielefeld-Gießen center for microbial bioinformatics as part of the German Network for Bioinformatics Infrastructure bundles and imparts expert knowledge in the analysis of metagenomic datasets, especially in research on microbial communities involved in anaerobic digestion residing in biogas reactors. In this review, we give an overview of the field of metagenomics and introduce important bioinformatic tools and possible workflows, accompanied by application examples of biogas surveys successfully conducted at the Center for Biotechnology of Bielefeld University.
Affiliation(s)
- Sebastian Jünemann
  - Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany; Faculty of Technology, Bielefeld University, Bielefeld, Germany.
- Nils Kleinbölting
  - Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
- Sebastian Jaenicke
  - Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany; Bioinformatics and Systems Biology, Justus-Liebig-Universität, Gießen, Germany
- Christian Henke
  - Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
- Julia Hassa
  - Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
- Johanna Nelkner
  - Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
- Yvonne Stolze
  - Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
- Stefan P Albaum
  - Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
- Andreas Schlüter
  - Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
- Alexander Goesmann
  - Bioinformatics and Systems Biology, Justus-Liebig-Universität, Gießen, Germany
- Alexander Sczyrba
  - Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany; Faculty of Technology, Bielefeld University, Bielefeld, Germany
- Jens Stoye
  - Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany; Faculty of Technology, Bielefeld University, Bielefeld, Germany
|
13
|
Heyer R, Schallert K, Zoun R, Becher B, Saake G, Benndorf D. Challenges and perspectives of metaproteomic data analysis. J Biotechnol 2017; 261:24-36. [PMID: 28663049 DOI: 10.1016/j.jbiotec.2017.06.1201] [Citation(s) in RCA: 88] [Impact Index Per Article: 12.6] [Received: 02/15/2017] [Revised: 06/20/2017] [Accepted: 06/23/2017] [Indexed: 02/07/2023]
Abstract
In nature, microorganisms live in complex microbial communities. Comprehensive taxonomic and functional knowledge about microbial communities supports medical and technical applications such as fecal diagnostics as well as the operation of biogas plants or waste water treatment plants. Furthermore, microbial communities are crucial for the global carbon and nitrogen cycle in soil and in the ocean. Among the methods available for investigation of microbial communities, metaproteomics can approximate the activity of microorganisms by investigating the protein content of a sample. Although metaproteomics is a very powerful method, issues within the bioinformatic evaluation impede its success. In particular, construction of databases for protein identification, grouping of redundant proteins as well as taxonomic and functional annotation pose big challenges. Furthermore, growing amounts of data within a metaproteomics study require dedicated algorithms and software. This review summarizes recent metaproteomics software and addresses the introduced issues in detail.
Affiliation(s)
- Robert Heyer
  - Otto von Guericke University, Bioprocess Engineering, Universitätsplatz 2, 39106 Magdeburg, Germany.
- Kay Schallert
  - Otto von Guericke University, Bioprocess Engineering, Universitätsplatz 2, 39106 Magdeburg, Germany.
- Roman Zoun
  - Otto von Guericke University, Institute for Technical and Business Information Systems, Universitätsplatz 2, 39106 Magdeburg, Germany.
- Beatrice Becher
  - Otto von Guericke University, Bioprocess Engineering, Universitätsplatz 2, 39106 Magdeburg, Germany.
- Gunter Saake
  - Otto von Guericke University, Institute for Technical and Business Information Systems, Universitätsplatz 2, 39106 Magdeburg, Germany.
- Dirk Benndorf
  - Otto von Guericke University, Bioprocess Engineering, Universitätsplatz 2, 39106 Magdeburg, Germany; Max Planck Institute for Dynamics of Complex Technical Systems, Bioprocess Engineering, Sandtorstraße 1, 39106, Magdeburg, Germany.
|
14
|
Dimitrov KM, Sharma P, Volkening JD, Goraichuk IV, Wajid A, Rehmani SF, Basharat A, Shittu I, Joannis TM, Miller PJ, Afonso CL. A robust and cost-effective approach to sequence and analyze complete genomes of small RNA viruses. Virol J 2017; 14:72. [PMID: 28388925 PMCID: PMC5384157 DOI: 10.1186/s12985-017-0741-5] [Citation(s) in RCA: 65] [Impact Index Per Article: 9.3] [Received: 02/16/2017] [Accepted: 03/29/2017] [Indexed: 01/26/2023] Open
Abstract
Background Next-generation sequencing (NGS) allows ultra-deep sequencing of nucleic acids. The use of sequence-independent amplification of viral nucleic acids without utilization of target-specific primers provides advantages over traditional sequencing methods and allows detection of unsuspected variants and co-infecting agents. However, NGS is not widely used for small RNA viruses because of incorrectly perceived cost estimates and inefficient utilization of freely available bioinformatics tools. Methods In this study, we have utilized NGS-based random sequencing of total RNA combined with barcode multiplexing of libraries to quickly, effectively and simultaneously characterize the genomic sequences of multiple avian paramyxoviruses. Thirty libraries were prepared from diagnostic samples amplified in allantoic fluids and their total RNAs were sequenced in a single flow cell on an Illumina MiSeq instrument. After digital normalization, data were assembled using the MIRA assembler within a customized workflow on the Galaxy platform. Results Twenty-eight avian paramyxovirus 1 (APMV-1), one APMV-13, four avian influenza and two infectious bronchitis virus complete or nearly complete genome sequences were obtained from the single run. The 29 avian paramyxovirus genomes displayed 99.6% mean coverage based on bases with Phred quality scores of 30 or more. The lower and upper quartiles of sample median depth per position for those 29 samples were 2984 and 6894, respectively, indicating coverage across samples sufficient for deep variant analysis. Sample processing and library preparation took approximately 25–30 h, the sequencing run took 39 h, and processing through the Galaxy workflow took approximately 2–3 h. The cost of all steps, excluding labor, was estimated to be 106 USD per sample. Conclusions This work describes an efficient multiplexing NGS approach, a detailed analysis workflow, and customized tools for the characterization of the genomes of RNA viruses. 
The combination of multiplexing NGS technology with the Galaxy workflow platform resulted in a fast, user-friendly, and cost-efficient protocol for the simultaneous characterization of multiple full-length viral genomes. Twenty-nine full-length or near-full-length APMV genomes with a high median depth were successfully sequenced out of 30 samples. The applied de novo assembly approach also allowed identification of mixed viral populations in some of the samples. Electronic supplementary material The online version of this article (doi:10.1186/s12985-017-0741-5) contains supplementary material, which is available to authorized users.
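The time and cost figures quoted in this abstract can be combined into a rough per-run estimate. The following is a minimal Python sketch that uses only the numbers stated above; the variable names are illustrative, and the total run cost assumes (this is not stated in the abstract) that the 106 USD per-sample estimate applies uniformly to all 30 libraries.

```python
# Step durations in hours as (minimum, maximum), taken from the abstract.
steps = {
    "sample processing and library preparation": (25, 30),
    "sequencing run": (39, 39),
    "Galaxy workflow": (2, 3),
}

# Total turnaround is the sum of the per-step bounds.
total_min_h = sum(low for low, _ in steps.values())
total_max_h = sum(high for _, high in steps.values())

# Cost estimate from the abstract (labor excluded).
cost_per_sample_usd = 106
libraries = 30

# Assumption: the per-sample figure applies equally to every library in the run.
run_cost_usd = cost_per_sample_usd * libraries

print(f"Turnaround: {total_min_h}-{total_max_h} h")
print(f"Estimated run cost for {libraries} libraries: {run_cost_usd} USD")
```

Under these assumptions the sketch gives a turnaround of roughly three days from sample to assembled genomes.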
Affiliation(s)
- Kiril M Dimitrov
  - Exotic and Emerging Avian Viral Diseases Research Unit, Southeast Poultry Research Laboratory, US National Poultry Research Center, Agricultural Research Service, USDA, 934 College Station Road, Athens, GA, 30605, USA
- Poonam Sharma
  - Exotic and Emerging Avian Viral Diseases Research Unit, Southeast Poultry Research Laboratory, US National Poultry Research Center, Agricultural Research Service, USDA, 934 College Station Road, Athens, GA, 30605, USA
- Iryna V Goraichuk
  - Exotic and Emerging Avian Viral Diseases Research Unit, Southeast Poultry Research Laboratory, US National Poultry Research Center, Agricultural Research Service, USDA, 934 College Station Road, Athens, GA, 30605, USA
  - National Scientific Center Institute of Experimental and Clinical Veterinary Medicine, 83 Pushkinskaya Street, Kharkiv, 61023, Ukraine
- Abdul Wajid
  - Quality Operations Laboratory (QOL), University of Veterinary and Animal Sciences, Syed Abdul Qadir Jilani Road, Lahore, 54000, Pakistan
  - Institute of Biochemistry and Biotechnology, University of Veterinary and Animal Sciences, Syed Abdul Qadir Jilani Road, Lahore, 54000, Pakistan
- Shafqat Fatima Rehmani
  - Quality Operations Laboratory (QOL), University of Veterinary and Animal Sciences, Syed Abdul Qadir Jilani Road, Lahore, 54000, Pakistan
- Asma Basharat
  - Quality Operations Laboratory (QOL), University of Veterinary and Animal Sciences, Syed Abdul Qadir Jilani Road, Lahore, 54000, Pakistan
- Ismaila Shittu
  - Regional Laboratory for Animal Influenza and other Transboundary Animal Diseases, National Veterinary Research Institute, PMB01, Vom, 930010, Plateau State, Nigeria
- Tony M Joannis
  - Regional Laboratory for Animal Influenza and other Transboundary Animal Diseases, National Veterinary Research Institute, PMB01, Vom, 930010, Plateau State, Nigeria
- Patti J Miller
  - Exotic and Emerging Avian Viral Diseases Research Unit, Southeast Poultry Research Laboratory, US National Poultry Research Center, Agricultural Research Service, USDA, 934 College Station Road, Athens, GA, 30605, USA
- Claudio L Afonso
  - Exotic and Emerging Avian Viral Diseases Research Unit, Southeast Poultry Research Laboratory, US National Poultry Research Center, Agricultural Research Service, USDA, 934 College Station Road, Athens, GA, 30605, USA
|
15
|
Mellmann A, Andersen PS, Bletz S, Friedrich AW, Kohl TA, Lilje B, Niemann S, Prior K, Rossen JW, Harmsen D. High Interlaboratory Reproducibility and Accuracy of Next-Generation-Sequencing-Based Bacterial Genotyping in a Ring Trial. J Clin Microbiol 2017; 55:908-13. [PMID: 28053217 DOI: 10.1128/JCM.02242-16] [Citation(s) in RCA: 63] [Impact Index Per Article: 9.0] [Received: 11/06/2016] [Accepted: 12/22/2016] [Indexed: 12/15/2022] Open
Abstract
Today, next-generation whole-genome sequencing (WGS) is increasingly used to determine the genetic relationships of bacteria on a nearly whole-genome level for infection control purposes and molecular surveillance. Here, we conducted a multicenter ring trial comprising five laboratories to determine the reproducibility and accuracy of WGS-based typing. The participating laboratories sequenced 20 blind-coded Staphylococcus aureus DNA samples using 250-bp paired-end chemistry for library preparation in a single sequencing run on an Illumina MiSeq sequencer. The run acceptance criteria were sequencing outputs >5.6 Gb and Q30 read quality scores of >75%. Subsequently, spa typing, multilocus sequence typing (MLST), ribosomal MLST, and core genome MLST (cgMLST) were performed by the participants. Moreover, discrepancies in cgMLST target sequences in comparisons with the included and also published sequence of the quality control strain ATCC 25923 were resolved using Sanger sequencing. All five laboratories fulfilled the run acceptance criteria in a single sequencing run without any repetition. Of the 400 total possible typing results, 394 of the reported spa types, sequence types (STs), ribosomal STs (rSTs), and cgMLST cluster types were correct and identical among all laboratories; only six typing results were missing. An analysis of cgMLST allelic profiles corroborated this high reproducibility; only 3 of 183,927 (0.0016%) cgMLST allele calls were wrong. Sanger sequencing confirmed all 12 discrepancies of the ring trial results in comparison with the published sequence of ATCC 25923. In summary, this ring trial demonstrated the high reproducibility and accuracy of current next-generation sequencing-based bacterial typing for molecular surveillance when done with nearly completely locked-down methods.
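The counts reported in this abstract can be cross-checked with simple arithmetic. The following is a minimal Python sketch that uses only figures stated above; the names are illustrative and not taken from the article.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole

# Study design as described in the abstract.
laboratories = 5
samples = 20
typing_methods = 4  # spa typing, MLST, ribosomal MLST, cgMLST

# Every laboratory reports every typing method for every sample.
possible_results = laboratories * samples * typing_methods

correct_results = 394
missing_results = possible_results - correct_results

# cgMLST allele-level comparison.
wrong_allele_calls = 3
total_allele_calls = 183_927

typing_concordance = percent(correct_results, possible_results)
allele_error_rate = percent(wrong_allele_calls, total_allele_calls)

print(f"Possible typing results: {possible_results}")
print(f"Missing typing results: {missing_results}")
print(f"Correct typing results: {typing_concordance:.1f}%")
print(f"Wrong cgMLST allele calls: {allele_error_rate:.4f}%")
```

The product of laboratories, samples, and typing methods reproduces the 400 possible results, and the allele-level error rate matches the 0.0016% given in the abstract.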
|
16
|
Abstract
The number of large-scale genomics projects is increasing due to the availability of affordable high-throughput sequencing (HTS) technologies. The use of HTS for bacterial infectious disease research is attractive because one whole-genome sequencing (WGS) run can replace multiple assays for bacterial typing, molecular epidemiology investigations, and more in-depth pathogenomic studies. The computational resources and bioinformatics expertise required to accommodate and analyze the large amounts of data pose new challenges for researchers embarking on genomics projects for the first time. Here, we present a comprehensive overview of a bacterial genomics project from beginning to end, with a particular focus on the planning and computational requirements for HTS data, and provide a general understanding of the analytical concepts to develop a workflow that will meet the objectives and goals of HTS projects.
|
17
|
Li J, Batcha AMN, Grüning B, Mansmann UR. An NGS Workflow Blueprint for DNA Sequencing Data and Its Application in Individualized Molecular Oncology. Cancer Inform 2016; 14:87-107. [PMID: 27081306 PMCID: PMC4827795 DOI: 10.4137/cin.s30793] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 10/22/2015] [Revised: 03/02/2016] [Accepted: 03/17/2016] [Indexed: 12/23/2022] Open
Abstract
Next-generation sequencing (NGS) technologies that have advanced rapidly in the past few years possess the potential to classify diseases, decipher the molecular code of related cell processes, identify targets for decision-making on targeted therapy or prevention strategies, and predict clinical treatment response. Thus, NGS is on its way to revolutionize oncology. With the help of NGS, we can draw a finer map for the genetic basis of diseases and can improve our understanding of diagnostic and prognostic applications and therapeutic methods. Despite these advantages and its potential, NGS is facing several critical challenges, including reduction of sequencing cost, enhancement of sequencing quality, improvement of technical simplicity and reliability, and development of semiautomated and integrated analysis workflow. In order to address these challenges, we conducted a literature research and summarized a four-stage NGS workflow for providing a systematic review on NGS-based analysis, explaining the strength and weakness of diverse NGS-based software tools, and elucidating its potential connection to individualized medicine. By presenting this four-stage NGS workflow, we try to provide a minimal structural layout required for NGS data storage and reproducibility.
Affiliation(s)
- Jian Li
  - Institute for Medical Informatics, Biometry and Epidemiology, Ludwig Maximilian University of Munich, Munich, Germany
  - German Cancer Consortium (DKTK), Heidelberg, Germany
  - German Cancer Research Center (DKFZ), Heidelberg, Germany
- Aarif Mohamed Nazeer Batcha
  - Institute for Medical Informatics, Biometry and Epidemiology, Ludwig Maximilian University of Munich, Munich, Germany
  - German Cancer Consortium (DKTK), Heidelberg, Germany
  - German Cancer Research Center (DKFZ), Heidelberg, Germany
- Björn Grüning
  - Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University, Freiburg, Freiburg, Germany
  - Center for Biological Systems Analysis (ZBSA), University of Freiburg, Freiburg, Germany
- Ulrich R Mansmann
  - Institute for Medical Informatics, Biometry and Epidemiology, Ludwig Maximilian University of Munich, Munich, Germany
  - German Cancer Consortium (DKTK), Heidelberg, Germany
|
18
|
Lugli GA, Milani C, Mancabelli L, van Sinderen D, Ventura M. MEGAnnotator: a user-friendly pipeline for microbial genomes assembly and annotation. FEMS Microbiol Lett 2016; 363:fnw049. [DOI: 10.1093/femsle/fnw049] [Citation(s) in RCA: 79] [Impact Index Per Article: 9.9] [Accepted: 02/24/2016] [Indexed: 12/18/2022] Open
|
19
|
Alic AS, Ruzafa D, Dopazo J, Blanquer I. Objective review of de novo stand-alone error correction methods for NGS data. WIREs Comput Mol Sci 2016. [DOI: 10.1002/wcms.1239] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 12/14/2022]
Affiliation(s)
- Andy S. Alic
  - Institute of Instrumentation for Molecular Imaging (I3M); Universitat Politècnica de València; València Spain
- David Ruzafa
  - Departamento de Química Física e Instituto de Biotecnología, Facultad de Ciencias; Universidad de Granada; Granada Spain
- Joaquin Dopazo
  - Department of Computational Genomics; Príncipe Felipe Research Centre (CIPF); Valencia Spain
  - CIBER de Enfermedades Raras (CIBERER); Valencia Spain
  - Functional Genomics Node (INB) at CIPF; Valencia Spain
- Ignacio Blanquer
  - Institute of Instrumentation for Molecular Imaging (I3M); Universitat Politècnica de València; València Spain
  - Biomedical Imaging Research Group GIBI 2; Polytechnic University Hospital La Fe; Valencia Spain
|
20
|
Lin HH, Liao YC. Evaluation and Validation of Assembling Corrected PacBio Long Reads for Microbial Genome Completion via Hybrid Approaches. PLoS One 2015; 10:e0144305. [PMID: 26641475 PMCID: PMC4671558 DOI: 10.1371/journal.pone.0144305] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 08/14/2015] [Accepted: 11/16/2015] [Indexed: 11/23/2022] Open
Abstract
Despite the ever-increasing output of next-generation sequencing data along with developing assemblers, dozens to hundreds of gaps still exist in de novo microbial assemblies due to uneven coverage and large genomic repeats. Third-generation single-molecule, real-time (SMRT) sequencing technology avoids amplification artifacts and generates kilobase-long reads with the potential to complete microbial genome assembly. However, due to the low accuracy (~85%) of third-generation sequences, a considerable amount of long reads (>50X) are required for self-correction and for subsequent de novo assembly. Recently-developed hybrid approaches, using next-generation sequencing data and as few as 5X long reads, have been proposed to improve the completeness of microbial assembly. In this study we have evaluated the contemporary hybrid approaches and demonstrated that assembling corrected long reads (by runCA) produced the best assembly compared to long-read scaffolding (e.g., AHA, Cerulean and SSPACE-LongRead) and gap-filling (SPAdes). For generating corrected long reads, we further examined long-read correction tools, such as ECTools, LSC, LoRDEC, PBcR pipeline and proovread. We have demonstrated that three microbial genomes including Escherichia coli K12 MG1655, Meiothermus ruber DSM1279 and Pedobacter heparinus DSM2366 were successfully hybrid assembled by runCA into near-perfect assemblies using ECTools-corrected long reads. In addition, we developed a tool, Patch, which implements corrected long reads and pre-assembled contigs as inputs, to enhance microbial genome assemblies. With the additional 20X long reads, short reads of S. cerevisiae W303 were hybrid assembled into 115 contigs using the verified strategy, ECTools + runCA. Patch was subsequently applied to upgrade the assembly to a 35-contig draft genome. 
Our evaluation of the hybrid approaches shows that assembling the ECTools-corrected long reads via runCA generates near complete microbial genomes, suggesting that genome assembly could benefit from re-analyzing the available hybrid datasets that were not assembled in an optimal fashion.
Affiliation(s)
- Hsin-Hung Lin
  - Institute of Population Health Sciences, National Health Research Institutes, Miaoli County, Taiwan
- Yu-Chieh Liao
  - Institute of Population Health Sciences, National Health Research Institutes, Miaoli County, Taiwan
|
21
|
Mikalsen T, Pedersen T, Willems R, Coque TM, Werner G, Sadowy E, van Schaik W, Jensen LB, Sundsfjord A, Hegstad K. Investigating the mobilome in clinically important lineages of Enterococcus faecium and Enterococcus faecalis. BMC Genomics 2015; 16:282. [PMID: 25885771 PMCID: PMC4438569 DOI: 10.1186/s12864-015-1407-6] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Received: 12/11/2014] [Accepted: 02/27/2015] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND The success of Enterococcus faecium and E. faecalis evolving as multi-resistant nosocomial pathogens is associated with their ability to acquire and share adaptive traits, including antimicrobial resistance genes encoded by mobile genetic elements (MGEs). Here, we investigate this mobilome in successful hospital associated genetic lineages, E. faecium sequence type (ST)17 (n=10) and ST78 (n=10), E. faecalis ST6 (n=10) and ST40 (n=10) by DNA microarray analyses. RESULTS The hybridization patterns of 272 representative targets including plasmid backbones (n=85), transposable elements (n=85), resistance determinants (n=67), prophages (n=29) and clustered regularly interspaced short palindromic repeats (CRISPR)-cas sequences (n=6) separated the strains according to species, and for E. faecalis also according to STs. RCR-, Rep_3-, RepA_N- and Inc18-family plasmids were highly prevalent and with the exception of Rep_3, evenly distributed between the species. There was a considerable difference in the replicon profile, with rep17/pRUM, rep2/pRE25, rep14/EFNP1 and rep20/pLG1 dominating in E. faecium and rep9/pCF10, rep2/pRE25 and rep7 in E. faecalis strains. We observed an overall high correlation between the presence and absence of genes coding for resistance towards antibiotics, metals, biocides and their corresponding MGEs as well as their phenotypic antimicrobial susceptibility pattern. Although most IS families were represented in both E. faecalis and E. faecium, specific IS elements within these families were distributed in only one species. The prevalence of IS256-, IS3-, ISL3-, IS200/IS605-, IS110-, IS982- and IS4-transposases was significantly higher in E. faecium than E. faecalis, and that of IS110-, IS982- and IS1182-transposases in E. faecalis ST6 compared to ST40. Notably, the transposases of IS981, ISEfm1 and IS1678 that have only been reported in few enterococcal isolates were well represented in the E. faecium strains. E. faecalis ST40 strains harboured possible functional CRISPR-Cas systems, and still resistance and prophage sequences were generally well represented. CONCLUSIONS The targeted MGEs were highly prevalent among the selected STs, underlining their potential importance in the evolution of hospital-adapted lineages of enterococci. Although the propensity of inter-species horizontal gene transfer (HGT) must be emphasized, the considerable species-specificity of these MGEs indicates a separate vertical evolution of MGEs within each species, and for E. faecalis within each ST.
Affiliation(s)
- Theresa Mikalsen
  - Research group for Host-microbe Interactions, Department of Medical Biology, Faculty of Health Science, UiT - The Arctic University of Norway, Tromsø, Norway.
- Torunn Pedersen
  - Norwegian National Advisory Unit on Detection of Antimicrobial Resistance, Department of Microbiology and Infection Control, University Hospital of North Norway, Tromsø, Norway.
- Rob Willems
  - Department of Medical Microbiology, University Medical Center Utrecht, Utrecht, The Netherlands.
- Teresa M Coque
  - Servicio de Microbiologia, Hospital Ramón y Cajal, Instituto Ramón y Cajal de Investigación Sanitaria (IRYCIS), Madrid, Spain.
  - Centro de Investigación Biomédica en Red de Epidemiología y Salud Pública (CIBER-ESP), Madrid, Spain.
- Guido Werner
  - Division of Nosocomial Pathogens and Antibiotic Resistance, Robert Koch Institute, Wernigerode Branch, Wernigerode, Germany.
- Ewa Sadowy
  - Department of Molecular Microbiology, National Medicines Institute, ul. Chełmska 30/34, 00-725, Warsaw, Poland.
- Willem van Schaik
  - Department of Medical Microbiology, University Medical Center Utrecht, Utrecht, The Netherlands.
- Lars Bogø Jensen
  - Division of Food Microbiology, National Food Institute, Danish Technical University, Copenhagen V, Denmark.
- Arnfinn Sundsfjord
  - Research group for Host-microbe Interactions, Department of Medical Biology, Faculty of Health Science, UiT - The Arctic University of Norway, Tromsø, Norway.
  - Norwegian National Advisory Unit on Detection of Antimicrobial Resistance, Department of Microbiology and Infection Control, University Hospital of North Norway, Tromsø, Norway.
- Kristin Hegstad
  - Research group for Host-microbe Interactions, Department of Medical Biology, Faculty of Health Science, UiT - The Arctic University of Norway, Tromsø, Norway.
  - Norwegian National Advisory Unit on Detection of Antimicrobial Resistance, Department of Microbiology and Infection Control, University Hospital of North Norway, Tromsø, Norway.
|
22
|
PLOS ONE Staff. Correction: GABenchToB: a genome assembly benchmark tuned on bacteria and benchtop sequencers. PLoS One 2015; 10:e0118741. [PMID: 25789774 DOI: 10.1371/journal.pone.0118741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
|
23
|
Abstract
Determining the genomic sequences of microorganisms is the basis and prerequisite for understanding their biology and functional characterization. While the advent of low-cost, extremely high-throughput second-generation sequencing technologies and the parallel development of assembly algorithms have generated rapid and cost-effective genome assemblies, such assemblies are often unfinished, fragmented draft genomes as a result of short read lengths and long repeats present in multiple copies. Third-generation, PacBio sequencing technologies circumvented this problem by greatly increasing read length. Hybrid approaches including ALLPATHS-LG, PacBio corrected reads pipeline, SPAdes, and SSPACE-LongRead, and non-hybrid approaches--hierarchical genome-assembly process (HGAP) and PacBio corrected reads pipeline via self-correction--have therefore been proposed to utilize the PacBio long reads that can span many thousands of bases to facilitate the assembly of complete microbial genomes. However, standardized procedures that aim at evaluating and comparing these approaches are currently insufficient. To address the issue, we herein provide a comprehensive comparison by collecting datasets for the comparative assessment on the above-mentioned five assemblers. In addition to offering explicit and beneficial recommendations to practitioners, this study aims to aid in the design of a paradigm positioned to complete bacterial genome assembly.
Affiliation(s)
- Yu-Chieh Liao
  - Institute of Population Health Sciences, National Health Research Institutes, Miaoli 350, Taiwan
- Shu-Hung Lin
  - Institute of Population Health Sciences, National Health Research Institutes, Miaoli 350, Taiwan
- Hsin-Hung Lin
  - Institute of Population Health Sciences, National Health Research Institutes, Miaoli 350, Taiwan
|
24
|
Bletz S, Mellmann A, Rothgänger J, Harmsen D. Ensuring backwards compatibility: traditional genotyping efforts in the era of whole genome sequencing. Clin Microbiol Infect 2014; 21:347.e1-4. [PMID: 25658529 DOI: 10.1016/j.cmi.2014.11.005] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 09/09/2014] [Revised: 10/10/2014] [Accepted: 10/13/2014] [Indexed: 10/24/2022]
Abstract
When using next-generation whole genome sequencing (WGS), extraction of spa types from WGS data is essential for backwards compatibility with Sanger sequencing-based spa typing of methicillin-resistant Staphylococcus aureus (MRSA). We evaluated WGS-based spa typing with a 2×250 bp protocol in a diverse collection of 423 MRSA isolates using two pipelines that executed sequence quality-trimming and de novo assembly before spa typing. The SeqSphere(+) pipeline correctly typed 419 isolates (99.1%) whereas the CLCbio pipeline succeeded in 249 isolates (58.9%). In summary, WGS combined with an optimized de novo assembly enables nearly full compatibility with Sanger sequencing-based spa typing data.
Affiliation(s)
- S Bletz
  - Institute of Hygiene, University Hospital Münster, Münster, Germany
- A Mellmann
  - Institute of Hygiene, University Hospital Münster, Münster, Germany.
- D Harmsen
  - Department of Periodontology, University Hospital Münster, Münster, Germany
|