1
Li Y, van Houten CB, Boers SA, Jansen R, Cohen A, Engelhard D, Kraaij R, Hiltemann SD, Ju J, Fernández D, Mankoc C, González E, de Waal WJ, de Winter-de Groot KM, Wolfs TFW, Meijers P, Luijk B, Oosterheert JJ, Sankatsing SUC, Bossink AWJ, Stein M, Klein A, Ashkar J, Bamberger E, Srugo I, Odeh M, Dotan Y, Boico O, Etshtein L, Paz M, Navon R, Friedman T, Simon E, Gottlieb TM, Pri-Or E, Kronenfeld G, Oved K, Eden E, Stubbs AP, Bont LJ, Hays JP. The diagnostic value of nasal microbiota and clinical parameters in a multi-parametric prediction model to differentiate bacterial versus viral infections in lower respiratory tract infections. PLoS One 2022; 17:e0267140. [PMID: 35436301 PMCID: PMC9015155 DOI: 10.1371/journal.pone.0267140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2021] [Accepted: 04/04/2022] [Indexed: 11/18/2022] Open
Abstract
Background: The ability to accurately distinguish bacterial from viral infection would help clinicians better target antimicrobial therapy during suspected lower respiratory tract infections (LRTI). Although technological developments make it feasible to rapidly generate patient-specific microbiota profiles, evidence is required to show the clinical value of using microbiota data for infection diagnosis. In this study, we investigated whether adding nasal cavity microbiota profiles to readily available clinical information could improve machine learning classifiers that distinguish bacterial from viral infection in patients with LRTI.
Results: Various multi-parametric Random Forests classifiers were evaluated on the clinical and microbiota data of 293 LRTI patients for their accuracy in differentiating bacterial from viral infection. The most predictive variable was C-reactive protein (CRP). We observed a marginal prediction improvement when the 7 most prevalent nasal microbiota genera were added to the CRP model. In contrast, adding three clinical variables (absolute neutrophil count, consolidation on X-ray, and age group) to the CRP model significantly improved the prediction. The best model correctly predicted 85% of the ‘bacterial’ patients and 82% of the ‘viral’ patients using 13 clinical variables and 3 nasal cavity microbiota genera (Staphylococcus, Moraxella, and Streptococcus).
Conclusions: We developed high-accuracy multi-parametric machine learning classifiers to differentiate bacterial from viral infections in LRTI patients of various ages. We demonstrated the predictive value of four easy-to-collect clinical variables that facilitate personalized and accurate clinical decision-making. We observed that nasal cavity microbiota correlate with the clinical variables and thus may not add significant value to diagnostic algorithms that aim to differentiate bacterial from viral infections.
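The two headline figures in this abstract (85% of ‘bacterial’ and 82% of ‘viral’ patients correctly predicted) can be read as per-class recall values. The following is a minimal Python sketch of that calculation only; the function name `per_class_recall` and the toy labels are illustrative assumptions and are not taken from the study's data or code.

```python
from collections import Counter


def per_class_recall(true_labels, predicted_labels):
    """Return {label: fraction of cases with that true label predicted correctly}."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    # Number of cases per true class.
    totals = Counter(true_labels)
    # Number of correctly predicted cases per true class.
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: correct[label] / totals[label] for label in totals}


# Toy data (hypothetical, not from the paper): 4 bacterial and 5 viral cases.
truth = ["bacterial"] * 4 + ["viral"] * 5
preds = ["bacterial", "bacterial", "bacterial", "viral",
         "viral", "viral", "viral", "viral", "bacterial"]

recall = per_class_recall(truth, preds)
print(recall)
```

Reporting recall separately for each class, rather than a single overall accuracy, keeps the result interpretable when the two classes are of unequal size.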
Affiliation(s)
- Yunlei Li
  - Department of Pathology & Clinical Bioinformatics, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Chantal B. van Houten
  - Division of Paediatric Immunology and Infectious Diseases, University Medical Centre Utrecht, Utrecht University, Utrecht, The Netherlands
- Stefan A. Boers
  - Department of Medical Microbiology and Infectious Diseases, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Dan Engelhard
  - Division of Paediatric Infectious Disease Unit, Hadassah-Hebrew University Medical Centre, Jerusalem, Israel
- Robert Kraaij
  - Department of Internal Medicine, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Saskia D. Hiltemann
  - Department of Pathology & Clinical Bioinformatics, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Jie Ju
  - Department of Pathology & Clinical Bioinformatics, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Wouter J. de Waal
  - Department of Paediatrics, Diakonessenhuis, Utrecht, The Netherlands
- Karin M. de Winter-de Groot
  - Department of Paediatric Respiratory Medicine, University Medical Centre Utrecht, Utrecht University, Utrecht, The Netherlands
- Tom F. W. Wolfs
  - Division of Paediatric Immunology and Infectious Diseases, University Medical Centre Utrecht, Utrecht University, Utrecht, The Netherlands
- Pieter Meijers
  - Department of Paediatrics, Gelderse Vallei Hospital, Ede, The Netherlands
- Bart Luijk
  - Department of Respiratory Medicine, University Medical Centre Utrecht, Utrecht University, Utrecht, The Netherlands
- Jan Jelrik Oosterheert
  - Department of Internal Medicine and Infectious Diseases, University Medical Centre Utrecht, Utrecht University, Utrecht, The Netherlands
- Aik W. J. Bossink
  - Department of Respiratory Medicine, Diakonessenhuis Utrecht, Utrecht, The Netherlands
- Michal Stein
  - Department of Paediatrics, Hillel Yaffe Medical Centre, Hadera, Israel
- Adi Klein
  - Department of Paediatrics, Hillel Yaffe Medical Centre, Hadera, Israel
- Jalal Ashkar
  - Department of Paediatrics, Hillel Yaffe Medical Centre, Hadera, Israel
- Ellen Bamberger
  - MeMed, Tirat Carmel, Israel
  - Department of Paediatrics, Bnai Zion Medical Centre, Haifa, Israel
- Isaac Srugo
  - Department of Paediatrics, Bnai Zion Medical Centre, Haifa, Israel
- Majed Odeh
  - Department of Internal Medicine A, Bnai Zion Medical Centre, Haifa, Israel
- Yaniv Dotan
  - Pulmonary Division, Rambam Health Care Campus, Haifa, Israel
- Andrew P. Stubbs
  - Department of Pathology & Clinical Bioinformatics, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Louis J. Bont
  - Division of Paediatric Immunology and Infectious Diseases, University Medical Centre Utrecht, Utrecht University, Utrecht, The Netherlands
- John P. Hays
  - Department of Medical Microbiology and Infectious Diseases, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, The Netherlands
2
Heikema AP, Jansen R, Hiltemann SD, Hays JP, Stubbs AP. WeFaceNano: a user-friendly pipeline for complete ONT sequence assembly and detection of antibiotic resistance in multi-plasmid bacterial isolates. BMC Microbiol 2021; 21:171. [PMID: 34098864 PMCID: PMC8186029 DOI: 10.1186/s12866-021-02225-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2020] [Accepted: 05/13/2021] [Indexed: 11/10/2022] Open
Abstract
Background: Bacterial plasmids often carry antibiotic resistance genes and are a significant factor in the spread of antibiotic resistance. The ability to completely assemble plasmid sequences would facilitate the localization of antibiotic resistance genes, the identification of genes that promote plasmid transmission and the accurate tracking of plasmid mobility. However, complete assembly of plasmid sequences using the currently most widely used sequencing platform (Illumina-based sequencing) is limited by the short read lengths it generates. The long-read Oxford Nanopore Technologies (ONT) sequencing platform overcomes this limitation. Still, the assembly of plasmid sequence data remains challenging due to software incompatibility with long reads and the error rate of ONT sequencing. Bioinformatics pipelines have been developed for ONT-generated sequencing but require computational skills that are frequently beyond the abilities of scientific researchers. To overcome this challenge, the authors developed ‘WeFaceNano’, a user-friendly Web interFace for rapid assembly and analysis of plasmid DNA sequences generated using the ONT platform. WeFaceNano includes: a read statistics report; two assemblers (Miniasm and Flye); BLAST searching; the detection of antibiotic resistance and replicon genes; and several plasmid visualizations. A user-friendly interface displays the main features of WeFaceNano and gives access to the analysis tools.
Results: Publicly available ONT sequence data of 21 plasmids were used to validate WeFaceNano, with plasmid assemblies and anti-microbial resistance gene detection being concordant with the published results. Interestingly, the “Flye” assembler with “meta” settings generated the most complete plasmids.
Conclusions: WeFaceNano is a user-friendly open-source software pipeline suitable for accurate plasmid assembly and the detection of anti-microbial resistance genes in (clinical) samples where multiple plasmids can be present.
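The abstract lists a read statistics report among the pipeline's components but does not say which statistics it contains. As a rough illustration of what such a report might compute for long-read data, here is a minimal Python sketch; the function name `read_length_stats`, the choice of metrics (including N50) and the toy read lengths are all assumptions and do not describe WeFaceNano's actual implementation.

```python
def read_length_stats(lengths):
    """Summarize a list of read lengths (in bases) with common long-read metrics."""
    if not lengths:
        raise ValueError("at least one read length is required")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    # N50: the length L such that reads of length >= L together
    # contain at least half of all sequenced bases.
    running = 0
    n50 = ordered[-1]
    for length in ordered:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "reads": len(ordered),
        "total_bases": total,
        "mean_length": total / len(ordered),
        "longest": ordered[0],
        "n50": n50,
    }


# Toy read lengths (hypothetical), not taken from the validation data set.
stats = read_length_stats([2000, 3000, 4000, 5000, 6000])
print(stats)
```

N50 is often preferred over the mean for long-read data because it is less sensitive to a large number of very short reads.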
Affiliation(s)
- Astrid P Heikema
  - Department of Medical Microbiology and Infectious Diseases, Erasmus University Medical Center (Erasmus MC), Rotterdam, The Netherlands
- Rick Jansen
  - Department of Pathology, Clinical Bioinformatics Unit, Erasmus University Medical Center (Erasmus MC), Rotterdam, The Netherlands
- Saskia D Hiltemann
  - Department of Pathology, Clinical Bioinformatics Unit, Erasmus University Medical Center (Erasmus MC), Rotterdam, The Netherlands
- John P Hays
  - Department of Medical Microbiology and Infectious Diseases, Erasmus University Medical Center (Erasmus MC), Rotterdam, The Netherlands
- Andrew P Stubbs
  - Department of Pathology, Clinical Bioinformatics Unit, Erasmus University Medical Center (Erasmus MC), Rotterdam, The Netherlands
3
Hiltemann SD, Boers SA, van der Spek PJ, Jansen R, Hays JP, Stubbs AP. Galaxy mothur Toolset (GmT): a user-friendly application for 16S rRNA gene sequencing analysis using mothur. Gigascience 2019; 8:5266305. [PMID: 30597007 PMCID: PMC6377400 DOI: 10.1093/gigascience/giy166] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 03/28/2018] [Revised: 10/02/2018] [Accepted: 12/18/2018] [Indexed: 12/22/2022] Open
Abstract
Background: The determination of microbial communities using the mothur tool suite (https://www.mothur.org) is well established. However, mothur requires bioinformatics proficiency to perform calculations via the command line. Galaxy is a project dedicated to providing a user-friendly web interface for such command-line tools (https://galaxyproject.org/).
Results: We have integrated the full set of 125+ mothur tools into Galaxy as the Galaxy mothur Toolset (GmT) and provided a set of workflows to perform end-to-end 16S rRNA gene analyses and integrate with third-party visualization and reporting tools. We demonstrate the utility of GmT by analyzing the mothur MiSeq standard operating procedure (SOP) dataset (https://www.mothur.org/wiki/MiSeq_SOP).
Conclusions: GmT is available from the Galaxy Tool Shed, and a workflow definition file and full Galaxy training manual for the mothur SOP have been created. A Docker image with a fully configured GmT Galaxy is also available.
Affiliation(s)
- Saskia D Hiltemann
  - Erasmus University Medical Center Rotterdam, Department of Pathology, Bioinformatics group, Wytemaweg 80, 3015 CN, Rotterdam, The Netherlands
- Stefan A Boers
  - Erasmus University Medical Center Rotterdam, Department of Medical Microbiology and Infectious Diseases, Dr. Molewaterplein 40, 3015 GD, Rotterdam, The Netherlands
- Peter J van der Spek
  - Erasmus University Medical Center Rotterdam, Department of Pathology, Bioinformatics group, Wytemaweg 80, 3015 CN, Rotterdam, The Netherlands
- Ruud Jansen
  - Regional Laboratory of Public Health Kennemerland, Department of Molecular Biology, Boerhaavelaan 26, 2035 RC, Haarlem, The Netherlands
- John P Hays
  - Erasmus University Medical Center Rotterdam, Department of Medical Microbiology and Infectious Diseases, Dr. Molewaterplein 40, 3015 GD, Rotterdam, The Netherlands
- Andrew P Stubbs
  - Erasmus University Medical Center Rotterdam, Department of Pathology, Bioinformatics group, Wytemaweg 80, 3015 CN, Rotterdam, The Netherlands
4
Boers SA, Hiltemann SD, Stubbs AP, Jansen R, Hays JP. Development and evaluation of a culture-free microbiota profiling platform (MYcrobiota) for clinical diagnostics. Eur J Clin Microbiol Infect Dis 2018; 37:1081-1089. [PMID: 29549470 PMCID: PMC5948305 DOI: 10.1007/s10096-018-3220-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 02/15/2018] [Accepted: 02/20/2018] [Indexed: 12/19/2022]
Abstract
Microbiota profiling has the potential to greatly impact routine clinical diagnostics by detecting DNA derived from live, fastidious, and dead bacterial cells present within clinical samples. Such results could potentially be used to benefit patients by influencing antibiotic prescribing practices or to generate new classical-based diagnostic methods, e.g., culture or PCR. However, technical flaws in 16S rRNA gene next-generation sequencing (NGS) protocols, together with the requirement for access to bioinformatics, currently hinder the introduction of microbiota analysis into clinical diagnostics. Here, we report on the development and evaluation of an “end-to-end” microbiota profiling platform (MYcrobiota), which combines our previously validated micelle PCR/NGS (micPCR/NGS) methodology with an easy-to-use, dedicated bioinformatics pipeline. The newly designed bioinformatics pipeline processes micPCR/NGS data automatically and summarizes the results in interactive but simple web reports. In order to explore the utility of MYcrobiota in clinical diagnostics, 47 clinical samples (40 “damaged skin” samples and 7 synovial fluids) were investigated using routine bacterial culture as comparator. MYcrobiota confirmed the presence of bacterial DNA in 37/37 culture-positive samples and detected bacterial taxa in 2/10 culture-negative samples. Moreover, 36/38 potentially relevant aerobic bacterial taxa and 3/3 mixtures of anaerobic bacteria were identified using both culture and MYcrobiota, with the sensitivity and specificity being 95%. Interestingly, the majority of the 448 bacterial taxa identified using MYcrobiota were not identified using culture, which could potentially have an impact on clinical decision-making. Taken together, the development of MYcrobiota is a promising step towards the introduction of microbiota analysis into clinical diagnostic laboratories.
Affiliation(s)
- Stefan A Boers
  - Department of Medical Microbiology and Infectious Diseases, Erasmus University Medical Centre Rotterdam (Erasmus MC), Wytemaweg 80, 3015 CN, Rotterdam, the Netherlands
- Saskia D Hiltemann
  - Department of Bioinformatics, Erasmus University Medical Centre Rotterdam (Erasmus MC), Rotterdam, the Netherlands
- Andrew P Stubbs
  - Department of Bioinformatics, Erasmus University Medical Centre Rotterdam (Erasmus MC), Rotterdam, the Netherlands
- Ruud Jansen
  - Department of Molecular Biology, Regional Laboratory of Public Health Kennemerland, Haarlem, the Netherlands
- John P Hays
  - Department of Medical Microbiology and Infectious Diseases, Erasmus University Medical Centre Rotterdam (Erasmus MC), Wytemaweg 80, 3015 CN, Rotterdam, the Netherlands
5
Stubbs A, McClellan EA, Horsman S, Hiltemann SD, Palli I, Nouwens S, Koning AH, Hoogland F, Reumers J, Heijsman D, Swagemakers S, Kremer A, Meijerink J, Lambrechts D, van der Spek PJ. Huvariome: a web server resource of whole genome next-generation sequencing allelic frequencies to aid in pathological candidate gene selection. J Clin Bioinforma 2012; 2:19. [PMID: 23164068 PMCID: PMC3549785 DOI: 10.1186/2043-9113-2-19] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 05/29/2012] [Accepted: 10/16/2012] [Indexed: 01/01/2023] Open
Abstract
Background: Next-generation sequencing provides clinical research scientists with a direct readout of innumerable variants, including personal, pathological and common benign variants. The aim of resequencing studies is to determine the candidate pathogenic variants from individual genomes, or from family-based or tumor/normal genome comparisons. Whilst the use of appropriate controls within the experimental design will minimize the number of false-positive variations selected, this number can be reduced further by using high-quality whole-genome reference data to filter out false-positive variants prior to candidate gene selection. In addition, the use of platform-related sequencing error models can help in the recovery of ambiguous genotypes from lower-coverage data.
Description: We have developed a whole genome database of human genetic variations, Huvariome, determined by whole genome deep sequencing data with high coverage and low error rates. The database was designed to be sequencing technology independent but is currently populated with 165 individual whole genomes consisting of small pedigrees and matched tumor/normal samples sequenced with the Complete Genomics sequencing platform. Common variants have been determined for a Benelux population cohort and represented as genotypes alongside the results of two sets of control data (73 of the 165 genomes), Huvariome Core which comprises 31 healthy individuals from the Benelux region, and Diversity Panel consisting of 46 healthy individuals representing 10 different populations and 21 samples in three pedigrees. Users can query the database by gene or position via a web interface and the results are displayed as the frequency of the variations as detected in the datasets. We demonstrate that Huvariome can provide accurate reference allele frequencies to disambiguate sequencing inconsistencies produced in resequencing experiments. Huvariome has been used to support the selection of candidate cardiomyopathy-related genes which have a homozygous genotype in the reference cohorts. This database allows users to see which selected variants are common variants (> 5% minor allele frequency) in the Huvariome Core samples, thus aiding in the selection of potentially pathogenic variants by filtering out common variants that are not listed in one of the other public genomic variation databases. The no-call rate and the accuracy of allele calling in Huvariome provide the user with the possibility of identifying platform-dependent errors associated with specific regions of the human genome.
Conclusion: Huvariome is a simple-to-use resource for validation of resequencing results obtained by NGS experiments. The high sequence coverage and low error rates provide scientists with the ability to remove false-positive results from pedigree studies. Results are returned via a web interface that displays location-based genetic variation frequency, impact on protein function, association with known genetic variations and a quality score of the variation base derived from Huvariome Core and the Diversity Panel data. These results may be used to identify and prioritize rare variants that, for example, might be disease relevant. In testing the accuracy of the Huvariome database, alleles of a selection of ambiguously called coding single nucleotide variants were successfully predicted in all cases. Data protection of individuals is ensured by restricted access to patient-derived genomes from the host institution, which is relevant for future molecular diagnostics.
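The abstract describes flagging variants as common when their minor allele frequency exceeds 5% in a reference cohort, so that they can be filtered out before candidate selection. A minimal Python sketch of that kind of filter is given here. The 5% threshold comes from the abstract; the function names, the allele-count dictionary layout, the variant identifiers and the counts are hypothetical and do not reflect Huvariome's actual schema or interface. The toy totals of 62 alleles assume 31 diploid individuals at an autosomal site.

```python
def minor_allele_frequency(allele_counts):
    """Frequency of the second most common allele among all observed alleles."""
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no alleles observed at this site")
    if len(allele_counts) < 2:
        return 0.0  # only one allele observed in the cohort
    ordered = sorted(allele_counts.values(), reverse=True)
    return ordered[1] / total


def drop_common_variants(variants, threshold=0.05):
    """Keep the names of variants whose cohort MAF does not exceed the threshold."""
    return [
        name
        for name, counts in variants.items()
        if minor_allele_frequency(counts) <= threshold
    ]


# Hypothetical allele counts in a reference cohort of 31 individuals (62 alleles).
cohort = {
    "variant_a": {"A": 60, "G": 2},   # MAF = 2/62, about 3.2%: kept as rare
    "variant_b": {"C": 40, "T": 22},  # MAF = 22/62, about 35.5%: dropped as common
    "variant_c": {"G": 62},           # one allele only, MAF = 0: kept
}

rare = drop_common_variants(cohort)
print(rare)
```

With a cohort this small, a single heterozygous carrier already gives a frequency of 1/62 (about 1.6%), so frequencies near the threshold should be interpreted with care.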
Affiliation(s)
- Andrew Stubbs
  - Department of Bioinformatics, Erasmus University Medical Center, Molewaterplein 50, Rotterdam, The Netherlands