Kalantar KL, Carvalho T, de Bourcy CFA, Dimitrov B, Dingle G, Egger R, Han J, Holmes OB, Juan YF, King R, Kislyuk A, Lin MF, Mariano M, Morse T, Reynoso LV, Cruz DR, Sheu J, Tang J, Wang J, Zhang MA, Zhong E, Ahyong V, Lay S, Chea S, Bohl JA, Manning JE, Tato CM, DeRisi JL. IDseq-An open source cloud-based pipeline and analysis service for metagenomic pathogen detection and monitoring. Gigascience 2020; 9:giaa111. [PMID: 33057676; PMCID: PMC7566497; DOI: 10.1093/gigascience/giaa111] [Received: 04/07/2020] [Revised: 08/28/2020] [Accepted: 09/22/2020] [Indexed: 12/15/2022]
Abstract
BACKGROUND: Metagenomic next-generation sequencing (mNGS) has enabled the rapid, unbiased detection and identification of microbes without pathogen-specific reagents, culturing, or a priori knowledge of the microbial landscape. mNGS data analysis requires a series of computationally intensive processing steps to accurately determine the microbial composition of a sample. Existing mNGS data analysis tools typically require bioinformatics expertise and access to local server-class hardware resources. For many research laboratories, this presents an obstacle, especially in resource-limited environments.

FINDINGS: We present IDseq, an open source cloud-based metagenomics pipeline and service for global pathogen detection and monitoring (https://idseq.net). The IDseq Portal accepts raw mNGS data, performs host and quality filtration steps, then executes an assembly-based alignment pipeline, which results in the assignment of reads and contigs to taxonomic categories. The taxonomic relative abundances are reported and visualized in an easy-to-use web application to facilitate data interpretation and hypothesis generation. Furthermore, IDseq supports environmental background model generation and automatic internal spike-in control recognition, providing statistics that are critical for data interpretation. IDseq was designed with the specific intent of detecting novel pathogens. Here, we benchmark novel virus detection capability using both synthetically evolved viral sequences and real-world samples, including IDseq analysis of a nasopharyngeal swab sample acquired and processed locally in Cambodia from a tourist from Wuhan, China, infected with the recently emergent SARS-CoV-2.

CONCLUSION: The IDseq Portal reduces the barrier to entry for mNGS data analysis and enables bench scientists, clinicians, and bioinformaticians to gain insight from mNGS datasets for both known and novel pathogens.
Affiliation(s)
- Katrina L Kalantar
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Tiago Carvalho
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Boris Dimitrov
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Greg Dingle
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Rebecca Egger
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Julie Han
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Olivia B Holmes
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Yun-Fang Juan
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Ryan King
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Andrey Kislyuk
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Michael F Lin
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Maria Mariano
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Todd Morse
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Lucia V Reynoso
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- David Rissato Cruz
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Jonathan Sheu
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Jennifer Tang
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- James Wang
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Mark A Zhang
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Emily Zhong
  - Chan Zuckerberg Initiative, Science, PO Box 8040 Redwood City, CA 94063, USA
- Vida Ahyong
  - Chan Zuckerberg Biohub, 499 Illinois St, San Francisco, CA 94158, USA
- Sreyngim Lay
  - Malaria and Vector Research Laboratory, National Institute of Allergy and Infectious Diseases, Phnom Penh, Cambodia
- Sophana Chea
  - Malaria and Vector Research Laboratory, National Institute of Allergy and Infectious Diseases, Phnom Penh, Cambodia
- Jennifer A Bohl
  - Malaria and Vector Research Laboratory, National Institute of Allergy and Infectious Diseases, Phnom Penh, Cambodia
- Jessica E Manning
  - Malaria and Vector Research Laboratory, National Institute of Allergy and Infectious Diseases, Phnom Penh, Cambodia
- Cristina M Tato
  - Chan Zuckerberg Biohub, 499 Illinois St, San Francisco, CA 94158, USA
- Joseph L DeRisi
  - Chan Zuckerberg Biohub, 499 Illinois St, San Francisco, CA 94158, USA