1
Wilks C, Gaddipati P, Nellore A, Langmead B. Snaptron: querying splicing patterns across tens of thousands of RNA-seq samples. Bioinformatics 2018; 34:114-116. [PMID: 28968689 PMCID: PMC5870547 DOI: 10.1093/bioinformatics/btx547] [Received: 02/16/2017] [Accepted: 08/31/2017] [Indexed: 11/15/2022]
Abstract
Motivation: As more and larger genomics studies appear, there is a growing need for comprehensive and queryable cross-study summaries. These enable researchers to leverage vast datasets that would otherwise be difficult to obtain.
Results: Snaptron is a search engine for summarized RNA sequencing data with a query planner that leverages R-tree, B-tree and inverted indexing strategies to rapidly execute queries over 146 million exon-exon splice junctions from over 70 000 human RNA-seq samples. Queries can be tailored by constraining which junctions and samples to consider. Snaptron can score junctions according to tissue specificity or other criteria, and can score samples according to the relative frequency of different splicing patterns. We describe the software and outline biological questions that can be explored with Snaptron queries.
Availability and implementation: Documentation is at http://snaptron.cs.jhu.edu. Source code is at https://github.com/ChristopherWilks/snaptron and https://github.com/ChristopherWilks/snaptron-experiments with a CC BY-NC 4.0 license.
Supplementary information: Supplementary data are available at Bioinformatics online.
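The idea of tailoring a query by constraining both junctions and samples can be illustrated with a toy sketch. This is not Snaptron's implementation or API: the `Junction` class, the `query_junctions` function and the `min_samples` parameter are hypothetical names introduced here, and a sorted list with binary search stands in for the R-tree, B-tree and inverted indexes named in the abstract.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class Junction:
    """Hypothetical record for one exon-exon junction (illustration only)."""
    chrom: str
    start: int
    end: int
    sample_counts: dict  # sample id -> number of reads spanning the junction


def query_junctions(junctions, chrom, start, end, min_samples=1):
    """Return junctions contained in [start, end] on `chrom` that are
    observed in at least `min_samples` samples.

    A real system would keep a persistent index; here we sort on every
    call to keep the sketch short.
    """
    on_chrom = sorted(
        (j for j in junctions if j.chrom == chrom), key=lambda j: j.start
    )
    starts = [j.start for j in on_chrom]
    first = bisect_left(starts, start)  # first junction starting at or after `start`
    hits = []
    for j in on_chrom[first:]:
        if j.start > end:
            break  # sorted by start, so no later junction can match
        if j.end <= end and len(j.sample_counts) >= min_samples:
            hits.append(j)
    return hits


example = [
    Junction("chr1", 100, 200, {"s1": 5, "s2": 3}),
    Junction("chr1", 150, 400, {"s1": 1}),
    Junction("chr1", 300, 350, {"s3": 2}),
    Junction("chr2", 120, 180, {"s1": 9}),
]
region_hits = query_junctions(example, "chr1", 90, 360)
recurrent_hits = query_junctions(example, "chr1", 90, 360, min_samples=2)
```

Here `region_hits` holds the two chr1 junctions that lie inside the queried region, and `recurrent_hits` keeps only the one seen in at least two samples. The coordinates and sample ids are made up for the example.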
Affiliation(s)
- Christopher Wilks
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
- Phani Gaddipati
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Abhinav Nellore
  - Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR 97239, USA
  - Department of Surgery, Oregon Health & Science University, Portland, OR 97239, USA
  - Computational Biology Program, Oregon Health & Science University, Portland, OR 97239, USA
- Ben Langmead
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
2
Abstract
Next-generation sequencing has made major strides in the past decade. Studies based on large sequencing data sets are growing in number, and public archives for raw sequencing data have been doubling in size every 18 months. Leveraging these data requires researchers to use large-scale computational resources. Cloud computing, a model whereby users rent computers and storage from large data centres, is a solution that is gaining traction in genomics research. Here, we describe how cloud computing is used in genomics for research and large-scale collaborations, and argue that its elasticity, reproducibility and privacy features make it ideally suited for the large-scale reanalysis of publicly available archived data, including privacy-protected data.
Affiliation(s)
- Ben Langmead
  - Department of Computer Science, Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Abhinav Nellore
  - Department of Biomedical Engineering, Department of Surgery, Computational Biology Program, Oregon Health and Science University, Portland, OR, USA
3
Collado-Torres L, Nellore A, Kammers K, Ellis SE, Taub MA, Hansen KD, Jaffe AE, Langmead B, Leek JT. Reproducible RNA-seq analysis using recount2. Nat Biotechnol 2017; 35:319-321. [PMID: 28398307 PMCID: PMC6742427 DOI: 10.1038/nbt.3838] [Indexed: 12/17/2022]
Affiliation(s)
- Leonardo Collado-Torres
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, Maryland, USA
  - Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, Maryland, USA
- Abhinav Nellore
  - Department of Biomedical Engineering, Oregon Health & Science University, Portland, Oregon, USA
  - Department of Surgery, Oregon Health & Science University, Portland, Oregon, USA
  - Computational Biology Program, Oregon Health & Science University, Portland, Oregon, USA
- Kai Kammers
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, Maryland, USA
  - Division of Biostatistics and Bioinformatics, Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Shannon E Ellis
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, Maryland, USA
- Margaret A Taub
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, Maryland, USA
- Kasper D Hansen
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, Maryland, USA
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland, USA
- Andrew E Jaffe
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, Maryland, USA
  - Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, Maryland, USA
  - Department of Mental Health, Johns Hopkins University, Baltimore, Maryland, USA
- Ben Langmead
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, Maryland, USA
  - Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, USA
- Jeffrey T Leek
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, Maryland, USA