1
Apostolides M, Jiang Y, Husić M, Siddaway R, Hawkins C, Turinsky AL, Brudno M, Ramani AK. MetaFusion: A high-confidence metacaller for filtering and prioritizing RNA-seq gene fusion candidates. Bioinformatics 2021; 37:3144-3151. [PMID: 33944895 DOI: 10.1093/bioinformatics/btab249] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/17/2020] [Revised: 03/04/2021] [Accepted: 05/03/2021] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Current fusion detection tools use diverse calling approaches and provide varying results, making selection of the appropriate tool challenging. Ensemble fusion calling techniques appear promising; however, current options have limited accessibility and function. RESULTS: MetaFusion is a flexible meta-calling tool that amalgamates outputs from any number of fusion callers. Individual caller results are standardized by conversion into the new file type Common Fusion Format (CFF). Calls are annotated, merged using graph clustering, filtered, and ranked to provide a final output of high confidence candidates. MetaFusion consistently achieves higher precision and recall than individual callers on real and simulated datasets, and reaches up to 100% precision, indicating that ensemble calling is imperative for high confidence results. MetaFusion uses FusionAnnotator to annotate calls with information from cancer fusion databases, and is provided with a benchmarking toolkit to calibrate new callers. AVAILABILITY: MetaFusion is freely available at https://github.com/ccmbioinfo/MetaFusion. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
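The merge, filter, and rank steps named in this abstract can be illustrated with a minimal sketch. It is not MetaFusion's implementation: the record fields (`caller`, `sample`, `head`, `tail`), the linking rule (two calls in the same sample that share a gene on each side), and the `min_callers` threshold are all assumptions made for illustration, and the real CFF columns are not reproduced here.

```python
from collections import defaultdict

def merge_calls(calls, min_callers=2):
    """Cluster calls into connected components, then filter and rank them.

    Each call is a dict with hypothetical keys: 'caller', 'sample',
    'head' and 'tail' (sets of gene names at each breakpoint).
    """
    parent = list(range(len(calls)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Add an edge between two calls that agree on sample and on both partners.
    for i in range(len(calls)):
        for j in range(i + 1, len(calls)):
            a, b = calls[i], calls[j]
            if (a["sample"] == b["sample"]
                    and a["head"] & b["head"]
                    and a["tail"] & b["tail"]):
                parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i, call in enumerate(calls):
        clusters[find(i)].append(call)

    merged = []
    for members in clusters.values():
        callers = sorted({m["caller"] for m in members})
        if len(callers) >= min_callers:  # filter: keep multi-caller clusters
            merged.append({
                "sample": members[0]["sample"],
                "head": sorted(set().union(*(m["head"] for m in members))),
                "tail": sorted(set().union(*(m["tail"] for m in members))),
                "callers": callers,
            })
    # Rank: most supporting callers first, then alphabetically for stability.
    merged.sort(key=lambda c: (-len(c["callers"]), c["head"], c["tail"]))
    return merged

# Toy input; caller names and the GENE_X/GENE_Y pair are placeholders.
calls = [
    {"caller": "callerA", "sample": "S1", "head": {"BCR"}, "tail": {"ABL1"}},
    {"caller": "callerB", "sample": "S1", "head": {"BCR"}, "tail": {"ABL1"}},
    {"caller": "callerC", "sample": "S1", "head": {"GENE_X"}, "tail": {"GENE_Y"}},
]
merged = merge_calls(calls)
```

With this toy input, the two concordant calls collapse into one candidate supported by two callers, and the single-caller call is filtered out.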
Affiliation(s)
- Michael Apostolides
  - Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
- Yue Jiang
  - Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
- Mia Husić
  - Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
- Robert Siddaway
  - The Arthur and Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children, Toronto, ON, Canada
- Cynthia Hawkins
  - The Arthur and Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children, Toronto, ON, Canada
  - Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada
  - Division of Pathology, The Hospital for Sick Children, Toronto, ON, Canada
- Andrei L Turinsky
  - Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
- Michael Brudno
  - Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
  - Genetics and Genome Biology Program, The Hospital for Sick Children, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - University Health Network, Toronto, ON, Canada
- Arun K Ramani
  - Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
2
Liu Z, Chen X, Roberts R, Huang R, Mikailov M, Tong W. Unraveling Gene Fusions for Drug Repositioning in High-Risk Neuroblastoma. Front Pharmacol 2021; 12:608778. [PMID: 33967751 PMCID: PMC8105087 DOI: 10.3389/fphar.2021.608778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2020] [Accepted: 03/23/2021] [Indexed: 11/13/2022] Open
Abstract
High-risk neuroblastoma (NB) remains a significant therapeutic challenge facing current pediatric oncology patients. Structural variants such as gene fusions have shown initial promise in enhancing mechanistic understanding of NB and improving survival rates. In this study, we performed a comprehensive in silico investigation of the translational ability of gene fusions for patient stratification and treatment development for high-risk NB patients. Specifically, three state-of-the-art gene fusion detection algorithms, including ChimeraScan, SOAPfuse, and TopHat-Fusion, were employed to identify the fusion transcripts in an RNA-seq data set of 498 neuroblastoma patients. Then, the 176 high-risk patients were further stratified into four different subgroups based on gene fusion profiles. Furthermore, Kaplan-Meier survival analysis was performed, and differentially expressed genes (DEGs) for the redefined high-risk group were extracted and functionally analyzed. Finally, repositioning candidates were enriched in each patient subgroup with drug transcriptomic profiles from the LINCS L1000 Connectivity Map. We found that the number of identified gene fusions increased from the clinical low-risk stage to the high-risk stage. Although the technical concordance of fusion detection algorithms was suboptimal, they had similar biological relevance concerning perturbed pathways and regulated DEGs. The gene fusion profiles could be utilized to redefine high-risk patient subgroups with significant onset age of NB, which yielded improved survival curves (Log-rank p value ≤ 0.05). Out of 48 enriched repositioning candidates, 45 (93.8%) have antitumor potency, and 24 (50%) were confirmed with either on-going clinical trials or literature reports. The gene fusion profiles have discriminatory power for redefining patient subgroups in high-risk NB and facilitate precision medicine-based drug repositioning implementation.
Affiliation(s)
- Zhichao Liu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, United States
- Xi Chen
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, United States
- Ruth Roberts
  - ApconiX, BioHub at Alderley Park, Alderley Edge, United Kingdom
  - University of Birmingham, Edgbaston, Birmingham, United Kingdom
- Ruili Huang
  - National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, United States
- Mike Mikailov
  - Office of Science and Engineering Labs, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD, United States
- Weida Tong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, United States
3
Schmidt BM, Davidson NM, Hawkins ADK, Bartolo R, Majewski IJ, Ekert PG, Oshlack A. Clinker: visualizing fusion genes detected in RNA-seq data. Gigascience 2018; 7:5049009. [PMID: 29982439 PMCID: PMC6065480 DOI: 10.1093/gigascience/giy079] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 01/13/2018] [Accepted: 06/21/2018] [Indexed: 12/02/2022] Open
Abstract
Background: Genomic profiling efforts have revealed a rich diversity of oncogenic fusion genes. While there are many methods for identifying fusion genes from RNA-sequencing (RNA-seq) data, visualizing these transcripts and their supporting reads remains challenging. Findings: Clinker is a bioinformatics tool written in Python, R, and Bpipe that leverages the superTranscript method to visualize fusion genes. We demonstrate the use of Clinker to obtain interpretable visualizations of the RNA-seq data that lead to fusion calls. In addition, we use Clinker to explore multiple fusion transcripts with novel breakpoints within the P2RY8-CRLF2 fusion gene in B-cell acute lymphoblastic leukemia. Conclusions: Clinker is freely available software that allows visualization of fusion genes and the RNA-seq data used in their discovery.
Affiliation(s)
- Breon M Schmidt
  - Murdoch Children's Research Institute, The Royal Children's Hospital, Flemington Road, Parkville Vic 3052, Australia
- Nadia M Davidson
  - Murdoch Children's Research Institute, The Royal Children's Hospital, Flemington Road, Parkville Vic 3052, Australia
  - School of Biosciences, University of Melbourne, Parkville Vic 3010, Australia
- Anthony D K Hawkins
  - Murdoch Children's Research Institute, The Royal Children's Hospital, Flemington Road, Parkville Vic 3052, Australia
- Ray Bartolo
  - Murdoch Children's Research Institute, The Royal Children's Hospital, Flemington Road, Parkville Vic 3052, Australia
- Ian J Majewski
  - Division of Cancer and Haematology, The Walter and Eliza Hall Institute of Medical Research, Parkville Vic 3052, Australia
  - Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville Vic 3010, Australia
- Paul G Ekert
  - Murdoch Children's Research Institute, The Royal Children's Hospital, Flemington Road, Parkville Vic 3052, Australia
- Alicia Oshlack
  - Murdoch Children's Research Institute, The Royal Children's Hospital, Flemington Road, Parkville Vic 3052, Australia
  - School of Biosciences, University of Melbourne, Parkville Vic 3010, Australia
4
Co-fuse: a new class discovery analysis tool to identify and prioritize recurrent fusion genes from RNA-sequencing data. Mol Genet Genomics 2018; 293:1217-1229. [PMID: 29882166 DOI: 10.1007/s00438-018-1454-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2017] [Accepted: 05/31/2018] [Indexed: 10/14/2022]
Abstract
Recurrent oncogenic fusion genes play a critical role in the development of various cancers and diseases and provide, in some cases, excellent therapeutic targets. To date, analysis tools that can identify and compare recurrent fusion genes across multiple samples have not been available to researchers. To address this deficiency, we developed Co-occurrence Fusion (Co-fuse), a new and easy-to-use software tool that enables biologists to merge RNA-seq information, allowing them to identify recurrent fusion genes without the need for exhaustive data processing. Notably, Co-fuse is based on pattern mining and statistical analysis, which enable the identification of hidden patterns of recurrent fusion genes. In this report, we show that Co-fuse can be used to identify 2 distinct groups within a set of 49 leukemic cell lines based on their recurrent fusion genes: a multiple myeloma (MM) samples-enriched cluster and an acute myeloid leukemia (AML) samples-enriched cluster. Our experimental results further demonstrate that Co-fuse can identify known driver fusion genes (e.g., IGH-MYC, IGH-WHSC1) in MM, when compared to AML samples, indicating the potential of Co-fuse to aid the discovery of yet unknown driver fusion genes through cohort comparisons. Additionally, using an RNA-seq dataset of 272 primary glioma samples, Co-fuse was able to validate recurrent fusion genes, further demonstrating the power of this analysis tool to identify recurrent fusion genes. Taken together, Co-fuse is a powerful new analysis tool that can be readily applied to large RNA-seq datasets, and may lead to the discovery of new disease subgroups and potentially new driver genes, for which targeted therapies could be developed. The Co-fuse R source code is publicly available at https://github.com/sakrapee/co-fuse.
5
Identification of Gene Mutations and Fusion Genes in Patients with Sézary Syndrome. J Invest Dermatol 2016; 136:1490-1499. [DOI: 10.1016/j.jid.2016.03.024] [Citation(s) in RCA: 61] [Impact Index Per Article: 7.6] [Received: 12/21/2015] [Revised: 03/07/2016] [Accepted: 03/11/2016] [Indexed: 12/12/2022]
6
Latysheva NS, Babu MM. Discovering and understanding oncogenic gene fusions through data intensive computational approaches. Nucleic Acids Res 2016; 44:4487-503. [PMID: 27105842 PMCID: PMC4889949 DOI: 10.1093/nar/gkw282] [Citation(s) in RCA: 110] [Impact Index Per Article: 13.8] [Received: 01/12/2016] [Accepted: 03/24/2016] [Indexed: 12/21/2022] Open
Abstract
Although gene fusions have been recognized as important drivers of cancer for decades, our understanding of the prevalence and function of gene fusions has been revolutionized by the rise of next-generation sequencing, advances in bioinformatics theory and an increasing capacity for large-scale computational biology. The computational work on gene fusions has been vastly diverse, and the present state of the literature is fragmented. It will be fruitful to merge three camps of gene fusion bioinformatics that appear to rarely cross over: (i) data-intensive computational work characterizing the molecular biology of gene fusions; (ii) development research on fusion detection tools, candidate fusion prioritization algorithms and dedicated fusion databases and (iii) clinical research that seeks to either therapeutically target fusion transcripts and proteins or leverages advances in detection tools to perform large-scale surveys of gene fusion landscapes in specific cancer types. In this review, we unify these different, yet highly complementary and symbiotic, approaches with the view that increased synergy will catalyze advancements in gene fusion identification, characterization and significance evaluation.
Affiliation(s)
- Natasha S Latysheva
  - MRC Laboratory of Molecular Biology, Francis Crick Ave, Cambridge CB2 0QH, United Kingdom
- M Madan Babu
  - MRC Laboratory of Molecular Biology, Francis Crick Ave, Cambridge CB2 0QH, United Kingdom
7
Pagan M, Kloos RT, Lin CF, Travers KJ, Matsuzaki H, Tom EY, Kim SY, Wong MG, Stewart AC, Huang J, Walsh PS, Monroe RJ, Kennedy GC. The diagnostic application of RNA sequencing in patients with thyroid cancer: an analysis of 851 variants and 133 fusions in 524 genes. BMC Bioinformatics 2016; 17 Suppl 1:6. [PMID: 26818556 PMCID: PMC4895782 DOI: 10.1186/s12859-015-0849-9] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Indexed: 01/21/2023] Open
Abstract
Background: Thyroid carcinomas are known to harbor oncogenic driver mutations, and advances in sequencing technology now allow the detection of these in fine needle aspiration biopsies (FNA). Recent work by The Cancer Genome Atlas (TCGA) Research Network has expanded the number of genetic alterations detected in papillary thyroid carcinomas (PTC). We sought to investigate the prevalence of these and other genetic alterations in diverse subtypes of thyroid nodules beyond PTC, including a variety of samples with benign histopathology. This is the first clinical evaluation of a large panel of TCGA-reported genomic alterations in thyroid FNAs. Results: In FNAs, genetic alterations were detected in 19/44 malignant samples (43% sensitivity) and in 7/44 histopathology benign samples (84% specificity). Overall, after adding a cohort of tissue samples, 38/76 (50%) of histopathology malignant samples were found to harbor a genetic alteration, while 15/75 (20%) of benign samples were also mutated. The most frequently mutated malignant subtypes were medullary thyroid carcinoma (9/12, 75%) and PTC (14/30, 47%). Additionally, follicular adenoma, a benign subtype of thyroid neoplasm, was also found to harbor mutations (12/29, 41%). Frequently mutated genes in malignant samples included BRAF (20/76, 26%) and RAS (9/76, 12%). Of the TSHR variants detected, 6/7 (86%) were in benign nodules. In a direct comparison of the same FNA also tested by an RNA-based gene expression classifier (GEC), the sensitivity of genetic alterations alone was 42%, compared to the 91% sensitivity achieved by the GEC. The specificity based only on genetic alterations was 84%, compared to 77% specificity with the GEC. Conclusions: While the genomic landscape of all thyroid neoplasm subtypes will inevitably be elucidated, caution should be used in the early adoption of published mutations as the sole predictor of malignancy in thyroid. The largest set of such mutations known to date detects only a portion of thyroid carcinomas in preoperative FNAs in our cohort and thus is not sufficient to rule out cancer. Due to the finding that variants are also found in benign nodules, testing only GEC suspicious nodules may be helpful in avoiding false positives and altering the extent of treatment when selected mutations are found. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0849-9) contains supplementary material, which is available to authorized users.
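The FNA sensitivity and specificity quoted in this abstract follow from the reported counts. The short check below is an illustration only: the function names are made up here, and it assumes the standard definitions, with the 7 alteration-positive benign samples counted as false positives.

```python
def sensitivity(true_positives, n_malignant):
    # Fraction of malignant samples in which an alteration was detected.
    return true_positives / n_malignant

def specificity(false_positives, n_benign):
    # Fraction of benign samples with no alteration detected.
    return (n_benign - false_positives) / n_benign

fna_sensitivity = sensitivity(19, 44)  # 19 of 44 malignant FNAs positive
fna_specificity = specificity(7, 44)   # 7 of 44 benign FNAs positive

print(f"FNA sensitivity: {fna_sensitivity:.0%}")  # prints "FNA sensitivity: 43%"
print(f"FNA specificity: {fna_specificity:.0%}")  # prints "FNA specificity: 84%"
```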
Affiliation(s)
- Ed Y Tom
  - Veracyte, Inc., South San Francisco, CA, USA
- Su Yeon Kim
  - Veracyte, Inc., South San Francisco, CA, USA
- Mei G Wong
  - Veracyte, Inc., South San Francisco, CA, USA
- Jing Huang
  - Veracyte, Inc., South San Francisco, CA, USA
8
Arsenijevic V, Davis-Dusenbery BN. Reproducible, Scalable Fusion Gene Detection from RNA-Seq. Methods Mol Biol 2016; 1381:223-37. [PMID: 26667464 DOI: 10.1007/978-1-4939-3204-7_13] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/07/2023]
Abstract
Chromosomal rearrangements resulting in the creation of novel gene products, termed fusion genes, have been identified as driving events in the development of multiple types of cancer. As these gene products typically do not exist in normal cells, they represent valuable prognostic and therapeutic targets. Advances in next-generation sequencing and computational approaches have greatly improved our ability to detect and identify fusion genes. Nevertheless, these approaches require significant computational resources. Here we describe an approach which leverages cloud computing technologies to perform fusion gene detection from RNA sequencing data at any scale. We additionally highlight methods to enhance reproducibility of bioinformatics analyses which may be applied to any next-generation sequencing experiment.
Affiliation(s)
- Vladan Arsenijevic
  - Department of Bioinformatics, Seven Bridges Genomics, One Broadway, 14th Floor, Cambridge, MA, 02142, USA
- Brandi N Davis-Dusenbery
  - Department of Bioinformatics, Seven Bridges Genomics, One Broadway, 14th Floor, Cambridge, MA, 02142, USA
9
Abstract
Chimeric transcripts have been reported in many cancer cells and are seen as potential biomarkers and therapeutic targets. Modern high-throughput sequencing technologies offer a way to investigate individual chimeric transcripts and to obtain systematic information, from the associated gene expression, about the underlying genome structural variations and genomic interactions. Methods for detecting chimeric transcripts in massive amounts of short-read sequence data are discussed here. Both assembly-based and alignment-based methods are used for the investigation of chimeric transcripts.
10
Hoogstrate Y, Böttcher R, Hiltemann S, van der Spek PJ, Jenster G, Stubbs AP. FuMa: reporting overlap in RNA-seq detected fusion genes. Bioinformatics 2015; 32:1226-8. [PMID: 26656567 DOI: 10.1093/bioinformatics/btv721] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/25/2015] [Accepted: 12/04/2015] [Indexed: 11/14/2022] Open
Abstract
A new generation of tools that identify fusion genes in RNA-seq data is limited in sensitivity and/or specificity. To allow further downstream analysis and to estimate performance, predicted fusion genes from different tools have to be compared. However, the transcriptomic context complicates genomic location-based matching. FusionMatcher (FuMa) is a program that reports identical fusion genes based on gene-name annotations. FuMa automatically compares and summarizes all combinations of two or more datasets in a single run, without additional programming necessary. FuMa uses one gene annotation, avoiding mismatches caused by tool-specific gene annotations. FuMa matches 10% more fusion genes compared with exact gene matching due to overlapping genes and accepts intermediate output files that allow a stepwise analysis of corresponding tools. AVAILABILITY AND IMPLEMENTATION: The code is available at https://github.com/ErasmusMC-Bioinformatics/fuma and available for Galaxy in the tool sheds and directly accessible at https://bioinf-galaxian.erasmusmc.nl/galaxy/ CONTACT: y.hoogstrate@erasmusmc.nl or a.stubbs@erasmusmc.nl SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
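The idea of matching fusions by gene name under a single shared annotation, rather than by exact coordinates, can be sketched as follows. This is an illustration under stated assumptions and not FuMa's code: the toy annotation, the gene names, the helper names `genes_at` and `same_fusion`, and the rule "both sides share at least one overlapping gene" are all hypothetical, and FuMa's actual matching criteria may differ.

```python
# Hypothetical single annotation shared by all tools: (chrom, start, end, gene_name).
ANNOTATION = [
    ("chr1", 100, 200, "GENE_A"),
    ("chr1", 140, 260, "GENE_A_AS"),  # overlaps GENE_A on the genome
    ("chr2", 400, 600, "GENE_B"),
]

def genes_at(chrom, pos, annotation=ANNOTATION):
    """Return the names of all annotated genes overlapping a breakpoint."""
    return {name for c, start, end, name in annotation
            if c == chrom and start <= pos <= end}

def same_fusion(fusion_a, fusion_b, annotation=ANNOTATION):
    """Two fusions match if each side shares at least one gene name.

    A fusion is ((chrom, pos), (chrom, pos)); the comparison is
    order-sensitive, which is a simplifying assumption of this sketch.
    """
    (left_a, right_a), (left_b, right_b) = fusion_a, fusion_b
    left_shared = genes_at(*left_a, annotation) & genes_at(*left_b, annotation)
    right_shared = genes_at(*right_a, annotation) & genes_at(*right_b, annotation)
    return bool(left_shared) and bool(right_shared)

# Breakpoints reported by two tools differ by 100 bp on chr1 but fall in
# overlapping genes, so a gene-set comparison still pairs them.
tool1_call = (("chr1", 150), ("chr2", 500))
tool2_call = (("chr1", 250), ("chr2", 450))
unrelated = (("chr1", 150), ("chr3", 10))

print(same_fusion(tool1_call, tool2_call))  # prints True
print(same_fusion(tool1_call, unrelated))   # prints False
```

Comparing sets of overlapping genes rather than one label per breakpoint is what lets calls in overlapping genes pair up, which is the kind of case the abstract credits for the additional matches.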
Affiliation(s)
- Youri Hoogstrate
  - Department of Urology and Department of Bioinformatics, Erasmus University Medical Center, Rotterdam, 3000 CA, The Netherlands
- Saskia Hiltemann
  - Department of Urology and Department of Bioinformatics, Erasmus University Medical Center, Rotterdam, 3000 CA, The Netherlands
- Peter J van der Spek
  - Department of Bioinformatics, Erasmus University Medical Center, Rotterdam, 3000 CA, The Netherlands
- Andrew P Stubbs
  - Department of Bioinformatics, Erasmus University Medical Center, Rotterdam, 3000 CA, The Netherlands
11
Liu S, Tsai WH, Ding Y, Chen R, Fang Z, Huo Z, Kim S, Ma T, Chang TY, Priedigkeit NM, Lee AV, Luo J, Wang HW, Chung IF, Tseng GC. Comprehensive evaluation of fusion transcript detection algorithms and a meta-caller to combine top performing methods in paired-end RNA-seq data. Nucleic Acids Res 2015; 44:e47. [PMID: 26582927 PMCID: PMC4797269 DOI: 10.1093/nar/gkv1234] [Citation(s) in RCA: 112] [Impact Index Per Article: 12.4] [Received: 04/27/2015] [Accepted: 10/24/2015] [Indexed: 12/31/2022] Open
Abstract
Background: Fusion transcripts are formed by either fusion genes (DNA level) or trans-splicing events (RNA level). They have been recognized as a promising tool for diagnosing, subtyping and treating cancers. RNA-seq has become a precise and efficient standard for genome-wide screening of such aberration events. Many fusion transcript detection algorithms have been developed for paired-end RNA-seq data but their performance has not been comprehensively evaluated to guide practitioners. In this paper, we evaluated 15 popular algorithms by their precision and recall trade-off, accuracy of supporting reads and computational cost. We further combine top-performing methods for improved ensemble detection. Results: Fifteen fusion transcript detection tools were compared using three synthetic data sets under different coverage, read length, insert size and background noise, and three real data sets with selected experimental validations. No single method dominantly performed the best, but SOAPfuse generally performed well, followed by FusionCatcher and JAFFA. We further demonstrated the potential of a meta-caller algorithm by combining top performing methods to re-prioritize candidate fusion transcripts with high confidence that can be followed by experimental validation. Conclusion: Our results provide insightful recommendations when applying an individual tool or combining top performers to identify fusion transcript candidates.
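The precision and recall trade-off used in this evaluation can be made concrete with a small sketch. The gene pairs below are placeholders, not data from the paper, and the F1 score is included only as a common single-number summary; the abstract does not say that the authors used it.

```python
def precision_recall_f1(predicted, truth):
    """Score a set of predicted fusions against a set of true fusions."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical truth set (e.g. simulated fusions) and one caller's output.
truth = {("GENE_A", "GENE_B"), ("GENE_C", "GENE_D"),
         ("GENE_E", "GENE_F"), ("GENE_G", "GENE_H")}
calls = {("GENE_A", "GENE_B"), ("GENE_C", "GENE_D"), ("GENE_X", "GENE_Y")}

precision, recall, f1 = precision_recall_f1(calls, truth)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case two of three calls are correct and two of four true fusions are recovered, so a caller that reports fewer candidates can gain precision while losing recall.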
Affiliation(s)
- Silvia Liu
  - Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, 130 De Soto Street, Pittsburgh, PA 15261, USA
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Biomedical Science Tower 3, 3501 Fifth Avenue, Pittsburgh, PA 15213, USA
- Wei-Hsiang Tsai
  - Institute of Biomedical Informatics, National Yang-Ming University, No. 155, Sec. 2, Linong Street, Beitou District, Taipei 112, Taiwan
- Ying Ding
  - Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, 130 De Soto Street, Pittsburgh, PA 15261, USA
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Biomedical Science Tower 3, 3501 Fifth Avenue, Pittsburgh, PA 15213, USA
- Rui Chen
  - Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, 130 De Soto Street, Pittsburgh, PA 15261, USA
- Zhou Fang
  - Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, 130 De Soto Street, Pittsburgh, PA 15261, USA
- Zhiguang Huo
  - Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, 130 De Soto Street, Pittsburgh, PA 15261, USA
- SungHwan Kim
  - Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, 130 De Soto Street, Pittsburgh, PA 15261, USA
- Tianzhou Ma
  - Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, 130 De Soto Street, Pittsburgh, PA 15261, USA
- Ting-Yu Chang
  - Institute of Microbiology and Immunology, National Yang-Ming University, No. 155, Sec. 2, Linong Street, Beitou District, Taipei 112, Taiwan
- Nolan Michael Priedigkeit
  - Molecular Pharmacology, School of Medicine, University of Pittsburgh, 3550 Terrace Street, Pittsburgh, PA 15261, USA
- Adrian V Lee
  - Magee-Women's Research Institute, 204 Craft Avenue, Pittsburgh, PA 15213, USA
- Jianhua Luo
  - Department of Pathology, School of Medicine, University of Pittsburgh, 3550 Terrace Street, Pittsburgh, PA 15261, USA
- Hsei-Wei Wang
  - Institute of Biomedical Informatics, National Yang-Ming University, No. 155, Sec. 2, Linong Street, Beitou District, Taipei 112, Taiwan
  - Institute of Microbiology and Immunology, National Yang-Ming University, No. 155, Sec. 2, Linong Street, Beitou District, Taipei 112, Taiwan
  - Center for Systems and Synthetic Biology, National Yang-Ming University, No. 155, Sec. 2, Linong Street, Beitou District, Taipei 112, Taiwan
- I-Fang Chung
  - Institute of Biomedical Informatics, National Yang-Ming University, No. 155, Sec. 2, Linong Street, Beitou District, Taipei 112, Taiwan
  - Center for Systems and Synthetic Biology, National Yang-Ming University, No. 155, Sec. 2, Linong Street, Beitou District, Taipei 112, Taiwan
- George C Tseng
  - Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, 130 De Soto Street, Pittsburgh, PA 15261, USA
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Biomedical Science Tower 3, 3501 Fifth Avenue, Pittsburgh, PA 15213, USA
12
Griffith M, Walker JR, Spies NC, Ainscough BJ, Griffith OL. Informatics for RNA Sequencing: A Web Resource for Analysis on the Cloud. PLoS Comput Biol 2015; 11:e1004393. [PMID: 26248053 PMCID: PMC4527835 DOI: 10.1371/journal.pcbi.1004393] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Indexed: 11/19/2022] Open
Abstract
Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki.
Affiliation(s)
- Malachi Griffith
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
  - Siteman Cancer Center, Washington University School of Medicine, St. Louis, Missouri, United States of America
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Jason R. Walker
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Nicholas C. Spies
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Benjamin J. Ainscough
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
  - Siteman Cancer Center, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Obi L. Griffith
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
  - Siteman Cancer Center, Washington University School of Medicine, St. Louis, Missouri, United States of America
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
  - Department of Medicine, Washington University School of Medicine, St. Louis, Missouri, United States of America