Tekman M, Batut B, Ostrovsky A, Antoniewski C, Clements D, Ramirez F, Etherington GJ, Hotz HR, Scholtalbers J, Manning JR, Bellenger L, Doyle MA, Heydarian M, Huang N, Soranzo N, Moreno P, Mautner S, Papatheodorou I, Nekrutenko A, Taylor J, Blankenberg D, Backofen R, Grüning B. A single-cell RNA-sequencing training and analysis suite using the Galaxy framework. Gigascience 2020; 9:5931798. [PMID: 33079170 PMCID: PMC7574357 DOI: 10.1093/gigascience/giaa102] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/22/2020] [Revised: 08/30/2020] [Indexed: 11/25/2022]
Abstract
Background: The vast ecosystem of single-cell RNA-sequencing tools has until recently been plagued by an excess of diverging analysis strategies, inconsistent file formats, and compatibility issues between different software suites. The uptake of 10x Genomics datasets has begun to calm this diversity, and the bioinformatics community leans once more towards the large computing requirements and the statistically driven methods needed to process and understand these ever-growing datasets.

Results: Here we outline several Galaxy workflows and learning resources for single-cell RNA-sequencing, with the aim of providing a comprehensive analysis environment paired with a thorough user learning experience that bridges the knowledge gap between the computational methods and the underlying cell biology. The Galaxy reproducible bioinformatics framework provides tools, workflows, and trainings that not only enable users to perform 1-click 10x preprocessing but also empower them to demultiplex raw sequencing from custom tagged and full-length sequencing protocols. The downstream analysis supports a range of high-quality interoperable suites separated into common stages of analysis: inspection, filtering, normalization, confounder removal, and clustering. The teaching resources cover concepts from computer science to cell biology. Access to all resources is provided at the singlecell.usegalaxy.eu portal.

Conclusions: The reproducible and training-oriented Galaxy framework provides a sustainable high-performance computing environment for users to run flexible analyses on both 10x and alternative platforms. The tutorials from the Galaxy Training Network along with the frequent training workshops hosted by the Galaxy community provide a means for users to learn, publish, and teach single-cell RNA-sequencing analysis.
Affiliation(s)
- Mehmet Tekman
  - Department of Bioinformatics, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany
- Bérénice Batut
  - Department of Bioinformatics, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany
- Alexander Ostrovsky
  - Department of Biology, Johns Hopkins University, Mudd Hall 144, 3400 N. Charles Street, Baltimore, MD 21218, USA
- Christophe Antoniewski
  - ARTbio, Sorbonne Université, CNRS FR 3631, Inserm US 037, Paris, France
  - Institut de Biologie Paris Seine, 9 Quai Saint-Bernard Université Pierre et Marie Curie, Campus Jussieu, Bâtiments A-B-C, 75005 Paris, France
- Dave Clements
  - Department of Biology, Johns Hopkins University, Mudd Hall 144, 3400 N. Charles Street, Baltimore, MD 21218, USA
- Fidel Ramirez
  - Boehringer Ingelheim International GmbH, Binger Strasse 173, 55216 Ingelheim am Rhein, Biberach, Germany
- Hans-Rudolf Hotz
  - Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, 4058 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Maulbeerstrasse 66, 4058 Basel, Switzerland
- Jelle Scholtalbers
  - European Molecular Biology Laboratory, Genome Biology Unit, Meyerhofstraße 1, 69117 Heidelberg, Germany
- Jonathan R Manning
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Lea Bellenger
  - ARTbio, Sorbonne Université, CNRS FR 3631, Inserm US 037, Paris, France
- Maria A Doyle
  - Research Computing Facility, Peter MacCallum Cancer Centre, Melbourne, 305 Grattan Street, Victoria 3000, Australia
  - Sir Peter MacCallum Department of Oncology, The University of Melbourne, Victoria 3010, Australia
- Mohammad Heydarian
  - Department of Biology, Johns Hopkins University, Mudd Hall 144, 3400 N. Charles Street, Baltimore, MD 21218, USA
- Ni Huang
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Nicola Soranzo
  - Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
- Pablo Moreno
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Stefan Mautner
  - Department of Bioinformatics, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany
- Irene Papatheodorou
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Anton Nekrutenko
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
- James Taylor
  - Department of Biology, Johns Hopkins University, Mudd Hall 144, 3400 N. Charles Street, Baltimore, MD 21218, USA
- Daniel Blankenberg
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, 9500 Euclid Avenue, NB21 Cleveland, OH 44195, USA
- Rolf Backofen
  - Department of Bioinformatics, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany
- Björn Grüning
  - Department of Bioinformatics, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany