1. Pinter N, Glätzer D, Fahrner M, Fröhlich K, Johnson J, Grüning BA, Warscheid B, Drepper F, Schilling O, Föll MC. MaxQuant and MSstats in Galaxy Enable Reproducible Cloud-Based Analysis of Quantitative Proteomics Experiments for Everyone. J Proteome Res 2022; 21:1558-1565. [PMID: 35503992 DOI: 10.1021/acs.jproteome.2c00051] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022]
Abstract
Quantitative mass spectrometry-based proteomics has become a high-throughput technology for the identification and quantification of thousands of proteins in complex biological samples. Two frequently used tools, MaxQuant and MSstats, allow for the analysis of raw data and finding proteins with differential abundance between conditions of interest. To enable accessible and reproducible quantitative proteomics analyses in a cloud environment, we have integrated MaxQuant (including TMTpro 16/18plex), Proteomics Quality Control (PTXQC), MSstats, and MSstatsTMT into the open-source Galaxy framework. This enables the web-based analysis of label-free and isobaric labeling proteomics experiments via Galaxy's graphical user interface on public clouds. MaxQuant and MSstats in Galaxy can be applied in conjunction with thousands of existing Galaxy tools and integrated into standardized, sharable workflows. Galaxy tracks all metadata and intermediate results in analysis histories, which can be shared privately for collaborations or publicly, allowing full reproducibility and transparency of published analysis. To further increase accessibility, we provide detailed hands-on training materials. The integration of MaxQuant and MSstats into the Galaxy framework enables their usage in a reproducible way on accessible large computational infrastructures, hence realizing the foundation for high-throughput proteomics data science for everyone.
Affiliation(s)
- Niko Pinter
  - Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany
  - Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany
- Damian Glätzer
  - Biochemistry and Functional Proteomics, Institute of Biology II, Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
- Matthias Fahrner
  - Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany
  - Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany
  - Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
- Klemens Fröhlich
  - Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany
  - Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany
  - Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
  - Spemann Graduate School of Biology and Medicine (SGBM), Albert-Ludwigs-University Freiburg, 79104 Freiburg, Germany
- James Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Bettina Warscheid
  - Biochemistry and Functional Proteomics, Institute of Biology II, Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
  - Faculty of Chemistry and Pharmacy, Department of Biochemistry, Julius Maximilian University of Würzburg, 97074 Würzburg, Germany
- Friedel Drepper
  - Biochemistry and Functional Proteomics, Institute of Biology II, Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
- Oliver Schilling
  - Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany
  - Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany
  - German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), 79106 Freiburg, Germany
- Melanie Christine Föll
  - Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany
  - Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany
  - Khoury College of Computer Sciences, Northeastern University, Boston, Massachusetts 02115, United States

2. Neely BA. Cloudy with a Chance of Peptides: Accessibility, Scalability, and Reproducibility with Cloud-Hosted Environments. J Proteome Res 2021; 20:2076-2082. [PMID: 33513299 PMCID: PMC8637422 DOI: 10.1021/acs.jproteome.0c00920] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 01/22/2023]
Abstract
Cloud-hosted environments offer known benefits when computational needs outstrip affordable local workstations, enabling high-performance computation without a physical cluster. What has been less apparent, especially to novice users, is the transformative potential for cloud-hosted environments to bridge the digital divide that exists between poorly funded and well-resourced laboratories, and to empower modern research groups with remote personnel and trainees. Using cloud-based proteomic bioinformatic pipelines is not predicated on analyzing thousands of files, but instead can be used to improve accessibility during remote work, extreme weather, or working with under-resourced remote trainees. The general benefits of cloud-hosted environments also allow for scalability and encourage reproducibility. Since one possible hurdle to adoption is awareness, this paper is written with the nonexpert in mind. The benefits and possibilities of using a cloud-hosted environment are emphasized by describing how to set up an example workflow to analyze a previously published label-free data-dependent acquisition mass spectrometry data set of mammalian urine. Cost and time of analysis are compared using different computational tiers, and important practical considerations are described. Overall, cloud-hosted environments offer the potential to solve large computational problems, but more importantly can enable and accelerate research in smaller research groups with inadequate infrastructure and suboptimal local computational resources.
Affiliation(s)
- Benjamin A Neely
  - Chemical Sciences Division, National Institute of Standards and Technology, Charleston, South Carolina 29412, United States

3. Barnett CB, Senapathi T, Naidoo KJ. Comparative ligand structural analytics illustrated on variably glycosylated MUC1 antigen-antibody binding. Beilstein J Org Chem 2020; 16:2540-2550. [PMID: 33133286 PMCID: PMC7590620 DOI: 10.3762/bjoc.16.206] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/09/2020] [Accepted: 09/30/2020] [Indexed: 01/03/2023] Open
Abstract
When faced with the investigation of the preferential binding of a series of ligands against a known target, the solution is not always evident from single structure analysis. An ensemble of structures generated from computer simulations is valuable; however, visual analysis of the extensive structural data can be overwhelming. Rapid analysis of trajectory data, with tools available in the Galaxy platform, can be used to understand key features and compare differences that inform the preferential ligand structure that favors binding. We illustrate this informatics approach by investigating the in-silico binding of a peptide and glycopeptide epitope of the glycoprotein Mucin 1 (MUC1) binding with the antibody AR20.5. To study the binding, we performed molecular dynamics simulations using OpenMM and then used the Galaxy platform for data analysis. The same analysis tools were applied to each of the simulation trajectories, and this process was streamlined by using Galaxy workflows. The conformations of the antigens were analyzed using root-mean-square deviation, end-to-end distance, Ramachandran plots, and hydrogen bonding analysis. Additionally, RMSF and clustering analysis were carried out. These analyses were used to rapidly assess key features of the system, interrogate the dynamic structure of the ligand, and determine the role of glycosylation on the conformational equilibrium. The glycopeptide conformations in solution change relative to the peptide; thus a partial pre-structuring is seen prior to binding. Although the bound conformation of peptide and glycopeptide is similar, the glycopeptide fluctuates less and resides in specific conformers for more extended periods. This structural analysis, which gives a high-level view of the features in the system under observation, could be readily applied to other binding problems as part of a general strategy in drug design or mechanistic analysis.
Affiliation(s)
- Christopher B Barnett
  - Scientific Computing Research Unit and Department of Chemistry, University of Cape Town, Rondebosch, 7701, South Africa
- Tharindu Senapathi
  - Scientific Computing Research Unit and Department of Chemistry, University of Cape Town, Rondebosch, 7701, South Africa
- Kevin J Naidoo
  - Scientific Computing Research Unit and Department of Chemistry, University of Cape Town, Rondebosch, 7701, South Africa
  - Infectious Disease and Molecular Medicine, Faculty of Health Science, University of Cape Town, Rondebosch, 7701, South Africa

4. Bray SA, Lucas X, Kumar A, Grüning BA. The ChemicalToolbox: reproducible, user-friendly cheminformatics analysis on the Galaxy platform. J Cheminform 2020; 12:40. [PMID: 33431029 PMCID: PMC7268608 DOI: 10.1186/s13321-020-00442-7] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 02/04/2020] [Accepted: 05/16/2020] [Indexed: 01/14/2023] Open
Abstract
Here, we introduce the ChemicalToolbox, a publicly available web server for performing cheminformatics analysis. The ChemicalToolbox provides an intuitive, graphical interface for common tools for downloading, filtering, visualizing and simulating small molecules and proteins. The ChemicalToolbox is based on Galaxy, an open-source web-based platform which enables accessible and reproducible data analysis. There is already an active Galaxy cheminformatics community using and developing tools. Based on their work, we provide four example workflows which illustrate the capabilities of the ChemicalToolbox, covering assembly of a compound library, hole filling, protein-ligand docking, and construction of a quantitative structure-activity relationship (QSAR) model. These workflows may be modified and combined flexibly, together with the many other tools available, to fit the needs of a particular project. The ChemicalToolbox is hosted on the European Galaxy server and may be accessed via https://cheminformatics.usegalaxy.eu.
Affiliation(s)
- Simon A Bray
  - Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, Freiburg, Germany
- Xavier Lucas
  - Roche Pharma Research and Early Development, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, Basel, Switzerland
- Anup Kumar
  - Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, Freiburg, Germany
- Björn A Grüning
  - Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, Freiburg, Germany