1
Arend L, Adamowicz K, Schmidt JR, Burankova Y, Zolotareva O, Tsoy O, Pauling JK, Kalkhof S, Baumbach J, List M, Laske T. Systematic evaluation of normalization approaches in tandem mass tag and label-free protein quantification data using PRONE. Brief Bioinform 2025; 26:bbaf201. [PMID: 40336172 PMCID: PMC12058466 DOI: 10.1093/bib/bbaf201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2025] [Revised: 03/28/2025] [Accepted: 04/09/2025] [Indexed: 05/09/2025]
Abstract
Despite the significant progress in accuracy and reliability in mass spectrometry technology, as well as the development of strategies based on isotopic labeling or internal standards in recent decades, systematic biases originating from non-biological factors remain a significant challenge in data analysis. In addition, the wide range of available normalization methods renders the choice of a suitable normalization method challenging. We systematically evaluated 17 normalization and 2 batch effect correction methods, originally developed for preprocessing DNA microarray data but widely applied in proteomics, on 6 publicly available spike-in and 3 label-free and tandem mass tag datasets. Opposed to state-of-the-art normalization practice, we found that a reduction in intragroup variation is not directly related to the effectiveness of the normalization methods. Furthermore, our results demonstrated that the methods RobNorm and Normics, specifically developed for proteomics data, in line with LoessF performed consistently well across the spike-in datasets, while EigenMS exhibited a high false-positive rate. Finally, based on experimental data, we show that normalization substantially impacts downstream analyses, and the impact is highly dataset-specific, emphasizing the importance of use-case-specific evaluations for novel proteomics datasets. For this, we developed the PROteomics Normalization Evaluator (PRONE), a unifying R package enabling comparative evaluation of normalization methods, including their impact on downstream analyses, while offering considerable flexibility, acknowledging the lack of universally accepted standards. PRONE is available on Bioconductor with a web application accessible at https://exbio.wzw.tum.de/prone/.
Affiliation(s)
- Lis Arend
  - Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Maximus-von-Imhof Forum 3, 85354 Freising, Germany
  - Institute for Computational Systems Biology, University of Hamburg, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
- Klaudia Adamowicz
  - Institute for Computational Systems Biology, University of Hamburg, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
- Johannes R Schmidt
  - Department of Preclinical Development and Validation, Fraunhofer Institute for Cell Therapy and Immunology IZI, Perlickstr. 1, 04103 Leipzig, Germany
- Yuliya Burankova
  - Institute for Computational Systems Biology, University of Hamburg, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
  - Chair of Proteomics and Bioanalytics, TUM School of Life Sciences, Technical University of Munich, Emil-Erlenmeyer-Forum 5, 85354 Freising, Germany
- Olga Zolotareva
  - Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Maximus-von-Imhof Forum 3, 85354 Freising, Germany
  - Institute for Computational Systems Biology, University of Hamburg, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
- Olga Tsoy
  - Institute for Computational Systems Biology, University of Hamburg, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
  - Department of Computer Science, Vrije Universiteit Amsterdam, De Boelelaan 1111, 1081 HV, Amsterdam, The Netherlands
- Josch K Pauling
  - LipiTUM, TUM School of Life Sciences, Technical University of Munich, Maximus-von-Imhof Forum 3, 85354 Freising, Germany
  - Institute for Clinical Chemistry and Laboratory Medicine, University Hospital and Faculty of Medicine Carl Gustav Carus of the Dresden University of Technology, Fetscherstr. 74, 01307 Dresden, Germany
- Stefan Kalkhof
  - Department of Preclinical Development and Validation, Fraunhofer Institute for Cell Therapy and Immunology IZI, Perlickstr. 1, 04103 Leipzig, Germany
  - Fraunhofer Cluster of Excellence Immune-Mediated Diseases CIMD, Perlickstr. 1, 04103 Leipzig, Germany
  - Institute for Bioanalysis, University of Applied Science Coburg, Friedrich-Streib-Str. 2, 96450 Coburg, Germany
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
  - Department of Mathematics and Computer Science, University of Southern Denmark, Campusvej 55, 5230 Odense, Denmark
- Markus List
  - Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Maximus-von-Imhof Forum 3, 85354 Freising, Germany
  - Munich Data Science Institute (MDSI), Technical University of Munich, Walther-von-Dyck-Straße 10, 85748 Garching, Germany
- Tanja Laske
  - Institute for Computational Systems Biology, University of Hamburg, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
  - Viral Systems Modeling, Leibniz Institute of Virology, Martinistr. 52, 20251 Hamburg, Germany
2
Jiang Y, Rex DA, Schuster D, Neely BA, Rosano GL, Volkmar N, Momenzadeh A, Peters-Clarke TM, Egbert SB, Kreimer S, Doud EH, Crook OM, Yadav AK, Vanuopadath M, Hegeman AD, Mayta M, Duboff AG, Riley NM, Moritz RL, Meyer JG. Comprehensive Overview of Bottom-Up Proteomics Using Mass Spectrometry. ACS MEASUREMENT SCIENCE AU 2024; 4:338-417. [PMID: 39193565 PMCID: PMC11348894 DOI: 10.1021/acsmeasuresciau.3c00068] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/15/2023] [Revised: 05/03/2024] [Accepted: 05/03/2024] [Indexed: 08/29/2024]
Abstract
Proteomics is the large scale study of protein structure and function from biological systems through protein identification and quantification. "Shotgun proteomics" or "bottom-up proteomics" is the prevailing strategy, in which proteins are hydrolyzed into peptides that are analyzed by mass spectrometry. Proteomics studies can be applied to diverse studies ranging from simple protein identification to studies of proteoforms, protein-protein interactions, protein structural alterations, absolute and relative protein quantification, post-translational modifications, and protein stability. To enable this range of different experiments, there are diverse strategies for proteome analysis. The nuances of how proteomic workflows differ may be challenging to understand for new practitioners. Here, we provide a comprehensive overview of different proteomics methods. We cover from biochemistry basics and protein extraction to biological interpretation and orthogonal validation. We expect this Review will serve as a handbook for researchers who are new to the field of bottom-up proteomics.
Affiliation(s)
- Yuming Jiang
  - Department of Computational Biomedicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Devasahayam Arokia Balaya Rex
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- Dina Schuster
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich 8093, Switzerland
  - Department of Biology, Institute of Molecular Biology and Biophysics, ETH Zurich, Zurich 8093, Switzerland
  - Laboratory of Biomolecular Research, Division of Biology and Chemistry, Paul Scherrer Institute, Villigen 5232, Switzerland
- Benjamin A. Neely
  - Chemical Sciences Division, National Institute of Standards and Technology, NIST, Charleston, South Carolina 29412, United States
- Germán L. Rosano
  - Mass Spectrometry Unit, Institute of Molecular and Cellular Biology of Rosario, Rosario, 2000 Argentina
- Norbert Volkmar
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich 8093, Switzerland
- Amanda Momenzadeh
  - Department of Computational Biomedicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Trenton M. Peters-Clarke
  - Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, 94158, United States
- Susan B. Egbert
  - Department of Chemistry, University of Manitoba, Winnipeg, Manitoba, R3T 2N2 Canada
- Simion Kreimer
  - Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Emma H. Doud
  - Center for Proteome Analysis, Indiana University School of Medicine, Indianapolis, Indiana, 46202-3082, United States
- Oliver M. Crook
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
- Amit Kumar Yadav
  - Translational Health Science and Technology Institute, NCR Biotech Science Cluster 3rd Milestone Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
- Adrian D. Hegeman
  - Departments of Horticultural Science and Plant and Microbial Biology, University of Minnesota, Twin Cities, Minnesota 55108, United States
- Martín L. Mayta
  - School of Medicine and Health Sciences, Center for Health Sciences Research, Universidad Adventista del Plata, Libertador San Martin 3103, Argentina
  - Molecular Biology Department, School of Pharmacy and Biochemistry, Universidad Nacional de Rosario, Rosario 2000, Argentina
- Anna G. Duboff
  - Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
- Nicholas M. Riley
  - Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
- Robert L. Moritz
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Jesse G. Meyer
  - Department of Computational Biomedicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
3
Jiang Y, Rex DAB, Schuster D, Neely BA, Rosano GL, Volkmar N, Momenzadeh A, Peters-Clarke TM, Egbert SB, Kreimer S, Doud EH, Crook OM, Yadav AK, Vanuopadath M, Mayta ML, Duboff AG, Riley NM, Moritz RL, Meyer JG. Comprehensive Overview of Bottom-Up Proteomics using Mass Spectrometry. ARXIV 2023:arXiv:2311.07791v1. [PMID: 38013887 PMCID: PMC10680866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Proteomics is the large scale study of protein structure and function from biological systems through protein identification and quantification. "Shotgun proteomics" or "bottom-up proteomics" is the prevailing strategy, in which proteins are hydrolyzed into peptides that are analyzed by mass spectrometry. Proteomics studies can be applied to diverse studies ranging from simple protein identification to studies of proteoforms, protein-protein interactions, protein structural alterations, absolute and relative protein quantification, post-translational modifications, and protein stability. To enable this range of different experiments, there are diverse strategies for proteome analysis. The nuances of how proteomic workflows differ may be challenging to understand for new practitioners. Here, we provide a comprehensive overview of different proteomics methods to aid the novice and experienced researcher. We cover from biochemistry basics and protein extraction to biological interpretation and orthogonal validation. We expect this work to serve as a basic resource for new practitioners in the field of shotgun or bottom-up proteomics.
Affiliation(s)
- Yuming Jiang
  - Department of Computational Biomedicine, Cedars Sinai Medical Center
- Devasahayam Arokia Balaya Rex
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- Dina Schuster
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich 8093, Switzerland; Department of Biology, Institute of Molecular Biology and Biophysics, ETH Zurich, Zurich 8093, Switzerland; Laboratory of Biomolecular Research, Division of Biology and Chemistry, Paul Scherrer Institute, Villigen 5232, Switzerland
- Benjamin A. Neely
  - Chemical Sciences Division, National Institute of Standards and Technology, NIST Charleston · Funded by NIST
- Germán L. Rosano
  - Mass Spectrometry Unit, Institute of Molecular and Cellular Biology of Rosario, Rosario, Argentina · Funded by Grant PICT 2019-02971 (Agencia I+D+i)
- Norbert Volkmar
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich 8093, Switzerland
- Amanda Momenzadeh
  - Department of Computational Biomedicine, Cedars Sinai Medical Center, Los Angeles, California, USA
- Susan B. Egbert
  - Department of Chemistry, University of Manitoba, Winnipeg, Canada
- Simion Kreimer
  - Smidt Heart Institute, Cedars Sinai Medical Center; Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center
- Emma H. Doud
  - Center for Proteome Analysis, Indiana University School of Medicine, Indianapolis, Indiana, USA
- Oliver M. Crook
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
- Amit Kumar Yadav
  - Translational Health Science and Technology Institute · Funded by Grant BT/PR16456/BID/7/624/2016 (Department of Biotechnology, India); Grant Translational Research Program (TRP) at THSTI funded by DBT
- Muralidharan Vanuopadath
  - School of Biotechnology, Amrita Vishwa Vidyapeetham, Kollam-690 525, Kerala, India · Funded by Department of Health Research, Indian Council of Medical Research, Government of India (File No.R.12014/31/2022-HR)
- Martín L. Mayta
  - School of Medicine and Health Sciences, Center for Health Sciences Research, Universidad Adventista del Plata, Libertador San Martín 3103, Argentina; Molecular Biology Department, School of Pharmacy and Biochemistry, Universidad Nacional de Rosario, Rosario 2000, Argentina
- Anna G. Duboff
  - Department of Chemistry, University of Washington · Funded by Summer Research Acceleration Fellowship, Department of Chemistry, University of Washington
- Nicholas M. Riley
  - Department of Chemistry, University of Washington · Funded by National Institutes of Health Grant R00 GM147304
- Robert L. Moritz
  - Institute for Systems Biology, Seattle, WA, USA, 98109 · Funded by National Institutes of Health Grants R01GM087221, R24GM127667, U19AG023122, S10OD026936; National Science Foundation Award 1920268
- Jesse G. Meyer
  - Department of Computational Biomedicine, Cedars Sinai Medical Center · Funded by National Institutes of Health Grant R21 AG074234; National Institutes of Health Grant R35 GM142502
4
Bharucha T, Gangadharan B, Kumar A, Myall AC, Ayhan N, Pastorino B, Chanthongthip A, Vongsouvath M, Mayxay M, Sengvilaipaseuth O, Phonemixay O, Rattanavong S, O’Brien DP, Vendrell I, Fischer R, Kessler B, Turtle L, de Lamballerie X, Dubot-Pérès A, Newton PN, Zitzmann N, SEAe Consortium. Deep Proteomics Network and Machine Learning Analysis of Human Cerebrospinal Fluid in Japanese Encephalitis Virus Infection. J Proteome Res 2023; 22:1614-1629. [PMID: 37219084 PMCID: PMC10246887 DOI: 10.1021/acs.jproteome.2c00563] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/19/2022] [Indexed: 05/24/2023]
Abstract
Japanese encephalitis virus is a leading cause of neurological infection in the Asia-Pacific region with no means of detection in more remote areas. We aimed to test the hypothesis of a Japanese encephalitis (JE) protein signature in human cerebrospinal fluid (CSF) that could be harnessed in a rapid diagnostic test (RDT), contribute to understanding the host response and predict outcome during infection. Liquid chromatography and tandem mass spectrometry (LC-MS/MS), using extensive offline fractionation and tandem mass tag labeling (TMT), enabled comparison of the deep CSF proteome in JE vs other confirmed neurological infections (non-JE). Verification was performed using data-independent acquisition (DIA) LC-MS/MS. 5,070 proteins were identified, including 4,805 human proteins and 265 pathogen proteins. Feature selection and predictive modeling using TMT analysis of 147 patient samples enabled the development of a nine-protein JE diagnostic signature. This was tested using DIA analysis of an independent group of 16 patient samples, demonstrating 82% accuracy. Ultimately, validation in a larger group of patients and different locations could help refine the list to 2-3 proteins for an RDT. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD034789 and 10.6019/PXD034789.
Affiliation(s)
- Tehmina Bharucha
  - Department of Biochemistry, University of Oxford, OX1 3QU, Oxford, U.K.
  - Kavli Institute for Nanoscience Discovery, University of Oxford, OX1 3QU, Oxford, U.K.
  - Lao-Oxford-Mahosot Hospital-Wellcome Trust Research Unit (LOMWRU), Microbiology Laboratory, Mahosot Hospital, Vientiane, 0100 Lao PDR
- Bevin Gangadharan
  - Department of Biochemistry, University of Oxford, OX1 3QU, Oxford, U.K.
  - Kavli Institute for Nanoscience Discovery, University of Oxford, OX1 3QU, Oxford, U.K.
- Abhinav Kumar
  - Department of Biochemistry, University of Oxford, OX1 3QU, Oxford, U.K.
  - Kavli Institute for Nanoscience Discovery, University of Oxford, OX1 3QU, Oxford, U.K.
- Ashleigh C. Myall
  - Department of Infectious Disease, Imperial College London, London W12 0NN, U.K.
  - Department of Mathematics, Imperial College London, London W12 0NN, U.K.
- Nazli Ayhan
  - Unité Des Virus Emergents UVE, Aix Marseille Univ, IRD190, INSERM 1207, IHU Méditerranée Infection, Marseille 13005, France
- Boris Pastorino
  - Unité Des Virus Emergents UVE, Aix Marseille Univ, IRD190, INSERM 1207, IHU Méditerranée Infection, Marseille 13005, France
- Anisone Chanthongthip
  - Lao-Oxford-Mahosot Hospital-Wellcome Trust Research Unit (LOMWRU), Microbiology Laboratory, Mahosot Hospital, Vientiane, 0100 Lao PDR
- Manivanh Vongsouvath
  - Lao-Oxford-Mahosot Hospital-Wellcome Trust Research Unit (LOMWRU), Microbiology Laboratory, Mahosot Hospital, Vientiane, 0100 Lao PDR
- Mayfong Mayxay
  - Lao-Oxford-Mahosot Hospital-Wellcome Trust Research Unit (LOMWRU), Microbiology Laboratory, Mahosot Hospital, Vientiane, 0100 Lao PDR
  - Institute of Research and Education Development (IRED), University of Health Sciences, Ministry of Health, Vientiane 43130, Lao PDR
  - Centre for Tropical Medicine & Global Health, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7LG, U.K.
- Onanong Sengvilaipaseuth
  - Lao-Oxford-Mahosot Hospital-Wellcome Trust Research Unit (LOMWRU), Microbiology Laboratory, Mahosot Hospital, Vientiane, 0100 Lao PDR
- Ooyanong Phonemixay
  - Lao-Oxford-Mahosot Hospital-Wellcome Trust Research Unit (LOMWRU), Microbiology Laboratory, Mahosot Hospital, Vientiane, 0100 Lao PDR
- Sayaphet Rattanavong
  - Lao-Oxford-Mahosot Hospital-Wellcome Trust Research Unit (LOMWRU), Microbiology Laboratory, Mahosot Hospital, Vientiane, 0100 Lao PDR
- Darragh P. O’Brien
  - Target Discovery Institute, Centre for Medicines Discovery, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, U.K.
- Iolanda Vendrell
  - Target Discovery Institute, Centre for Medicines Discovery, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, U.K.
  - Chinese Academy of Medical Sciences Oxford Institute, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, U.K.
- Roman Fischer
  - Target Discovery Institute, Centre for Medicines Discovery, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, U.K.
  - Chinese Academy of Medical Sciences Oxford Institute, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, U.K.
- Benedikt Kessler
  - Target Discovery Institute, Centre for Medicines Discovery, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, U.K.
  - Chinese Academy of Medical Sciences Oxford Institute, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, U.K.
- Lance Turtle
  - Institute of Infection, Veterinary and Ecological Sciences, Faculty of Health and Life Sciences, University of Liverpool, Liverpool L69 7BE, U.K.
  - Tropical and Infectious Disease Unit, Liverpool University Hospitals NHS Foundation Trust (Member of Liverpool Health Partners), Liverpool L69 7BE, U.K.
- Xavier de Lamballerie
  - Unité Des Virus Emergents UVE, Aix Marseille Univ, IRD190, INSERM 1207, IHU Méditerranée Infection, Marseille 13005, France
- Audrey Dubot-Pérès
  - Lao-Oxford-Mahosot Hospital-Wellcome Trust Research Unit (LOMWRU), Microbiology Laboratory, Mahosot Hospital, Vientiane, 0100 Lao PDR
  - Unité Des Virus Emergents UVE, Aix Marseille Univ, IRD190, INSERM 1207, IHU Méditerranée Infection, Marseille 13005, France
  - Centre for Tropical Medicine & Global Health, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7LG, U.K.
- Paul N. Newton
  - Lao-Oxford-Mahosot Hospital-Wellcome Trust Research Unit (LOMWRU), Microbiology Laboratory, Mahosot Hospital, Vientiane, 0100 Lao PDR
  - Centre for Tropical Medicine & Global Health, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7LG, U.K.
- Nicole Zitzmann
  - Department of Biochemistry, University of Oxford, OX1 3QU, Oxford, U.K.
  - Kavli Institute for Nanoscience Discovery, University of Oxford, OX1 3QU, Oxford, U.K.
- SEAe Consortium
  - Biology of Infection Unit, Institut Pasteur, 75015 Paris France
5
Chion M, Carapito C, Bertrand F. Accounting for multiple imputation-induced variability for differential analysis in mass spectrometry-based label-free quantitative proteomics. PLoS Comput Biol 2022; 18:e1010420. [PMID: 36037245 PMCID: PMC9462777 DOI: 10.1371/journal.pcbi.1010420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2022] [Revised: 09/09/2022] [Accepted: 07/21/2022] [Indexed: 11/20/2022]
Abstract
Imputing missing values is common practice in label-free quantitative proteomics. Imputation aims at replacing a missing value with a user-defined one. However, the imputation itself may not be optimally considered downstream of the imputation process, as imputed datasets are often considered as if they had always been complete. Hence, the uncertainty due to the imputation is not adequately taken into account. We provide a rigorous multiple imputation strategy, leading to a less biased estimation of the parameters' variability thanks to Rubin's rules. The imputation-based peptide's intensities' variance estimator is then moderated using Bayesian hierarchical models. This estimator is finally included in moderated t-test statistics to provide differential analyses results. This workflow can be used both at peptide and protein-level in quantification datasets. Indeed, an aggregation step is included for protein-level results based on peptide-level quantification data. Our methodology, named mi4p, was compared to the state-of-the-art limma workflow implemented in the DAPAR R package, both on simulated and real datasets. We observed a trade-off between sensitivity and specificity, while the overall performance of mi4p outperforms DAPAR in terms of F-Score.
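The pooling step the abstract attributes to Rubin's rules can be illustrated with a minimal sketch. This is not the mi4p implementation; the function name `pool_rubin` and the input values are hypothetical, and the sketch only shows the standard rule for combining a point estimate and its variance across m imputed datasets (within-imputation variance plus inflated between-imputation variance).

```python
from statistics import mean


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and variances with Rubin's rules.

    estimates: one point estimate per imputed dataset
    variances: the squared standard error of each estimate
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")

    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    # Between-imputation variance: spread of the estimates across imputations.
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    # Total variance adds the between-imputation term, inflated by (1 + 1/m).
    total = within + (1 + 1 / m) * between
    return pooled, total


if __name__ == "__main__":
    # Hypothetical log-intensity means for one peptide from three imputed datasets.
    est, var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(est, var)
```

Treating an imputed dataset as if it had always been complete corresponds to using only the within-imputation term, which is why the total variance here is never smaller than that naive value.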
Affiliation(s)
- Marie Chion
  - Institut de Recherche Mathématique Avancée, UMR 7501, CNRS-Université de Strasbourg, Strasbourg, France
  - Laboratoire de Spectrométrie de Masse Bio-Organique, Institut Pluridisciplinaire Hubert Curien, UMR 7178, CNRS-Université de Strasbourg, Strasbourg, France
  - Laboratoire Mathématiques appliquées à Paris 5, UMR 8145, CNRS-Université Paris Cité, Paris, France
- Christine Carapito
  - Laboratoire de Spectrométrie de Masse Bio-Organique, Institut Pluridisciplinaire Hubert Curien, UMR 7178, CNRS-Université de Strasbourg, Strasbourg, France
  - Infrastructure Nationale de Protéomique ProFi - FR2048, 67087 Strasbourg, France
- Frédéric Bertrand
  - Institut de Recherche Mathématique Avancée, UMR 7501, CNRS-Université de Strasbourg, Strasbourg, France
  - Laboratoire Informatique et Société Numérique, Université de Technologie de Troyes, Troyes, France
6
Čuklina J, Lee CH, Williams EG, Sajic T, Collins BC, Rodríguez Martínez M, Sharma VS, Wendt F, Goetze S, Keele GR, Wollscheid B, Aebersold R, Pedrioli PGA. Diagnostics and correction of batch effects in large-scale proteomic studies: a tutorial. Mol Syst Biol 2021; 17:e10240. [PMID: 34432947 PMCID: PMC8447595 DOI: 10.15252/msb.202110240] [Citation(s) in RCA: 75] [Impact Index Per Article: 18.8] [Received: 01/22/2021] [Revised: 07/16/2021] [Accepted: 07/26/2021] [Indexed: 12/11/2022]
Abstract
Advancements in mass spectrometry-based proteomics have enabled experiments encompassing hundreds of samples. While these large sample sets deliver much-needed statistical power, handling them introduces technical variability known as batch effects. Here, we present a step-by-step protocol for the assessment, normalization, and batch correction of proteomic data. We review established methodologies from related fields and describe solutions specific to proteomic challenges, such as ion intensity drift and missing values in quantitative feature matrices. Finally, we compile a set of techniques that enable control of batch effect adjustment quality. We provide an R package, "proBatch", containing functions required for each step of the protocol. We demonstrate the utility of this methodology on five proteomic datasets each encompassing hundreds of samples and consisting of multiple experimental designs. In conclusion, we provide guidelines and tools to make the extraction of true biological signal from large proteomic studies more robust and transparent, ultimately facilitating reliable and reproducible research in clinical proteomics and systems biology.
Affiliation(s)
- Jelena Čuklina
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - PhD Program in Systems Biology, University of Zurich and ETH Zurich, Zurich, Switzerland
  - IBM Research Europe, Rüschlikon, Switzerland
- Chloe H Lee
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Evan G Williams
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Luxembourg, Luxembourg
- Tatjana Sajic
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Ben C Collins
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - Queen’s University Belfast, Belfast, UK
- Varun S Sharma
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Fabian Wendt
  - Department of Health Sciences and Technology, Institute of Translational Medicine, ETH Zurich, Zurich, Switzerland
- Sandra Goetze
  - Department of Health Sciences and Technology, Institute of Translational Medicine, ETH Zurich, Zurich, Switzerland
  - ETH Zürich, PHRT-CPAC, Zürich, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Bernd Wollscheid
  - Department of Health Sciences and Technology, Institute of Translational Medicine, ETH Zurich, Zurich, Switzerland
  - ETH Zürich, PHRT-CPAC, Zürich, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Ruedi Aebersold
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - Faculty of Science, University of Zurich, Zurich, Switzerland
- Patrick G A Pedrioli
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - Department of Health Sciences and Technology, Institute of Translational Medicine, ETH Zurich, Zurich, Switzerland
  - ETH Zürich, PHRT-CPAC, Zürich, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland