1
Kardell O, Gronauer T, von Toerne C, Merl-Pham J, König AC, Barth TK, Mergner J, Ludwig C, Tüshaus J, Giesbertz P, Breimann S, Schweizer L, Müller T, Kliewer G, Distler U, Gomez-Zepeda D, Popp O, Qin D, Teupser D, Cox J, Imhof A, Küster B, Lichtenthaler SF, Krijgsveld J, Tenzer S, Mertins P, Coscia F, Hauck SM. Multicenter Longitudinal Quality Assessment of MS-Based Proteomics in Plasma and Serum. J Proteome Res 2025; 24:1017-1029. [PMID: 39918541 PMCID: PMC11894660 DOI: 10.1021/acs.jproteome.4c00644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2024] [Revised: 12/14/2024] [Accepted: 01/15/2025] [Indexed: 03/08/2025]
Abstract
Advancing MS-based proteomics toward clinical applications revolves around developing standardized start-to-finish and fit-for-purpose workflows for clinical specimens. Steps along the method design involve the determination and optimization of several bioanalytical parameters such as selectivity, sensitivity, accuracy, and precision. In a joint effort, eight proteomics laboratories belonging to the MSCoreSys initiative including the CLINSPECT-M, MSTARS, DIASyM, and SMART-CARE consortia performed a longitudinal round-robin study to assess the analysis performance of plasma and serum as clinically relevant samples. A variety of LC-MS/MS setups including mass spectrometer models from ThermoFisher and Bruker as well as LC systems from ThermoFisher, Evosep, and Waters Corporation were used in this study. As key performance indicators, sensitivity, precision, and reproducibility were monitored over time. Protein identifications range between 300 and 400 IDs across different state-of-the-art MS instruments, with timsTOF Pro, Orbitrap Exploris 480, and Q Exactive HF-X being among the top performers. Overall, 71 proteins are reproducibly detectable in all setups in both serum and plasma samples, and 22 of these proteins are FDA-approved biomarkers, which are reproducibly quantified (CV < 20% with label-free quantification). In total, the round-robin study highlights a promising baseline for bringing MS-based measurements of serum and plasma samples closer to clinical utility.
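The reproducibility criterion quoted in the abstract (CV < 20%) refers to the coefficient of variation, that is, the standard deviation of repeated measurements divided by their mean. The following is a minimal sketch of such a filter, assuming the sample standard deviation and entirely hypothetical intensity values and protein names; it is not the authors' analysis pipeline.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined for a mean of zero")
    return stdev(values) / m * 100.0


def reproducibly_quantified(intensities, threshold=20.0):
    """Return {protein: CV} for proteins whose CV is below the threshold.

    Proteins with fewer than two measurements are skipped because a
    sample standard deviation cannot be computed for them.
    """
    result = {}
    for protein, values in intensities.items():
        if len(values) < 2:
            continue
        cv = coefficient_of_variation(values)
        if cv < threshold:
            result[protein] = cv
    return result


# Hypothetical label-free intensities, for illustration only (not study data)
example = {
    "PROTEIN_A": [100.0, 110.0, 90.0, 105.0],
    "PROTEIN_B": [100.0, 200.0, 50.0, 150.0],
}
passing = reproducibly_quantified(example)
```

With these made-up numbers, only `PROTEIN_A` falls below the 20% threshold. In practice the CV would typically be computed per protein across the repeated measurements of one setup, on non-log-transformed intensities.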
Affiliation(s)
- Oliver Kardell
  - Metabolomics and Proteomics Core, Helmholtz Zentrum München, German Research Center for Environmental Health, Munich 80939, Germany
- Thomas Gronauer
  - Metabolomics and Proteomics Core, Helmholtz Zentrum München, German Research Center for Environmental Health, Munich 80939, Germany
- Christine von Toerne
  - Metabolomics and Proteomics Core, Helmholtz Zentrum München, German Research Center for Environmental Health, Munich 80939, Germany
- Juliane Merl-Pham
  - Metabolomics and Proteomics Core, Helmholtz Zentrum München, German Research Center for Environmental Health, Munich 80939, Germany
- Ann-Christine König
  - Metabolomics and Proteomics Core, Helmholtz Zentrum München, German Research Center for Environmental Health, Munich 80939, Germany
- Teresa K. Barth
  - Clinical Protein Analysis Unit (ClinZfP), Biomedical Center, Faculty of Medicine, LMU Munich, 82152 Martinsried, Germany
- Julia Mergner
  - Bavarian Center for Biomolecular Mass Spectrometry at Klinikum rechts der Isar (BayBioMS@MRI), Technical University of Munich, 80333 Munich, Germany
- Christina Ludwig
  - Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), School of Life Sciences, Technical University of Munich, 85354 Freising, Germany
- Johanna Tüshaus
  - Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), School of Life Sciences, Technical University of Munich, 85354 Freising, Germany
- Pieter Giesbertz
  - German Center for Neurodegenerative Diseases (DZNE) Munich, DZNE, Munich 81377, Germany
  - Proteomics and Bioanalytics, School of Life Sciences, Technical University of Munich, 85354 Freising, Germany
- Stephan Breimann
  - German Center for Neurodegenerative Diseases (DZNE) Munich, DZNE, Munich 81377, Germany
- Lisa Schweizer
  - Department of Proteomics and Signal Transduction, Max-Planck Institute of Biochemistry, Martinsried 82152, Germany
- Torsten Müller
  - German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
  - Medical Faculty, Heidelberg University, 69120 Heidelberg, Germany
- Georg Kliewer
  - German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
  - Medical Faculty, Heidelberg University, 69120 Heidelberg, Germany
- Ute Distler
  - Institute for Immunology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz 55131, Germany
- David Gomez-Zepeda
  - German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
  - Institute for Immunology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz 55131, Germany
  - Immunoproteomics Unit, Helmholtz-Institute for Translational Oncology (HI-TRON) Mainz, 55131 Mainz, Germany
- Oliver Popp
  - Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Di Qin
  - Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Daniel Teupser
  - Institute of Laboratory Medicine, University Hospital, LMU Munich, 81377 Munich, Germany
- Jürgen Cox
  - Computational Systems Biochemistry Research Group, Max-Planck Institute of Biochemistry, Martinsried 82152, Germany
- Axel Imhof
  - Clinical Protein Analysis Unit (ClinZfP), Biomedical Center, Faculty of Medicine, LMU Munich, 82152 Martinsried, Germany
- Bernhard Küster
  - Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), School of Life Sciences, Technical University of Munich, 85354 Freising, Germany
  - Proteomics and Bioanalytics, School of Life Sciences, Technical University of Munich, 85354 Freising, Germany
- Stefan F. Lichtenthaler
  - German Center for Neurodegenerative Diseases (DZNE) Munich, DZNE, Munich 81377, Germany
  - Neuroproteomics, School of Medicine and Health, Klinikum rechts der Isar, Technical University of Munich, 81675 Munich, Germany
  - Munich Cluster for Systems Neurology (SyNergy), 81377 Munich, Germany
- Jeroen Krijgsveld
  - German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
  - Medical Faculty, Heidelberg University, 69120 Heidelberg, Germany
- Stefan Tenzer
  - Institute for Immunology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz 55131, Germany
  - Immunoproteomics Unit, Helmholtz-Institute for Translational Oncology (HI-TRON) Mainz, 55131 Mainz, Germany
- Philipp Mertins
  - Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Fabian Coscia
  - Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Stefanie M. Hauck
  - Metabolomics and Proteomics Core, Helmholtz Zentrum München, German Research Center for Environmental Health, Munich 80939, Germany
2
Liu X, Sun H, Hou X, Sun J, Tang M, Zhang YB, Zhang Y, Sun W, Liu C. Standard operating procedure combined with comprehensive quality control system for multiple LC-MS platforms urinary proteomics. Nat Commun 2025; 16:1051. [PMID: 39865094 PMCID: PMC11770173 DOI: 10.1038/s41467-025-56337-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2024] [Accepted: 01/16/2025] [Indexed: 01/28/2025] [Open Access]
Abstract
Urinary proteomics is emerging as a potent tool for detecting sensitive and non-invasive biomarkers. At present, the comparability of urinary proteomics data across diverse liquid chromatography-mass spectrometry (LC-MS) platforms remains an area that requires investigation. In this study, we conduct a comprehensive evaluation of the urinary proteome across multiple LC-MS platforms. To systematically analyze and assess the quality of large-scale urinary proteomics data, we develop a comprehensive quality control (QC) system named MSCohort, which extracts 81 metrics for quality evaluation of individual experiments and of the whole cohort. Additionally, we present a standard operating procedure (SOP) for high-throughput urinary proteome analysis based on the MSCohort QC system. Our study involves 20 LC-MS platforms and reveals that, when combined with a comprehensive QC system and a unified SOP, the data generated by the data-independent acquisition (DIA) workflow in urine QC samples exhibit high robustness, sensitivity, and reproducibility across multiple LC-MS platforms. Furthermore, we apply this SOP to hybrid benchmarking samples and to a clinical colorectal cancer (CRC) urinary proteome cohort comprising 527 experiments. Across three different LC-MS platforms, the analyses report high quantitative reproducibility and consistent disease patterns. This work lays the groundwork for large-scale clinical urinary proteomics studies spanning multiple platforms, paving the way for precision medicine research.
Grants
- 82170524 National Natural Science Foundation of China (National Science Foundation of China)
- 31901039 National Natural Science Foundation of China (National Science Foundation of China)
- 32171442 National Natural Science Foundation of China (National Science Foundation of China)
- This work was supported by grants from the National Key Research and Development Program of China (2021YFA1301602, 2021YFA1301603, 2024YFA1307201 to C.L.), the National Natural Science Foundation of China (32171442 and 92474115 to C.L., 82170524 and 31901039 to W.S.), the Fundamental Research Funds for Central Universities, Beijing Municipal Public Welfare Development and Reform Pilot Project for Medical Research Institutes (JYY2018-7), CAMS Innovation Fund for Medical Sciences (2021-I2M-1-016, 2022-I2M-1-020), Beijing Natural Science Foundation-Daxing Innovation Joint Fund (L246002) and Biologic Medicine Information Center of China, National Scientific Data Sharing Platform for Population and Health.
Affiliation(s)
- Xiang Liu
  - School of Biological Science and Medical Engineering & School of Engineering Medicine, Beihang University, Beijing, China
- Haidan Sun
  - Proteomics Center, Core Facility of Instrument, Institute of Basic Medical Sciences Chinese Academy of Medical Sciences, School of Basic Medicine Peking Union Medical College, Beijing, China
- Xinhang Hou
  - School of Biological Science and Medical Engineering & School of Engineering Medicine, Beihang University, Beijing, China
- Jiameng Sun
  - Proteomics Center, Core Facility of Instrument, Institute of Basic Medical Sciences Chinese Academy of Medical Sciences, School of Basic Medicine Peking Union Medical College, Beijing, China
- Min Tang
  - School of Biological Science and Medical Engineering & School of Engineering Medicine, Beihang University, Beijing, China
- Yong-Biao Zhang
  - School of Biological Science and Medical Engineering & School of Engineering Medicine, Beihang University, Beijing, China
- Yongqian Zhang
  - School of Medical Technology, Beijing Institute of Technology, Beijing, China
- Wei Sun
  - Proteomics Center, Core Facility of Instrument, Institute of Basic Medical Sciences Chinese Academy of Medical Sciences, School of Basic Medicine Peking Union Medical College, Beijing, China
- Chao Liu
  - School of Biological Science and Medical Engineering & School of Engineering Medicine, Beihang University, Beijing, China
3
Seo Y, Kang I, Lee HJ, Hwang J, Kwak SH, Oh MK, Lee H, Min H. Simple and robust high-throughput serum proteomics workflow with low-microflow LC-MS/MS. Anal Bioanal Chem 2024; 416:7007-7018. [PMID: 39422715 PMCID: PMC11579186 DOI: 10.1007/s00216-024-05603-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2024] [Revised: 10/08/2024] [Accepted: 10/10/2024] [Indexed: 10/19/2024]
Abstract
Clinical proteomics has substantially advanced in identifying and quantifying proteins from biofluids, such as blood, contributing to the discovery of biomarkers. The throughput and reproducibility of serum proteomics for large-scale clinical sample analyses require improvements. High-throughput analysis typically relies on automated equipment, which can be costly and has limited accessibility. In this study, we present a rapid, high-throughput low-microflow LC-MS/MS workflow that does not require automation. This workflow was optimized to minimize the preparation time and costs by omitting the depletion and desalting steps. The developed method was applied to data-independent acquisition (DIA) analysis of 235 samples, and it consistently yielded approximately 6000 peptides and 600 protein groups, including 33 FDA-approved biomarkers. Our results demonstrate that an 18-min DIA high-throughput workflow, assessed through intermittently collected quality control samples, ensures reproducibility and stability even with 2 µL of serum. It was successfully used to analyze serum samples from patients with diabetes and chronic kidney disease (CKD), and could identify five dysregulated proteins across various CKD stages.
Affiliation(s)
- Yoondam Seo
  - Doping Control Center, Korea Institute of Science and Technology (KIST), Hwarang-Ro 14-Gil 5, Seongbuk-Gu, Seoul, 02792, Republic of Korea
  - Department of Chemical and Biological Engineering, Korea University, Seoul, 02841, Republic of Korea
- Inseon Kang
  - Doping Control Center, Korea Institute of Science and Technology (KIST), Hwarang-Ro 14-Gil 5, Seongbuk-Gu, Seoul, 02792, Republic of Korea
- Hyeon-Jeong Lee
  - Doping Control Center, Korea Institute of Science and Technology (KIST), Hwarang-Ro 14-Gil 5, Seongbuk-Gu, Seoul, 02792, Republic of Korea
- Jiin Hwang
  - Doping Control Center, Korea Institute of Science and Technology (KIST), Hwarang-Ro 14-Gil 5, Seongbuk-Gu, Seoul, 02792, Republic of Korea
- Soo Heon Kwak
  - Department of Internal Medicine, Seoul National University Hospital and Seoul National University College of Medicine, Seoul, 03080, Republic of Korea
- Min-Kyu Oh
  - Department of Chemical and Biological Engineering, Korea University, Seoul, 02841, Republic of Korea
- Hyunbeom Lee
  - Center for Advanced Biomolecular Recognition, Korea Institute of Science and Technology (KIST), Hwarang-Ro 14-Gil 5, Seongbuk-Gu, Seoul, 02792, Republic of Korea
  - Department of HY-KIST Bio-Convergence, Hanyang University, Seoul, 04763, Republic of Korea
- Hophil Min
  - Doping Control Center, Korea Institute of Science and Technology (KIST), Hwarang-Ro 14-Gil 5, Seongbuk-Gu, Seoul, 02792, Republic of Korea
  - Division of Bio-Medical Science & Technology, KIST School, University of Science and Technology, Seoul, 02792, Republic of Korea
4
Cai Z, Apolinário S, Baião AR, Pacini C, Sousa MD, Vinga S, Reddel RR, Robinson PJ, Garnett MJ, Zhong Q, Gonçalves E. Synthetic augmentation of cancer cell line multi-omic datasets using unsupervised deep learning. Nat Commun 2024; 15:10390. [PMID: 39614072 PMCID: PMC11607321 DOI: 10.1038/s41467-024-54771-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Accepted: 11/18/2024] [Indexed: 12/01/2024] [Open Access]
Abstract
Integrating diverse types of biological data is essential for a holistic understanding of cancer biology, yet it remains challenging due to data heterogeneity, complexity, and sparsity. Addressing this, our study introduces an unsupervised deep learning model, MOSA (Multi-Omic Synthetic Augmentation), specifically designed to integrate and augment the Cancer Dependency Map (DepMap). Harnessing orthogonal multi-omic information, this model successfully generates molecular and phenotypic profiles, resulting in an increase of 32.7% in the number of multi-omic profiles and thereby generating a complete DepMap for 1523 cancer cell lines. The synthetically enhanced data increases statistical power, uncovering less studied mechanisms associated with drug resistance, and refines the identification of genetic associations and clustering of cancer cell lines. By applying SHapley Additive exPlanations (SHAP) for model interpretation, MOSA reveals multi-omic features essential for cell clustering and biomarker identification related to drug and gene dependencies. This understanding is crucial for developing much-needed effective strategies to prioritize cancer targets.
Affiliation(s)
- Zhaoxiang Cai
  - ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, NSW, Australia
- Sofia Apolinário
  - INESC-ID, 1000-029, Lisboa, Portugal
  - Instituto Superior Técnico (IST), Universidade de Lisboa, 1049-001, Lisboa, Portugal
- Ana R Baião
  - INESC-ID, 1000-029, Lisboa, Portugal
  - Instituto Superior Técnico (IST), Universidade de Lisboa, 1049-001, Lisboa, Portugal
- Clare Pacini
  - Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, CB10 1SA, UK
- Miguel D Sousa
  - INESC-ID, 1000-029, Lisboa, Portugal
  - Instituto Superior Técnico (IST), Universidade de Lisboa, 1049-001, Lisboa, Portugal
- Susana Vinga
  - INESC-ID, 1000-029, Lisboa, Portugal
  - Instituto Superior Técnico (IST), Universidade de Lisboa, 1049-001, Lisboa, Portugal
- Roger R Reddel
  - ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, NSW, Australia
- Phillip J Robinson
  - ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, NSW, Australia
- Mathew J Garnett
  - Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, CB10 1SA, UK
- Qing Zhong
  - ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, NSW, Australia
- Emanuel Gonçalves
  - INESC-ID, 1000-029, Lisboa, Portugal
  - Instituto Superior Técnico (IST), Universidade de Lisboa, 1049-001, Lisboa, Portugal
5
Liu Y, Mei L, Liang C, Zhong CQ, Tong M, Yu R. Cross-Run Hybrid Features Improve the Identification of Data-Independent Acquisition Proteomics. ACS Omega 2024; 9:46362-46372. [PMID: 39583733 PMCID: PMC11579728 DOI: 10.1021/acsomega.4c07398] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2024] [Revised: 09/25/2024] [Accepted: 10/02/2024] [Indexed: 11/26/2024]
Abstract
The analysis of data-independent acquisition (DIA) mass spectrometry data is crucial for comprehensive proteomics studies. However, traditional single-run methods often fall short in terms of identification depth and consistency. We present HFDiscrim, a specialized multirun DIA analysis tool aimed at enhancing the depth and consistency of reliable peptide identifications of DIA analysis tools. HFDiscrim was extensively benchmarked on multiple data sets, including the MCB data set, the ccRCC data set, and a three-species benchmark mixture. Compared to PyProphet, HFDiscrim identified 22.04% more precursors, 19.1% more peptides, and 13.2% more proteins while maintaining a controllable false discovery rate. Furthermore, HFDiscrim demonstrated higher identification rates and improved reproducibility across multiple runs. HFDiscrim is publicly available at https://github.com/yachliu/HFDiscrim.
Affiliation(s)
- Yachen Liu
  - School of Informatics, Xiamen University, Xiamen, Fujian 361000, China
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 361102, China
- Longfei Mei
  - School of Informatics, Xiamen University, Xiamen, Fujian 361000, China
- Chenyu Liang
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 361102, China
- Chuan-Qi Zhong
  - School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Mengsha Tong
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 361102, China
  - School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Rongshan Yu
  - School of Informatics, Xiamen University, Xiamen, Fujian 361000, China
  - National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 361102, China
  - Aginome Scientific, Xiamen, Fujian 361005, China
6
Salahub C, Uhlmann J. Optimal Structured Matrix Approximation for Robustness to Incomplete Biosequence Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:2592-2597. [PMID: 38949937 DOI: 10.1109/tcbb.2024.3420903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/03/2024]
Abstract
We propose a general method for optimally approximating an arbitrary matrix by a structured matrix (circulant, Toeplitz/Hankel, etc.) and examine its use for estimating the spectra of genomic linkage disequilibrium matrices. This application is prototypical of a variety of genomic and proteomic problems that demand robustness to incomplete biosequence information. We perform a simulation study and corroborative test of our method using real genomic data from the Mouse Genome Database (Bult et al., 2019). The results confirm the predicted utility of the method and provide strong evidence of its potential value to a wide range of bioinformatics applications. Our optimal general matrix approximation method is expected to be of independent interest to an even broader range of applications in applied mathematics and engineering.
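To make the idea of structured matrix approximation concrete: under one common notion of optimality, the Frobenius norm, the nearest Toeplitz matrix to a given matrix is obtained by replacing every diagonal with its mean. The abstract does not state which optimality criterion or algorithm the authors use, so the sketch below illustrates only this textbook special case, with a hypothetical function name and input, and should not be read as their method.

```python
def nearest_toeplitz(matrix):
    """Return the Frobenius-nearest Toeplitz matrix to `matrix`.

    A Toeplitz matrix is constant along each diagonal, so the least-squares
    fit replaces every entry by the mean of the diagonal it lies on.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    sums, counts = {}, {}
    for i in range(n_rows):
        for j in range(n_cols):
            k = j - i  # diagonal index: 0 is the main diagonal
            sums[k] = sums.get(k, 0.0) + matrix[i][j]
            counts[k] = counts.get(k, 0) + 1
    return [
        [sums[j - i] / counts[j - i] for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Hypothetical 3x3 input, for illustration only
a = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]
t = nearest_toeplitz(a)
```

Here the main diagonal (1, 5, 9) is replaced by its mean 5, the first superdiagonal (2, 6) by 4, and so on. An analogous averaging over anti-diagonals gives the Frobenius-nearest Hankel matrix.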
7
Tsantilas KA, Merrihew GE, Robbins JE, Johnson RS, Park J, Plubell DL, Canterbury JD, Huang E, Riffle M, Sharma V, MacLean BX, Eckels J, Wu CC, Bereman MS, Spencer SE, Hoofnagle AN, MacCoss MJ. A Framework for Quality Control in Quantitative Proteomics. J Proteome Res 2024; 23:4392-4408. [PMID: 39248652 PMCID: PMC11973981 DOI: 10.1021/acs.jproteome.4c00363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/10/2024]
Abstract
A thorough evaluation of the quality, reproducibility, and variability of bottom-up proteomics data is necessary at every stage of a workflow, from planning to analysis. We share vignettes applying adaptable quality control (QC) measures to assess sample preparation, system function, and quantitative analysis. System suitability samples are repeatedly measured longitudinally with targeted methods, and we share examples where they are used on three instrument platforms to identify severe system failures and track function over months to years. Internal QCs incorporated at the protein and peptide levels allow our team to assess sample preparation issues and to differentiate system failures from sample-specific issues. External QC samples prepared alongside our experimental samples are used to verify the consistency and quantitative potential of our results during batch correction and normalization before assessing biological phenotypes. We combine these controls with rapid analysis (Skyline), longitudinal QC metrics (AutoQC), and server-based data deposition (PanoramaWeb). We propose that this integrated approach to QC is a useful starting point for groups to facilitate rapid quality control assessment to ensure that valuable instrument time is used to collect the best quality data possible. Data are available on Panorama Public and ProteomeXchange under the identifier PXD051318.
Affiliation(s)
- Kristine A. Tsantilas
  - Department of Genome Sciences, University of Washington, Washington 98195, United States
- Gennifer E. Merrihew
  - Department of Genome Sciences, University of Washington, Washington 98195, United States
- Julia E. Robbins
  - Department of Genome Sciences, University of Washington, Washington 98195, United States
- Richard S. Johnson
  - Department of Genome Sciences, University of Washington, Washington 98195, United States
- Jea Park
  - Department of Genome Sciences, University of Washington, Washington 98195, United States
- Deanna L. Plubell
  - Department of Genome Sciences, University of Washington, Washington 98195, United States
- Jesse D. Canterbury
  - Thermo Fisher Scientific, 355 River Oaks Parkway, San Jose, California 95134, United States
- Eric Huang
  - Department of Genome Sciences, University of Washington, Washington 98195, United States
- Michael Riffle
  - Department of Biochemistry, University of Washington, Washington 98195, United States
- Vagisha Sharma
  - Department of Genome Sciences, University of Washington, Washington 98195, United States
- Brendan X. MacLean
  - Department of Genome Sciences, University of Washington, Washington 98195, United States
- Josh Eckels
  - LabKey, 500 Union St #1000, Seattle, Washington 98101, United States
- Christine C. Wu
  - Department of Genome Sciences, University of Washington, Washington 98195, United States
- Michael S. Bereman
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27607
- Sandra E. Spencer
  - Canada's Michael Smith Genome Sciences Centre (BC Cancer Research Institute), University of British Columbia, Vancouver, British Columbia V5Z 4S6, Canada
- Andrew N. Hoofnagle
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, Washington 98195, United States
- Michael J. MacCoss
  - Department of Genome Sciences, University of Washington, Washington 98195, United States
8
Yu Y, Mai Y, Zheng Y, Shi L. Assessing and mitigating batch effects in large-scale omics studies. Genome Biol 2024; 25:254. [PMID: 39363244 PMCID: PMC11447944 DOI: 10.1186/s13059-024-03401-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 07/11/2023] [Accepted: 09/23/2024] [Indexed: 10/05/2024] [Open Access]
Abstract
Batch effects in omics data are notoriously common technical variations unrelated to study objectives, and may result in misleading outcomes if uncorrected, or hinder biomedical discovery if over-corrected. Assessing and mitigating batch effects is crucial for ensuring the reliability and reproducibility of omics data and minimizing the impact of technical variations on biological interpretation. In this review, we highlight the profound negative impact of batch effects and the urgent need to address this challenging problem in large-scale omics studies. We summarize potential sources of batch effects, current progress in evaluating and correcting them, and consortium efforts aiming to tackle them.
Affiliation(s)
- Ying Yu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Yuanbang Mai
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Yuanting Zheng
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Leming Shi
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
  - Cancer Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
  - International Human Phenome Institutes (Shanghai), Shanghai, China
9
Kohler D, Staniak M, Yu F, Nesvizhskii AI, Vitek O. An MSstats workflow for detecting differentially abundant proteins in large-scale data-independent acquisition mass spectrometry experiments with FragPipe processing. Nat Protoc 2024; 19:2915-2938. [PMID: 38769142 DOI: 10.1038/s41596-024-01000-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 03/11/2024] [Indexed: 05/22/2024]
Abstract
Technological advances in mass spectrometry and proteomics have made it possible to perform larger-scale and more-complex experiments. The volume and complexity of the resulting data create major challenges for downstream analysis. In particular, next-generation data-independent acquisition (DIA) experiments enable wider proteome coverage than more traditional targeted approaches but require computational workflows that can manage much larger datasets and identify peptide sequences from complex and overlapping spectral features. Data-processing tools such as FragPipe, DIA-NN and Spectronaut have undergone substantial improvements to process spectral features in a reasonable time. Statistical analysis tools are needed to draw meaningful comparisons between experimental samples, but these tools were also originally designed with smaller datasets in mind. This protocol describes an updated version of MSstats that has been adapted to be compatible with large-scale DIA experiments. A very large DIA experiment, processed with FragPipe, is used as an example to demonstrate different MSstats workflows. The choice of workflow depends on the user's computational resources. For datasets that are too large to fit into a standard computer's memory, we demonstrate the use of MSstatsBig, a companion R package to MSstats. The protocol also highlights key decisions that have a major effect on both the results and the processing time of the analysis. The MSstats processing can be expected to take 1-3 h depending on the usage of MSstatsBig. The protocol can be run in the point-and-click graphical user interface MSstatsShiny or implemented with minimal coding expertise in R.
Affiliation(s)
- Devon Kohler
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
  - Barnett Institute for Chemical and Biological Analysis, Northeastern University, Boston, MA, USA
- Fengchao Yu
  - Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Alexey I Nesvizhskii
  - Department of Pathology, University of Michigan, Ann Arbor, MI, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Olga Vitek
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
  - Barnett Institute for Chemical and Biological Analysis, Northeastern University, Boston, MA, USA
10
Ran P, Wang Y, Li K, He S, Tan S, Lv J, Zhu J, Tang S, Feng J, Qin Z, Li Y, Huang L, Yin Y, Zhu L, Yang W, Ding C. STAVER: a standardized benchmark dataset-based algorithm for effective variation reduction in large-scale DIA-MS data. Brief Bioinform 2024; 25:bbae553. [PMID: 39504480 PMCID: PMC11540132 DOI: 10.1093/bib/bbae553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2024] [Revised: 09/12/2024] [Accepted: 10/19/2024] [Indexed: 11/08/2024] [Open Access]
Abstract
Mass spectrometry (MS)-based proteomics has become instrumental in comprehensively investigating complex biological systems. Data-independent acquisition (DIA)-MS, utilizing hybrid spectral library search strategies, allows for the simultaneous quantification of thousands of proteins, showing promise in enhancing protein identification and quantification precision. However, low-quality profiles can considerably undermine quantitative precision, resulting in inaccurate protein quantification. To tackle this challenge, we introduced STAVER, a novel algorithm that leverages standardized benchmark datasets to reduce non-biological variation in large-scale DIA-MS analyses. By eliminating unwanted noise in MS signals, STAVER significantly improved protein quantification precision, especially in hybrid spectral library searches. Moreover, we validated STAVER's robustness and applicability across multiple large-scale DIA datasets, demonstrating significantly enhanced precision and reproducibility of protein quantification. STAVER offers an innovative and effective approach for enhancing the quality of large-scale DIA proteomic data, facilitating cross-platform and cross-laboratory comparative analyses. This advancement significantly enhances the consistency and reliability of findings in clinical research. The complete package is available at https://github.com/Ran485/STAVER.
Affiliation(s)
- Peng Ran
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
- Yunzhi Wang
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
- Kai Li
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
- Shiman He
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
- Subei Tan
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
- Jiacheng Lv
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
- Jiajun Zhu
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
- Shaoshuai Tang
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
- Jinwen Feng
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
- Zhaoyu Qin
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
- Yan Li
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
- Lin Huang
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
- Yanan Yin
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
- Lingli Zhu
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
- Wenjun Yang
- Department of Pediatric Orthopedics, Xinhua Hospital affiliated to Shanghai Jiao Tong University School of Medicine, No. 1665, Kongjiang Road, Yangpu District, Shanghai 200092, China
- Chen Ding
- Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China
- Departments of Cancer Research Institute, Affiliated Cancer Hospital of Xinjiang Medical University Xinjiang Key Laboratory of Translational Biomedical Engineering, Urumqi 830000, P. R. China
|
11
|
Rich JA, Fan Y, Chen Q, Meerzaman D, Stetler-Stevenson WG, Peeney D. Analysis of cancer cell line and tissue RNA sequencing data reveals an essential and dark matrisome. Matrix Biol Plus 2024; 23:100156. [PMID: 39049902 PMCID: PMC11267082 DOI: 10.1016/j.mbplus.2024.100156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Revised: 06/18/2024] [Accepted: 06/24/2024] [Indexed: 07/27/2024] Open
Abstract
Extracellular matrix remodeling is a hallmark of tissue development, homeostasis, and disease. The processes that mediate remodeling, and the consequences of such, are the topic of extensive focus in biomedical research. Cell culture methods represent a crucial tool utilized by those interested in matrisome function, the easiest of which are implemented with immortalized/cancer cell lines. These cell lines often form the foundations of a research proposal, or serve as vehicles of validation for other model systems. For these reasons, it is important to understand the complement of matrisome genes that are expressed when identifying appropriate cell culture models for hypothesis testing. To this end, we harvested bulk RNA sequencing data from the Cancer Cell Line Encyclopedia (CCLE) to assess matrisome gene expression in 1019 human cell lines. Our examination reveals that a large proportion of the matrisome is poorly represented in human cancer cell lines, with approximately 10% not expressed above threshold in any of the cell lines assayed. Conversely, we identify clusters of essential/common matrisome genes that are abundantly expressed in cell lines. To validate these observations against tissue data, we compared our findings with bulk RNA sequencing data from the Genotype-Tissue Expression (GTEx) portal and The Cancer Genome Atlas (TCGA) program. This comparison demonstrates general agreement between the "essential/common" and "dark/uncommon" matrisome across the three datasets, albeit with discordance observed in 59 matrisome genes between cell lines and tissues. Notably, all of the discordant genes are essential/common in tissues yet minimally expressed in cell lines, underscoring critical considerations for matrix biology researchers employing immortalized cell lines for their investigations.
Affiliation(s)
- Joshua A. Rich
- Extracellular Matrix Pathology Section, Laboratory of Pathology, National Cancer Institute, National Institute of Health, Bethesda, MD, United States
- Yu Fan
- Computational Genomics and Bioinformatics Branch, Center for Biomedical Informatics & Information Technology, National Cancer Institute, National Institute of Health, Rockville, MD, United States
- Qingrong Chen
- Computational Genomics and Bioinformatics Branch, Center for Biomedical Informatics & Information Technology, National Cancer Institute, National Institute of Health, Rockville, MD, United States
- Daoud Meerzaman
- Computational Genomics and Bioinformatics Branch, Center for Biomedical Informatics & Information Technology, National Cancer Institute, National Institute of Health, Rockville, MD, United States
- William G. Stetler-Stevenson
- Extracellular Matrix Pathology Section, Laboratory of Pathology, National Cancer Institute, National Institute of Health, Bethesda, MD, United States
- David Peeney
- Extracellular Matrix Pathology Section, Laboratory of Pathology, National Cancer Institute, National Institute of Health, Bethesda, MD, United States
|
12
|
Fröhlich K, Fahrner M, Brombacher E, Seredynska A, Maldacker M, Kreutz C, Schmidt A, Schilling O. Data-Independent Acquisition: A Milestone and Prospect in Clinical Mass Spectrometry-Based Proteomics. Mol Cell Proteomics 2024; 23:100800. [PMID: 38880244 PMCID: PMC11380018 DOI: 10.1016/j.mcpro.2024.100800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Revised: 06/08/2024] [Accepted: 06/13/2024] [Indexed: 06/18/2024] Open
Abstract
Data-independent acquisition (DIA) has revolutionized the field of mass spectrometry (MS)-based proteomics over the past few years. DIA stands out for its ability to systematically sample all peptides in a given m/z range, allowing an unbiased acquisition of proteomics data. This greatly mitigates the issue of missing values and significantly enhances quantitative accuracy, precision, and reproducibility compared to many traditional methods. This review focuses on the critical role of DIA analysis software tools, primarily focusing on their capabilities and the challenges they address in proteomic research. Advances in MS technology, such as trapped ion mobility spectrometry, or high field asymmetric waveform ion mobility spectrometry require sophisticated analysis software capable of handling the increased data complexity and exploiting the full potential of DIA. We identify and critically evaluate leading software tools in the DIA landscape, discussing their unique features, and the reliability of their quantitative and qualitative outputs. We present the biological and clinical relevance of DIA-MS and discuss crucial publications that paved the way for in-depth proteomic characterization in patient-derived specimens. Furthermore, we provide a perspective on emerging trends in clinical applications and present upcoming challenges including standardization and certification of MS-based acquisition strategies in molecular diagnostics. While we emphasize the need for continuous development of software tools to keep pace with evolving technologies, we advise researchers against uncritically accepting the results from DIA software tools. Each tool may have its own biases, and some may not be as sensitive or reliable as others. Our overarching recommendation for both researchers and clinicians is to employ multiple DIA analysis tools, utilizing orthogonal analysis approaches to enhance the robustness and reliability of their findings.
Affiliation(s)
- Klemens Fröhlich
- Proteomics Core Facility, Biozentrum Basel, University of Basel, Basel, Switzerland
- Matthias Fahrner
- Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), Freiburg, Germany
- Eva Brombacher
- Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center-University of Freiburg, Freiburg, Germany; Centre for Integrative Biological Signaling Studies (CIBSS), University of Freiburg, Freiburg, Germany; Spemann Graduate School of Biology and Medicine (SGBM), University of Freiburg, Freiburg, Germany; Faculty of Biology, University of Freiburg, Freiburg, Germany
- Adrianna Seredynska
- Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), Freiburg, Germany; Faculty of Biology, University of Freiburg, Freiburg, Germany
- Maximilian Maldacker
- Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; Faculty of Biology, University of Freiburg, Freiburg, Germany
- Clemens Kreutz
- Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center-University of Freiburg, Freiburg, Germany; Centre for Integrative Biological Signaling Studies (CIBSS), University of Freiburg, Freiburg, Germany
- Alexander Schmidt
- Proteomics Core Facility, Biozentrum Basel, University of Basel, Basel, Switzerland
- Oliver Schilling
- Institute for Surgical Pathology, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), Freiburg, Germany
|
13
|
Çelik MH, Gagneur J, Lim RG, Wu J, Thompson LM, Xie X. Identifying dysregulated regions in amyotrophic lateral sclerosis through chromatin accessibility outliers. HGG ADVANCES 2024; 5:100318. [PMID: 38872308 PMCID: PMC11260578 DOI: 10.1016/j.xhgg.2024.100318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Revised: 06/10/2024] [Accepted: 06/11/2024] [Indexed: 06/15/2024] Open
Abstract
The high heritability of amyotrophic lateral sclerosis (ALS) contrasts with its low molecular diagnosis rate post-genetic testing, pointing to potential undiscovered genetic factors. To aid the exploration of these factors, we introduced EpiOut, an algorithm to identify chromatin accessibility outliers that are regions exhibiting divergent accessibility from the population baseline in a single or few samples. Annotation of accessible regions with histone chromatin immunoprecipitation sequencing and Hi-C indicates that outliers are concentrated in functional loci, especially among promoters interacting with active enhancers. Across different omics levels, outliers are robustly replicated, and chromatin accessibility outliers are reliable predictors of gene expression outliers and aberrant protein levels. When promoter accessibility does not align with gene expression, our results indicate that molecular aberrations are more likely to be linked to post-transcriptional regulation rather than transcriptional regulation. Our findings demonstrate that the outlier detection paradigm can uncover dysregulated regions in rare diseases. EpiOut is available at github.com/uci-cbcl/EpiOut.
Affiliation(s)
- Muhammed Hasan Çelik
- Department of Computer Science, University of California Irvine, Irvine, CA, USA; Center for Complex Biological Systems, University of California Irvine, Irvine, CA, USA
- Julien Gagneur
- Department of Informatics, Technical University of Munich, Garching, Germany; Helmholtz Association - Munich School for Data Science (MUDS), Munich, Germany; Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany; Institute of Computational Biology, Helmholtz Center Munich, Neuherberg, Germany
- Ryan G Lim
- Institute for Memory Impairments and Neurological Disorders, University of California Irvine, Irvine, CA 92697, USA
- Jie Wu
- Department of Biological Chemistry, University of California Irvine, Irvine, CA, USA
- Leslie M Thompson
- Institute for Memory Impairments and Neurological Disorders, University of California Irvine, Irvine, CA 92697, USA; Department of Biological Chemistry, University of California Irvine, Irvine, CA, USA; UCI MIND, University of California Irvine, Irvine, CA, USA; Department of Psychiatry and Human Behavior and Sue and Bill Gross Stem Cell Center, University of California Irvine, Irvine, CA, USA; Department of Neurobiology and Behavior, University of California Irvine, Irvine, CA, USA
- Xiaohui Xie
- Department of Computer Science, University of California Irvine, Irvine, CA, USA
|
14
|
Bortel P, Hagn G, Skos L, Bileck A, Paulitschke V, Paulitschke P, Gleiter L, Mohr T, Gerner C, Meier-Menches SM. Memory effects of prior subculture may impact the quality of multiomic perturbation profiles. Proc Natl Acad Sci U S A 2024; 121:e2313851121. [PMID: 38976734 PMCID: PMC11260104 DOI: 10.1073/pnas.2313851121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Accepted: 06/03/2024] [Indexed: 07/10/2024] Open
Abstract
Mass spectrometry-based omics technologies are increasingly used in perturbation studies to map drug effects to biological pathways by identifying significant molecular events. Significance is influenced by fold change and variation of each molecular parameter, but also by multiple testing corrections. While the fold change is largely determined by the biological system, the variation is determined by experimental workflows. Here, it is shown that memory effects of prior subculture can influence the variation of perturbation profiles using the two colon carcinoma cell lines SW480 and HCT116. These memory effects are largely driven by differences in growth states that persist into the perturbation experiment. In SW480 cells, memory effects combined with moderate treatment effects amplify the variation in multiple omics levels, including eicosadomics, proteomics, and phosphoproteomics. With stronger treatment effects, the memory effect was less pronounced, as demonstrated in HCT116 cells. Subculture homogeneity was controlled by real-time monitoring of cell growth. Controlled homogeneous subculture resulted in a perturbation network of 321 causal conjectures based on combined proteomic and phosphoproteomic data, compared to only 58 causal conjectures without controlling subculture homogeneity in SW480 cells. Some cellular responses and regulatory events were identified that extend the mode of action of arsenic trioxide (ATO) only when accounting for these memory effects. Controlled prior subculture led to the finding of a synergistic combination treatment of ATO with the thioredoxin reductase 1 inhibitor auranofin, which may prove useful in the management of NRF2-mediated resistance mechanisms.
Affiliation(s)
- Patricia Bortel
- Department of Analytical Chemistry, Faculty of Chemistry, University of Vienna, Vienna 1090, Austria
- Vienna Doctoral School in Chemistry, University of Vienna, Vienna 1090, Austria
- Gerhard Hagn
- Department of Analytical Chemistry, Faculty of Chemistry, University of Vienna, Vienna 1090, Austria
- Vienna Doctoral School in Chemistry, University of Vienna, Vienna 1090, Austria
- Lukas Skos
- Department of Analytical Chemistry, Faculty of Chemistry, University of Vienna, Vienna 1090, Austria
- Vienna Doctoral School in Chemistry, University of Vienna, Vienna 1090, Austria
- Andrea Bileck
- Department of Analytical Chemistry, Faculty of Chemistry, University of Vienna, Vienna 1090, Austria
- Joint Metabolome Facility, University of Vienna and Medical University of Vienna, Vienna 1090, Austria
- Verena Paulitschke
- Department of Dermatology, Medical University of Vienna, Vienna 1090, Austria
- Philipp Paulitschke
- PHIO scientific GmbH, Munich 81371, Germany
- Faculty of Physics, Ludwig-Maximilians University of Munich, Munich 80539, Germany
- Thomas Mohr
- Department of Analytical Chemistry, Faculty of Chemistry, University of Vienna, Vienna 1090, Austria
- Center of Cancer Research, Department of Medicine I, Medical University of Vienna and Comprehensive Cancer Center, Vienna 1090, Austria
- Christopher Gerner
- Department of Analytical Chemistry, Faculty of Chemistry, University of Vienna, Vienna 1090, Austria
- Joint Metabolome Facility, University of Vienna and Medical University of Vienna, Vienna 1090, Austria
- Samuel M. Meier-Menches
- Department of Analytical Chemistry, Faculty of Chemistry, University of Vienna, Vienna 1090, Austria
- Joint Metabolome Facility, University of Vienna and Medical University of Vienna, Vienna 1090, Austria
- Institute of Inorganic Chemistry, Faculty of Chemistry, University of Vienna, Vienna 1090, Austria
|
15
|
Humphries EM, Xavier D, Ashman K, Hains PG, Robinson PJ. High-Throughput Proteomics and Phosphoproteomics of Rat Tissues Using Microflow Zeno SWATH. J Proteome Res 2024; 23:2355-2366. [PMID: 38819404 DOI: 10.1021/acs.jproteome.4c00010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
High-throughput tissue proteomics has great potential in the advancement of precision medicine. Here, we investigated the combined sensitivity of trap-elute microflow liquid chromatography with a ZenoTOF for DIA proteomics and phosphoproteomics. Method optimization was conducted on HEK293T cell lines to determine the optimal variable window size, MS2 accumulation time and gradient length. The ZenoTOF 7600 was then compared to the previous generation TripleTOF 6600 using eight rat organs, finding up to 23% more proteins using a fifth of the sample load and a third of the instrument time. Spectral reference libraries generated from Zeno SWATH data in FragPipe (MSFragger-DIA/DIA-NN) contained 4 times more fragment ions than the DIA-NN only library and quantified more proteins. Replicate single-shot phosphopeptide enrichments of 50-100 μg of rat tryptic peptide were analyzed by microflow HPLC using Zeno SWATH without fractionation. Using Spectronaut we quantified a shallow phosphoproteome containing 1000-3000 phosphoprecursors per organ. Promisingly, clear hierarchical clustering of organs was observed with high Pearson correlation coefficients >0.95 between replicate enrichments and median CV of 20%. The combined sensitivity of microflow HPLC with Zeno SWATH allows for the high-throughput quantitation of an extensive proteome and shallow phosphoproteome from small tissue samples.
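The two reproducibility figures quoted in this abstract (median CV and Pearson correlation between replicate enrichments) can be illustrated with a minimal sketch. The intensities and precursor names below are hypothetical and are not data from the study; the abstract does not state how the authors computed these values, so the sample standard deviation and the log2 transform before correlation are assumptions.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Return the CV in percent: 100 * sample standard deviation / mean."""
    return 100.0 * statistics.stdev(values) / statistics.fmean(values)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical intensities: one entry per phosphoprecursor,
# one value per replicate enrichment.
intensities = {
    "precursor_A": [100.0, 110.0, 90.0],
    "precursor_B": [2000.0, 2100.0, 1900.0],
    "precursor_C": [50.0, 65.0, 35.0],
}

# Per-precursor CV across replicates, summarized by the median.
cvs = {name: coefficient_of_variation(v) for name, v in intensities.items()}
median_cv = statistics.median(cvs.values())

# Correlation between replicates 1 and 2 on log2 intensities (assumed transform).
rep1 = [math.log2(v[0]) for v in intensities.values()]
rep2 = [math.log2(v[1]) for v in intensities.values()]
r = pearson(rep1, rep2)

print(f"median CV = {median_cv:.1f}%, Pearson r = {r:.3f}")
```

The median is used rather than the mean because a few noisy low-abundance precursors would otherwise dominate the summary.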
Affiliation(s)
- Erin M Humphries
- ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales 2145, Australia
- Dylan Xavier
- ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales 2145, Australia
- Keith Ashman
- Sciex, 96 Ricketts Road, Mount Waverley, Victoria 3149, Australia
- Peter G Hains
- ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales 2145, Australia
- Phillip J Robinson
- ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales 2145, Australia
|
16
|
Calado CRC. Bridging the gap between target-based and phenotypic-based drug discovery. Expert Opin Drug Discov 2024; 19:789-798. [PMID: 38747562 DOI: 10.1080/17460441.2024.2355330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 05/10/2024] [Indexed: 06/26/2024]
Abstract
INTRODUCTION: The unparalleled progress in science of the last decades has brought a better understanding of the molecular mechanisms of diseases. This promoted drug discovery processes based on a target approach. However, despite the high promises associated, a critical decrease in the number of first-in-class drugs has been observed. AREAS COVERED: This review analyses the challenges, advances, and opportunities associated with the main strategies of the drug discovery process, i.e. based on a rational target approach and on an empirical phenotypic approach. This review also evaluates how the gap between these two crossroads can be bridged toward a more efficient drug discovery process. EXPERT OPINION: The critical lack of knowledge of the complex biological networks is leading to targets not relevant for the clinical context or to drugs that present undesired adverse effects. The phenotypic systems designed by considering available molecular mechanisms can mitigate these knowledge gaps. Associated with the expansion of the chemical space and other technologies, these designs can lead to more efficient drug discoveries. Technological and scientific knowledge should also be applied to identify, as early as possible, both drug targets and mechanisms of action, leading to a more efficient drug discovery pipeline.
Affiliation(s)
- Cecília R C Calado
- ISEL-Instituto Superior de Engenharia de Lisboa, Instituto Politécnico de Lisboa, Lisboa, Portugal
- iBB - Institute for Bioengineering and Biosciences, i4HB - The Associate Laboratory Institute for Health and Bioeconomy, IST - Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
|
17
|
Webel H, Niu L, Nielsen AB, Locard-Paulet M, Mann M, Jensen LJ, Rasmussen S. Imputation of label-free quantitative mass spectrometry-based proteomics data using self-supervised deep learning. Nat Commun 2024; 15:5405. [PMID: 38926340 PMCID: PMC11208500 DOI: 10.1038/s41467-024-48711-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2023] [Accepted: 05/13/2024] [Indexed: 06/28/2024] Open
Abstract
Imputation techniques provide means to replace missing measurements with a value and are used in almost all downstream analysis of mass spectrometry (MS) based proteomics data using label-free quantification (LFQ). Here we demonstrate how collaborative filtering, denoising autoencoders, and variational autoencoders can impute missing values in the context of LFQ at different levels. We applied our method, proteomics imputation modeling mass spectrometry (PIMMS), to an alcohol-related liver disease (ALD) cohort with blood plasma proteomics data available for 358 individuals. Removing 20 percent of the intensities we were able to recover 15 out of 17 significant abundant protein groups using PIMMS-VAE imputations. When analyzing the full dataset we identified 30 additional proteins (+13.2%) that were significantly differentially abundant across disease stages compared to no imputation and found that some of these were predictive of ALD progression in machine learning models. We, therefore, suggest the use of deep learning approaches for imputing missing values in MS-based proteomics on larger datasets and provide workflows for these.
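The evaluation idea mentioned in this abstract (remove a fraction of observed intensities, impute them, and compare against the held-out truth) can be sketched as follows. This is not the PIMMS implementation: a per-feature median imputer stands in for the collaborative filtering and autoencoder models, and the matrix, function names, and error metric are hypothetical choices made for illustration.

```python
import random
import statistics


def mask_fraction(matrix, fraction, seed=0):
    """Hide a random fraction of the observed values; return the copy and hidden positions."""
    rng = random.Random(seed)
    observed = [(i, j) for i, row in enumerate(matrix)
                for j, value in enumerate(row) if value is not None]
    held_out = rng.sample(observed, round(fraction * len(observed)))
    masked = [row[:] for row in matrix]
    for i, j in held_out:
        masked[i][j] = None
    return masked, held_out


def impute_feature_median(matrix):
    """Fill each missing value with the median of its column (stand-in imputer)."""
    all_values = [v for row in matrix for v in row if v is not None]
    fallback = statistics.median(all_values)  # used only if a column is entirely missing
    medians = []
    for j in range(len(matrix[0])):
        column = [row[j] for row in matrix if row[j] is not None]
        medians.append(statistics.median(column) if column else fallback)
    return [[medians[j] if v is None else v for j, v in enumerate(row)]
            for row in matrix]


# Hypothetical log2 intensities: rows are samples, columns are protein groups.
data = [
    [20.1, 25.3, 18.0, 22.4],
    [20.4, 25.1, 18.3, 22.0],
    [19.8, 25.6, 17.9, 22.7],
    [20.0, 25.0, 18.1, 22.2],
    [20.3, 25.4, 18.2, 22.5],
]

masked, held_out = mask_fraction(data, fraction=0.20)
imputed = impute_feature_median(masked)

# Mean absolute error on the held-out entries only.
mae = statistics.fmean(abs(imputed[i][j] - data[i][j]) for i, j in held_out)
print(f"held out {len(held_out)} values, MAE = {mae:.3f}")
```

Scoring only the held-out positions keeps the comparison fair across imputers, since observed values pass through unchanged.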
Affiliation(s)
- Henry Webel
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
- Lili Niu
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
- Annelaura Bach Nielsen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
- Marie Locard-Paulet
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
- Institut de Pharmacologie et de Biologie Structurale (IPBS), Université de Toulouse, CNRS, Université Toulouse III - Paul Sabatier (UT3), Toulouse, France
- Matthias Mann
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Lars Juhl Jensen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
- Simon Rasmussen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
|
18
|
Padhye BD, Nawaz U, Hains PG, Reddel RR, Robinson PJ, Zhong Q, Poulos RC. Proteomic insights into paediatric cancer: Unravelling molecular signatures and therapeutic opportunities. Pediatr Blood Cancer 2024; 71:e30980. [PMID: 38556739 DOI: 10.1002/pbc.30980] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 03/13/2024] [Accepted: 03/14/2024] [Indexed: 04/02/2024]
Abstract
Survival rates in some paediatric cancers have improved greatly over recent decades, in part due to the identification of diagnostic, prognostic and predictive molecular signatures, and the development of risk-directed therapies. However, other paediatric cancers have proved difficult to treat, and there is an urgent need to identify novel biomarkers that reveal therapeutic opportunities. The proteome is the total set of expressed proteins present in a cell or tissue at a point in time, and is vastly more dynamic than the genome. Proteomics holds significant promise for cancer research, as proteins are ultimately responsible for cellular phenotype and are the target of most anticancer drugs. Here, we review the discoveries, opportunities and challenges of proteomic analyses in paediatric cancer, with a focus on mass spectrometry (MS)-based approaches. Accelerating incorporation of proteomics into paediatric precision medicine has the potential to improve survival and quality of life for children with cancer.
Affiliation(s)
- Bhavna D Padhye
- Cancer Centre for Children, The Children's Hospital at Westmead, Westmead, New South Wales, Australia
- Kids Research, Children's Cancer Research Unit, The Children's Hospital at Westmead, Westmead, New South Wales, Australia
- Urwah Nawaz
- ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia
- Peter G Hains
- ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia
- Roger R Reddel
- ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia
| | - Phillip J Robinson
- ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia
| | - Qing Zhong
- ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia
| | - Rebecca C Poulos
- ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia
| |
|
19
|
Pelletier SJ, Leclercq M, Roux-Dalvai F, de Geus MB, Leslie S, Wang W, Lam TT, Nairn AC, Arnold SE, Carlyle BC, Precioso F, Droit A. BERNN: Enhancing classification of Liquid Chromatography Mass Spectrometry data with batch effect removal neural networks. Nat Commun 2024; 15:3777. [PMID: 38710683 PMCID: PMC11074280 DOI: 10.1038/s41467-024-48177-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Accepted: 04/24/2024] [Indexed: 05/08/2024] Open
Abstract
Liquid Chromatography Mass Spectrometry (LC-MS) is a powerful method for profiling complex biological samples. However, batch effects typically arise from differences in sample processing protocols, experimental conditions, and data acquisition techniques, significantly impacting the interpretability of results. Correcting batch effects is crucial for the reproducibility of omics research, but current methods are not optimal for the removal of batch effects without compressing the genuine biological variation under study. We propose a suite of Batch Effect Removal Neural Networks (BERNN) to remove batch effects in large LC-MS experiments, with the goal of maximizing sample classification performance between conditions. More importantly, these models must efficiently generalize in batches not seen during training. A comparison of batch effect correction methods across five diverse datasets demonstrated that BERNN models consistently showed the strongest sample classification performance. However, the model producing the greatest classification improvements did not always perform best in terms of batch effect removal. Finally, we show that the overcorrection of batch effects resulted in the loss of some essential biological variability. These findings highlight the importance of balancing batch effect removal while preserving valuable biological diversity in large-scale LC-MS experiments.
Affiliation(s)
- Simon J Pelletier
  - Computational Biology Laboratory, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
- Mickaël Leclercq
  - Computational Biology Laboratory, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
- Florence Roux-Dalvai
  - Computational Biology Laboratory, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
  - Proteomics Platform, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
- Matthijs B de Geus
  - Massachusetts General Hospital Department of Neurology, Charlestown, MA, USA
  - Leiden University Medical Center, Leiden, The Netherlands
- Shannon Leslie
  - Yale Department of Psychiatry, New Haven, CT, USA
  - Janssen Pharmaceuticals, San Diego, CA, USA
- Weiwei Wang
  - Keck MS & Proteomics Resource, Yale School of Medicine, New Haven, CT, USA
- TuKiet T Lam
  - Keck MS & Proteomics Resource, Yale School of Medicine, New Haven, CT, USA
  - Yale School of Medicine, Department of Molecular Biophysics and Biochemistry, New Haven, CT, USA
- Steven E Arnold
  - Massachusetts General Hospital Department of Neurology, Charlestown, MA, USA
- Becky C Carlyle
  - Massachusetts General Hospital Department of Neurology, Charlestown, MA, USA
  - Oxford University Department of Physiology Anatomy and Genetics, Oxford, UK
  - Kavli Institute for Nanoscience Discovery, Oxford, UK
- Frédéric Precioso
  - Université Côte d'Azur, CNRS, INRIA, I3S, Sophia Antipolis, Nice, France
- Arnaud Droit
  - Computational Biology Laboratory, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
  - Proteomics Platform, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
|
20
|
Coorssen JR, Padula MP. Proteomics-The State of the Field: The Definition and Analysis of Proteomes Should Be Based in Reality, Not Convenience. Proteomes 2024; 12:14. [PMID: 38651373 PMCID: PMC11036260 DOI: 10.3390/proteomes12020014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2024] [Revised: 04/17/2024] [Accepted: 04/17/2024] [Indexed: 04/25/2024] Open
Abstract
With growing recognition and acknowledgement of the genuine complexity of proteomes, we are finally entering the post-proteogenomic era. Routine assessment of proteomes as inferred correlates of gene sequences (i.e., canonical 'proteins') cannot provide the necessary critical analysis of systems-level biology that is needed to understand underlying molecular mechanisms and pathways or identify the most selective biomarkers and therapeutic targets. These critical requirements demand the analysis of proteomes at the level of proteoforms/protein species, the actual active molecular players. Currently, only highly refined integrated or integrative top-down proteomics (iTDP) enables the analytical depth necessary to provide routine, comprehensive, and quantitative proteome assessments across the widest range of proteoforms inherent to native systems. Here we provide a broad perspective of the field, taking in historical and current realities, to establish a more balanced understanding of where the field has come from (in particular during the ten years since Proteomes was launched), current issues, and how things likely need to proceed if necessary deep proteome analyses are to succeed. We base this in our firm belief that the best proteomic analyses reflect, as closely as possible, the native sample at the moment of sampling. We also seek to emphasise that this and future analytical approaches are likely best based on the broad recognition and exploitation of the complementarity of currently successful approaches. This also emphasises the need to continuously evaluate and further optimize established approaches, to avoid complacency in thinking and expectations but also to promote the critical and careful development and introduction of new approaches, most notably those that address proteoforms. 
Above all, we wish to emphasise that a rigorous focus on analytical quality must override current thinking that largely values analytical speed; the latter would certainly be nice, if only proteoforms could thus be effectively, routinely, and quantitatively assessed. Alas, proteomes are composed of proteoforms, not molecular species that can be amplified or that directly mirror genes (i.e., 'canonical'). The problem is hard, and we must accept and address it as such, but the payoff in playing this longer game of rigorous deep proteome analyses is the promise of far more selective biomarkers, drug targets, and truly personalised or even individualised medicine.
Affiliation(s)
- Jens R. Coorssen
  - Department of Biological Sciences, Faculty of Mathematics and Science, Brock University, St. Catharines, ON L2S 3A1, Canada
  - Institute for Globally Distributed Open Research and Education (IGDORE), St. Catharines, ON L2N 4X2, Canada
- Matthew P. Padula
  - School of Life Sciences and Proteomics, Lipidomics and Metabolomics Core Facility, Faculty of Science, University of Technology Sydney, Sydney, NSW 2007, Australia
|
21
|
Xavier D, Lucas N, Williams SG, Koh JMS, Ashman K, Loudon C, Reddel R, Hains PG, Robinson PJ. Heat 'n Beat: A Universal High-Throughput End-to-End Proteomics Sample Processing Platform in under an Hour. Anal Chem 2024; 96:4093-4102. [PMID: 38427620 DOI: 10.1021/acs.analchem.3c04708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/03/2024]
Abstract
Proteomic analysis by mass spectrometry of small (≤2 mg) solid tissue samples from diverse formats requires high throughput and comprehensive proteome coverage. We developed a nearly universal, rapid, and robust protocol for sample preparation, suitable for high-throughput projects that encompass most cell or tissue types. This end-to-end workflow extends from original sample to loading the mass spectrometer and is centered on a one-tube homogenization and digestion method called Heat 'n Beat (HnB). It is applicable to most tissues, regardless of how they were fixed or embedded. Sample preparation was divided into separate challenges. The initial sample washing and final peptide cleanup steps were adapted to three tissue sources: fresh frozen (FF), optimal cutting temperature (OCT) compound embedded (FF-OCT), and formalin-fixed paraffin embedded (FFPE). Third, for core processing, tissue disruption and lysis were decreased to a 7 min heat and homogenization treatment, and reduction, alkylation, and proteolysis were optimized into a single step. The refinements produced nearly doubled peptide yield when compared to our earlier method ABLE, delivered a consistently high digestion efficiency of 85-90%, reported by ProteinPilot, and required only 38 min for core processing in a single tube, with the total processing time being 53-63 min. The robustness of HnB was demonstrated on six organ types, a cell line, and a cancer biopsy. Its suitability for high-throughput applications was demonstrated on a set of 1171 FF-OCT human cancer biopsies, which were processed for end-to-end completion in 92 h, producing highly consistent peptide yield and quality for over 3513 MS runs.
Affiliation(s)
- Dylan Xavier
  - ProCan, Faculty of Medicine and Health, The University of Sydney, Children's Medical Research Institute, Westmead, NSW 2145, Australia
- Natasha Lucas
  - ProCan, Faculty of Medicine and Health, The University of Sydney, Children's Medical Research Institute, Westmead, NSW 2145, Australia
- Steven G Williams
  - ProCan, Faculty of Medicine and Health, The University of Sydney, Children's Medical Research Institute, Westmead, NSW 2145, Australia
- Jennifer M S Koh
  - ProCan, Faculty of Medicine and Health, The University of Sydney, Children's Medical Research Institute, Westmead, NSW 2145, Australia
- Keith Ashman
  - ProCan, Faculty of Medicine and Health, The University of Sydney, Children's Medical Research Institute, Westmead, NSW 2145, Australia
- Clare Loudon
  - ProCan, Faculty of Medicine and Health, The University of Sydney, Children's Medical Research Institute, Westmead, NSW 2145, Australia
- Roger Reddel
  - ProCan, Faculty of Medicine and Health, The University of Sydney, Children's Medical Research Institute, Westmead, NSW 2145, Australia
- Peter G Hains
  - ProCan, Faculty of Medicine and Health, The University of Sydney, Children's Medical Research Institute, Westmead, NSW 2145, Australia
- Phillip J Robinson
  - ProCan, Faculty of Medicine and Health, The University of Sydney, Children's Medical Research Institute, Westmead, NSW 2145, Australia
|
22
|
Yang Y, Mirzaei G. Performance analysis of data resampling on class imbalance and classification techniques on multi-omics data for cancer classification. PLoS One 2024; 19:e0293607. [PMID: 38422094 PMCID: PMC10903850 DOI: 10.1371/journal.pone.0293607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Accepted: 10/17/2023] [Indexed: 03/02/2024] Open
Abstract
Cancer, in any of its forms, remains a significant public health concern worldwide. Advances in early detection and treatment could lead to a decline in the overall death rate from cancer in recent decades. Therefore, tumor prediction and classification play an important role in fighting cancer. This study built computational models for a joint analysis of RNA seq, copy number variation (CNV), and DNA methylation to classify normal and tumor samples across liver cancer, breast cancer, and colon adenocarcinoma from The Cancer Genome Atlas (TCGA) dataset. A total of 18 machine learning methods were evaluated based on the AUC, precision, recall, and F-measure. In addition, five techniques were compared to ameliorate problems of class imbalance in the cancer datasets. Synthetic Minority Oversampling Technique (SMOTE) demonstrated the best performance. The results indicate that the model applying Stochastic Gradient Descent (SGD) for learning a binary-class SVM with hinge loss has the highest classification results on the liver cancer and breast cancer datasets, with accuracy over 99% and AUC greater than or equal to 0.999. For the colon adenocarcinoma dataset, both SGD and Sequential Minimal Optimization (SMO), which implements John Platt's sequential minimal optimization algorithm for training a support vector machine, show outstanding classification performance, with accuracy of 100% and AUC, precision, recall, and F-measure all at 1.000.
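As an illustration only, the two techniques named in this abstract (SMOTE-style oversampling of the minority class, followed by a linear SVM trained by SGD on the hinge loss) can be sketched in plain Python. This is a minimal sketch on invented two-dimensional toy data, not the authors' pipeline: the function names `smote`, `train_sgd_svm`, and `predict`, the hyperparameters, and the data are all assumptions made for the example.

```python
import random


def smote(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )
        neighbour = rng.choice(others[:k])
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + t * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def train_sgd_svm(X, y, epochs=200, lr=0.01, lam=0.01, rng=None):
    """Linear SVM trained by SGD on the L2-regularised hinge loss.
    Labels must be +1 or -1."""
    rng = rng or random.Random(0)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            # Shrink the weights on every step (gradient of the L2 penalty).
            w = [wj * (1 - lr * lam) for wj in w]
            if margin < 1:  # hinge loss is active for this sample
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical imbalanced toy data: 40 "normal" points versus 5 "tumor" points.
rng = random.Random(42)
majority = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(40)]
minority = [[rng.gauss(3, 0.5), rng.gauss(3, 0.5)] for _ in range(5)]

# Oversample the minority class until both classes have the same size.
minority_balanced = minority + smote(minority, n_new=35, rng=rng)
X = majority + minority_balanced
y = [-1] * len(majority) + [1] * len(minority_balanced)

w, b = train_sgd_svm(X, y, rng=rng)
print(predict(w, b, [3.0, 3.0]), predict(w, b, [0.0, 0.0]))
```

In practice one would more likely reach for an established implementation than hand-rolled code; the sketch is only meant to show that oversampling happens before training and that the hinge loss updates the weights only for samples inside the margin.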
Affiliation(s)
- Yuting Yang
  - Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio, United States of America
- Golrokh Mirzaei
  - Department of Computer Science and Engineering, The Ohio State University, Marion, Ohio, United States of America
|
23
|
Henke AN, Chilukuri S, Langan LM, Brooks BW. Reporting and reproducibility: Proteomics of fish models in environmental toxicology and ecotoxicology. Sci Total Environ 2024; 912:168455. [PMID: 37979845 DOI: 10.1016/j.scitotenv.2023.168455] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/05/2023] [Revised: 11/06/2023] [Accepted: 11/07/2023] [Indexed: 11/20/2023]
Abstract
Environmental toxicology and ecotoxicology research efforts are employing proteomics with fish models as New Approach Methodologies, along with in silico, in vitro and other omics techniques to elucidate hazards of toxicants and toxins. We performed a critical review of toxicology studies with fish models using proteomics and reported fundamental parameters across experimental design, sample preparation, mass spectrometry, and bioinformatics of fish, which represent alternative vertebrate models in environmental toxicology, and routinely studied animals in ecotoxicology. We observed inconsistencies in reporting and methodologies among experimental designs, sample preparations, data acquisitions and bioinformatics, which can affect reproducibility of experimental results. We identified a distinct need to develop reporting guidelines for proteomics use in environmental toxicology and ecotoxicology, increased QA/QC throughout studies, and method optimization with an emphasis on reducing inconsistencies among studies. Several recommendations are offered as logical steps to advance development and application of this emerging research area to understand chemical hazards to public health and the environment.
Affiliation(s)
- Abigail N Henke
  - Department of Biology, Baylor University Waco, TX, USA; Center for Reservoir and Aquatic Systems Research (CRASR), Baylor University Waco, TX, USA
- Laura M Langan
  - Department of Environmental Science, Baylor University Waco, TX, USA; Center for Reservoir and Aquatic Systems Research (CRASR), Baylor University Waco, TX, USA
- Bryan W Brooks
  - Department of Environmental Science, Baylor University Waco, TX, USA; Center for Reservoir and Aquatic Systems Research (CRASR), Baylor University Waco, TX, USA
|
24
|
Hartley B, Bassiouni W, Roczkowsky A, Fahlman R, Schulz R, Julien O. Data-Independent Acquisition Proteomics and N-Terminomics Methods Reveal Alterations in Mitochondrial Function and Metabolism in Ischemic-Reperfused Hearts. J Proteome Res 2024; 23:844-856. [PMID: 38264990 PMCID: PMC10846531 DOI: 10.1021/acs.jproteome.3c00754] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/11/2023] [Revised: 01/06/2024] [Accepted: 01/10/2024] [Indexed: 01/25/2024]
Abstract
Myocardial ischemia-reperfusion (IR) (stunning) injury triggers changes in the proteome and degradome of the heart. Here, we utilize quantitative proteomics and comprehensive degradomics to investigate the molecular mechanisms of IR injury in isolated rat hearts. The control group underwent aerobic perfusion, while the IR injury group underwent 20 min of ischemia and 30 min of reperfusion to induce a stunning injury. As MMP-2 activation has been shown to contribute to myocardial injury, hearts also underwent IR injury with ARP-100, an MMP-2-preferring inhibitor, to dissect the contribution of MMP-2 to IR injury. Using data-independent acquisition (DIA) and mass spectrometry, we quantified 4468 proteins in ventricular extracts, whereby 447 proteins showed significant alterations among the three groups. We then used subtiligase-mediated N-terminomic labeling to identify more than a hundred specific cleavage sites. Among these protease substrates, 15 were identified following IR injury. We identified alterations in numerous proteins involved in mitochondrial function and metabolism following IR injury. Our findings provide valuable insights into the biochemical mechanisms of myocardial IR injury, suggesting alterations in reactive oxygen/nitrogen species handling and generation, fatty acid metabolism, mitochondrial function and metabolism, and cardiomyocyte contraction.
Affiliation(s)
- Bridgette Hartley
  - Department of Biochemistry, University of Alberta, Edmonton T6G 2H7, Canada
- Wesam Bassiouni
  - Department of Pharmacology, University of Alberta, Edmonton T6G 2S2, Canada
- Andrej Roczkowsky
  - Department of Pharmacology, University of Alberta, Edmonton T6G 2S2, Canada
- Richard Fahlman
  - Department of Biochemistry, University of Alberta, Edmonton T6G 2H7, Canada
- Richard Schulz
  - Department of Pharmacology, University of Alberta, Edmonton T6G 2S2, Canada
  - Department of Pediatrics, University of Alberta, Edmonton T6G 2S2, Canada
- Olivier Julien
  - Department of Biochemistry, University of Alberta, Edmonton T6G 2H7, Canada
|
25
|
Zhong Q, Sun R, Aref AT, Noor Z, Anees A, Zhu Y, Lucas N, Poulos RC, Lyu M, Zhu T, Chen GB, Wang Y, Ding X, Rutishauser D, Rupp NJ, Rueschoff JH, Poyet C, Hermanns T, Fankhauser C, Rodríguez Martínez M, Shao W, Buljan M, Neumann JF, Beyer A, Hains PG, Reddel RR, Robinson PJ, Aebersold R, Guo T, Wild PJ. Proteomic-based stratification of intermediate-risk prostate cancer patients. Life Sci Alliance 2024; 7:e202302146. [PMID: 38052461 PMCID: PMC10698198 DOI: 10.26508/lsa.202302146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 11/22/2023] [Accepted: 11/23/2023] [Indexed: 12/07/2023] Open
Abstract
Gleason grading is an important prognostic indicator for prostate adenocarcinoma and is crucial for patient treatment decisions. However, intermediate-risk patients diagnosed in the Gleason grade group (GG) 2 and GG3 can harbour either aggressive or non-aggressive disease, resulting in under- or overtreatment of a significant number of patients. Here, we performed proteomic, differential expression, machine learning, and survival analyses for 1,348 matched tumour and benign sample runs from 278 patients. Three proteins (F5, TMEM126B, and EARS2) were identified as candidate biomarkers in patients with biochemical recurrence. Multivariate Cox regression yielded 18 proteins, from which a risk score was constructed to dichotomize prostate cancer patients into low- and high-risk groups. This 18-protein signature is prognostic for the risk of biochemical recurrence and completely independent of the intermediate GG. Our results suggest that markers generated by computational proteomic profiling have the potential for clinical applications including integration into prostate cancer management.
Affiliation(s)
- Qing Zhong
  - ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, Australia
- Rui Sun
  - iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, China
  - Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, China
- Adel T Aref
  - ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, Australia
- Zainab Noor
  - ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, Australia
- Asim Anees
  - ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, Australia
- Yi Zhu
  - iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, China
  - Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, China
- Natasha Lucas
  - ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, Australia
- Rebecca C Poulos
  - ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, Australia
- Mengge Lyu
  - iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, China
  - Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, China
- Tiansheng Zhu
  - iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, China
  - Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, China
- Guo-Bo Chen
  - Urology & Nephrology Center, Department of Urology, Clinical Research Institute, Zhejiang Provincial People's Hospital, People's Hospital of Hangzhou Medical College, Hangzhou, China
- Yingrui Wang
  - iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, China
  - Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, China
- Xuan Ding
  - iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, China
  - Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, China
- Dorothea Rutishauser
  - Department of Pathology and Molecular Pathology, University Hospital Zürich, Zürich, Switzerland
- Niels J Rupp
  - Department of Pathology and Molecular Pathology, University Hospital Zürich, Zürich, Switzerland
- Jan H Rueschoff
  - Department of Pathology and Molecular Pathology, University Hospital Zürich, Zürich, Switzerland
- Cédric Poyet
  - Department of Urology, University Hospital Zürich, Zürich, Switzerland
- Thomas Hermanns
  - Department of Urology, University Hospital Zürich, Zürich, Switzerland
- Christian Fankhauser
  - Department of Urology, University Hospital Zürich, Zürich, Switzerland
  - Department of Urology, Cantonal Hospital Lucerne, Lucerne, Switzerland
- Wenguang Shao
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Marija Buljan
  - Empa - Swiss Federal Laboratories for Materials Science and Technology, St. Gallen, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Peter G Hains
  - ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, Australia
- Roger R Reddel
  - ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, Australia
- Phillip J Robinson
  - ProCan, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, Australia
- Ruedi Aebersold
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland
  - Faculty of Science, University of Zürich, Zürich, Switzerland
- Tiannan Guo
  - iMarker Lab, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, China
  - Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, China
- Peter J Wild
  - Goethe University Frankfurt, Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
  - Frankfurt Institute for Advanced Studies, Frankfurt am Main, Germany
|
26
|
Low RRJ, Fung KY, Dagley LF, Yousef J, Emery-Corbin SJ, Putoczki TL. Unbiased Quantitative Proteomics of Organoid Models of Pancreatic Cancer. Methods Mol Biol 2024; 2823:77-93. [PMID: 39052215 DOI: 10.1007/978-1-0716-3922-1_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/27/2024]
Abstract
Pancreatic ductal adenocarcinoma (PDAC) is a lethal solid malignancy with many patients succumbing to the disease within 6 months of diagnosis. The mechanisms that underlie PDAC initiation and progression are poorly understood. Current treatment options are primarily limited to chemotherapy, which is often provided with palliative intent. Unfortunately, there are no robust biomarkers to guide treatment selection or monitor treatment response. This is concerning given the increasing incidence of this cancer. We and others have generated organoid models to explore the biology underlying PDAC with the goal of identifying new therapeutic targets. Here we provide protocols to generate a preclinical PDAC organoid model and methods to use these to define the proteomic landscape of this cancer.
Affiliation(s)
- Ronnie Ren Jie Low
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
  - Currently at the DSB Repair Metabolism Laboratory, The Francis Crick Institute, London, UK
- Ka Yee Fung
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
- Laura F Dagley
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
- Jumana Yousef
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
- Samantha J Emery-Corbin
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
- Tracy L Putoczki
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
|
27
|
Bichmann L, Gupta S, Röst H. Data-Independent Acquisition Peptidomics. Methods Mol Biol 2024; 2758:77-88. [PMID: 38549009 DOI: 10.1007/978-1-0716-3646-6_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2024]
Abstract
In recent years, data-independent acquisition (DIA) has emerged as a powerful analysis method in biological mass spectrometry (MS). Compared to the previously predominant data-dependent acquisition (DDA), it offers a way to achieve greater reproducibility, sensitivity, and dynamic range in MS measurements. To make DIA accessible to non-expert users, a multifunctional, automated high-throughput pipeline DIAproteomics was implemented in the computational workflow framework "Nextflow" ( https://nextflow.io ). This allows high-throughput processing of proteomics and peptidomics DIA datasets on diverse computing infrastructures. This chapter provides a short summary and usage protocol guide for the most important modes of operation of this pipeline regarding the analysis of peptidomics datasets using the command line. In brief, DIAproteomics is a wrapper around the OpenSwathWorkflow and relies on either existing or ad-hoc generated spectral libraries from matching DDA runs. The OpenSwathWorkflow extracts chromatograms from the DIA runs and performs chromatographic peak-picking. Further downstream of the pipeline, these peaks are scored, aligned, and statistically evaluated for qualitative and quantitative differences across conditions depending on the user's interest. DIAproteomics is open-source and available under a permissive license. We encourage the scientific community to use or modify the pipeline to meet their specific requirements.
Affiliation(s)
- Leon Bichmann
  - Department of Computer Science, Applied Bioinformatics, University of Tübingen, Tübingen, Germany
- Shubham Gupta
  - Donnelly Center for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Hannes Röst
  - Donnelly Center for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
|
28
|
Abid MSR, Qiu H, Checco JW. Label-Free Quantitation of Endogenous Peptides. Methods Mol Biol 2024; 2758:125-150. [PMID: 38549012 PMCID: PMC11027169 DOI: 10.1007/978-1-0716-3646-6_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2024]
Abstract
Liquid chromatography-mass spectrometry (LC-MS)-based peptidomics methods allow for the detection and identification of many peptides in a complex biological mixture in an untargeted manner. Quantitative peptidomics approaches allow for comparisons of peptide abundance between different samples, allowing one to draw conclusions about peptide differences as a function of experimental treatment or physiology. While stable isotope labeling is a powerful approach for quantitative proteomics and peptidomics, advances in mass spectrometry instrumentation and analysis tools have allowed label-free methods to gain popularity in recent years. In a general label-free quantitative peptidomics experiment, peak intensity information for each peptide is compared across multiple LC-MS runs. Here, we outline a general approach for label-free quantitative peptidomics experiments, including steps for sample preparation, LC-MS data acquisition, data processing, and statistical analysis. Special attention is paid to address run-to-run variability, which can lead to several major problems in label-free experiments. Overall, our method provides researchers with a framework for the development of their own quantitative peptidomics workflows applicable to quantitation of peptides from a wide variety of different biological sources.
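As an illustration only, the comparison of peak intensities across LC-MS runs described in this abstract is often preceded by a normalization step that absorbs run-to-run variability. The sketch below uses median normalization on log2 intensities, which is one common choice and an assumption here, not necessarily the procedure in the chapter. The function name `median_normalize`, the run names, and the intensity values are all hypothetical.

```python
import math
from statistics import median


def median_normalize(runs):
    """Shift log2 peak intensities so that every run shares the same median.

    runs maps a run name to a dict of peptide -> raw peak intensity.
    Non-positive intensities are dropped because log2 is undefined for them.
    """
    log_runs = {
        run: {pep: math.log2(v) for pep, v in peaks.items() if v > 0}
        for run, peaks in runs.items()
    }
    run_medians = {run: median(vals.values()) for run, vals in log_runs.items()}
    # Use the median of the per-run medians as the common reference level.
    target = median(run_medians.values())
    return {
        run: {pep: v - run_medians[run] + target for pep, v in vals.items()}
        for run, vals in log_runs.items()
    }


# Hypothetical data: run_B carries twice the signal of run_A for every peptide,
# as could happen with a loading or spray-stability difference.
runs = {
    "run_A": {"pep1": 100.0, "pep2": 400.0, "pep3": 1600.0},
    "run_B": {"pep1": 200.0, "pep2": 800.0, "pep3": 3200.0},
}
normalized = median_normalize(runs)
for run, peaks in normalized.items():
    print(run, {pep: round(v, 3) for pep, v in peaks.items()})
```

Because the offset between the two toy runs is purely systematic, the normalized values agree across runs. With real data, only the shared shift is removed and genuine peptide-level differences remain for statistical testing.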
Affiliation(s)
- Haowen Qiu
- Center for Biotechnology, University of Nebraska-Lincoln, Lincoln, NE, USA
- The Nebraska Center for Integrated Biomolecular Communication (NCIBC), University of Nebraska-Lincoln, Lincoln, NE, USA
- James W Checco
- Department of Chemistry, University of Nebraska-Lincoln, Lincoln, NE, USA.
- The Nebraska Center for Integrated Biomolecular Communication (NCIBC), University of Nebraska-Lincoln, Lincoln, NE, USA.
|
29
|
Burnie J, Fernandes C, Chaphekar D, Wei D, Ahmed S, Persaud AT, Khader N, Cicala C, Arthos J, Tang VA, Guzzo C. Identification of CD38, CD97, and CD278 on the HIV surface using a novel flow virometry screening assay. Sci Rep 2023; 13:23025. [PMID: 38155248 PMCID: PMC10754950 DOI: 10.1038/s41598-023-50365-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/29/2023] [Accepted: 12/19/2023] [Indexed: 12/30/2023] Open
Abstract
While numerous cellular proteins in the HIV envelope are known to alter virus infection, methodology to rapidly phenotype the virion surface in a high throughput, single virion manner is lacking. Thus, many human proteins may exist on the virion surface that remain undescribed. Herein, we developed a novel flow virometry screening assay to discover new proteins on the surface of HIV particles. By screening a CD4+ T cell line and its progeny virions, along with four HIV isolates produced in primary cells, we discovered 59 new candidate proteins in the HIV envelope that were consistently detected across diverse HIV isolates. Among these discoveries, CD38, CD97, and CD278 were consistently present at high levels on virions when using orthogonal techniques to corroborate flow virometry results. This study yields new discoveries about virus biology and demonstrates the utility and feasibility of a novel flow virometry assay to phenotype individual virions.
Affiliation(s)
- Jonathan Burnie
- Department of Biological Sciences, University of Toronto Scarborough, 1265 Military Trail, Toronto, ON, Canada
- Department of Cell and Systems Biology, University of Toronto, 25 Harbord Street, Toronto, ON, Canada
- Claire Fernandes
- Department of Biological Sciences, University of Toronto Scarborough, 1265 Military Trail, Toronto, ON, Canada
- Department of Cell and Systems Biology, University of Toronto, 25 Harbord Street, Toronto, ON, Canada
- Deepa Chaphekar
- Department of Biological Sciences, University of Toronto Scarborough, 1265 Military Trail, Toronto, ON, Canada
- Department of Cell and Systems Biology, University of Toronto, 25 Harbord Street, Toronto, ON, Canada
- Danlan Wei
- Laboratory of Immunoregulation, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
- Shubeen Ahmed
- Department of Biological Sciences, University of Toronto Scarborough, 1265 Military Trail, Toronto, ON, Canada
- Arvin Tejnarine Persaud
- Department of Biological Sciences, University of Toronto Scarborough, 1265 Military Trail, Toronto, ON, Canada
- Department of Cell and Systems Biology, University of Toronto, 25 Harbord Street, Toronto, ON, Canada
- Nawrah Khader
- Department of Cell and Systems Biology, University of Toronto, 25 Harbord Street, Toronto, ON, Canada
- Claudia Cicala
- Laboratory of Immunoregulation, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
- James Arthos
- Laboratory of Immunoregulation, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
- Vera A Tang
- Flow Cytometry and Virometry Core Facility, Department of Biochemistry, Microbiology, and Immunology, Faculty of Medicine, University of Ottawa, 451 Smyth Road, Ottawa, ON, Canada
- Christina Guzzo
- Department of Biological Sciences, University of Toronto Scarborough, 1265 Military Trail, Toronto, ON, Canada.
- Department of Cell and Systems Biology, University of Toronto, 25 Harbord Street, Toronto, ON, Canada.
|
30
|
Kitata RB, Yang JC, Chen YJ. Advances in data-independent acquisition mass spectrometry towards comprehensive digital proteome landscape. Mass Spectrom Rev 2023; 42:2324-2348. [PMID: 35645145 DOI: 10.1002/mas.21781] [Citation(s) in RCA: 69] [Impact Index Per Article: 34.5] [Received: 10/09/2021] [Revised: 12/17/2021] [Accepted: 01/21/2022] [Indexed: 06/15/2023]
Abstract
Data-independent acquisition mass spectrometry (DIA-MS) has rapidly evolved as a powerful alternative for highly reproducible proteome profiling, with the unique strength of generating permanent digital maps for retrospective analysis of biological systems. Recent advancements in data analysis software tools for the complex DIA-MS/MS spectra, coupled to fast MS scanning speed and high mass accuracy, have greatly expanded the sensitivity and coverage of DIA-based proteomics profiling. Here, we review the evolution of DIA-MS techniques, from the earlier proof-of-principle of parallel fragmentation of all ions or ions in a selected m/z range and the sequential window acquisition of all theoretical mass spectra (SWATH-MS) to the latest innovations, recent developments in computational algorithms for data informatics, and auxiliary tools and advanced instrumentation that enhance the performance of DIA-MS. We further summarize recent applications of DIA-MS, as well as experimentally derived and in silico spectral library resources for large-scale profiling, to facilitate biomarker discovery and drug development in human diseases, with emphasis on proteomic profiling coverage. Toward next-generation DIA-MS for clinical proteomics, we outline the challenges in processing multidimensional DIA data sets and large-scale clinical proteomics, and the continuing need for higher profiling coverage and sensitivity.
Affiliation(s)
- Jhih-Ci Yang
- Institute of Chemistry, Academia Sinica, Taipei, Taiwan
- Sustainable Chemical Science and Technology, Taiwan International Graduate Program, Academia Sinica and National Yang Ming Chiao Tung University, Taipei, Taiwan
- Department of Applied Chemistry, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Yu-Ju Chen
- Institute of Chemistry, Academia Sinica, Taipei, Taiwan
- Sustainable Chemical Science and Technology, Taiwan International Graduate Program, Academia Sinica and National Yang Ming Chiao Tung University, Taipei, Taiwan
- Department of Chemistry, National Taiwan University, Taipei, Taiwan
|
31
|
Hay BN, Akinlaja MO, Baker TC, Houfani AA, Stacey RG, Foster LJ. Integration of data-independent acquisition (DIA) with co-fractionation mass spectrometry (CF-MS) to enhance interactome mapping capabilities. Proteomics 2023; 23:e2200278. [PMID: 37144656 DOI: 10.1002/pmic.202200278] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/28/2022] [Revised: 04/03/2023] [Accepted: 04/14/2023] [Indexed: 05/06/2023]
Abstract
Proteomics technologies are continually advancing, providing opportunities to develop stronger and more robust protein interaction networks (PINs). In part, this is due to the ever-growing number of high-throughput proteomics methods that are available. This review discusses how data-independent acquisition (DIA) and co-fractionation mass spectrometry (CF-MS) can be integrated to enhance interactome mapping abilities. Furthermore, integrating these two techniques can improve data quality and network generation through extended protein coverage, less missing data, and reduced noise. CF-DIA-MS shows promise in expanding our knowledge of interactomes, notably for non-model organisms (NMOs). CF-MS is a valuable technique on its own, but upon the integration of DIA, the potential to develop robust PINs increases, offering a unique approach for researchers to gain an in-depth understanding into the dynamics of numerous biological processes.
Affiliation(s)
- Brenna N Hay
- Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Mopelola O Akinlaja
- Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Teesha C Baker
- Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Aicha Asma Houfani
- Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- R Greg Stacey
- Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Leonard J Foster
- Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
|
32
|
Gupta S, Sing JC, Röst HL. Achieving quantitative reproducibility in label-free multisite DIA experiments through multirun alignment. Commun Biol 2023; 6:1101. [PMID: 37903988 PMCID: PMC10616189 DOI: 10.1038/s42003-023-05437-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/28/2023] [Accepted: 10/10/2023] [Indexed: 11/01/2023] Open
Abstract
DIA is a mainstream method for quantitative proteomics, but consistent quantification across multiple LC-MS/MS instruments remains a bottleneck in parallelizing data acquisition. One reason for this inconsistency and missing quantification is the retention time shift, which current software does not adequately address for runs from multiple sites. We present multirun chromatogram alignment strategies to map peaks across columns, including the traditional reference-based Star method and two novel approaches: MST and Progressive alignment. These reference-free strategies produce a quantitatively accurate data matrix, even from heterogeneous multi-column studies. Progressive alignment also generates merged chromatograms from all runs, which has not been previously achieved for LC-MS/MS data. First, we demonstrate the effectiveness of multirun alignment strategies on a gold-standard annotated dataset, resulting in a threefold reduction in quantitation error rate compared to non-aligned DIA results. Subsequently, we show on a multi-species dataset that DIAlignR effectively controls the quantitative error rate, improves precision in protein measurements, and exhibits conservative peak alignment. We next show that the MST alignment reduces cross-site CV by 50% for highly abundant proteins when applied to a dataset from 11 different LC-MS/MS setups. Finally, the reanalysis of 949 plasma runs with multirun alignment revealed a more than 50% increase in insulin resistance (IR) and respiratory viral infection (RVI) proteins, identifying 11 and 13 proteins respectively, compared to prior analysis without it. The three strategies are implemented in our DIAlignR workflow (>2.3) and can be combined with linear, non-linear, or hybrid pairwise alignment.
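The reference-free MST strategy named in this abstract can be pictured as a minimum spanning tree over runs, so that each run is aligned only to a similar neighbour rather than to one fixed reference. The sketch below is illustrative only and is not the DIAlignR implementation: the run names, the dissimilarity values, and the function `minimum_spanning_tree` are all assumptions made for the example.

```python
import heapq

def minimum_spanning_tree(dist):
    """Prim's algorithm on a symmetric distance matrix.

    Returns the tree edges as (from_index, to_index, distance) tuples,
    in the order in which runs are attached to the tree.
    """
    n = len(dist)
    if n == 0:
        return []
    visited = {0}
    heap = [(dist[0][j], 0, j) for j in range(1, n)]
    heapq.heapify(heap)
    edges = []
    while heap and len(visited) < n:
        d, i, j = heapq.heappop(heap)
        if j in visited:
            continue  # a cheaper edge already attached this run
        visited.add(j)
        edges.append((i, j, d))
        for k in range(n):
            if k not in visited:
                heapq.heappush(heap, (dist[j][k], j, k))
    return edges

# Hypothetical run labels and pairwise dissimilarities
# (for example 1 - retention-time correlation); not data from the paper.
runs = ["siteA_run1", "siteA_run2", "siteB_run1", "siteB_run2"]
dist = [
    [0.0, 0.1, 0.6, 0.7],
    [0.1, 0.0, 0.5, 0.8],
    [0.6, 0.5, 0.0, 0.2],
    [0.7, 0.8, 0.2, 0.0],
]

tree = minimum_spanning_tree(dist)
for i, j, d in tree:
    print(f"align {runs[j]} to {runs[i]} (dissimilarity {d})")
```

In this toy case the two sites are bridged by a single edge, so pairwise alignments would be chained along the tree instead of forcing every run onto one reference.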
Affiliation(s)
- Shubham Gupta
- Terrence Donnelly Centre for Cellular & Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Justin C Sing
- Terrence Donnelly Centre for Cellular & Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Hannes L Röst
- Terrence Donnelly Centre for Cellular & Biomolecular Research, University of Toronto, Toronto, ON, Canada.
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada.
- Department of Computer Science, University of Toronto, Toronto, ON, Canada.
|
33
|
Reilly L, Lara E, Ramos D, Li Z, Pantazis CB, Stadler J, Santiana M, Roberts J, Faghri F, Hao Y, Nalls MA, Narayan P, Liu Y, Singleton AB, Cookson MR, Ward ME, Qi YA. A fully automated FAIMS-DIA mass spectrometry-based proteomic pipeline. Cell Rep Methods 2023; 3:100593. [PMID: 37729920 PMCID: PMC10626189 DOI: 10.1016/j.crmeth.2023.100593] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 01/13/2023] [Revised: 06/30/2023] [Accepted: 08/24/2023] [Indexed: 09/22/2023]
Abstract
Here, we present a standardized, "off-the-shelf" proteomics pipeline working in a single 96-well plate to achieve deep coverage of cellular proteomes with high throughput and scalability. This integrated pipeline combines a fully automated sample preparation platform, data-independent acquisition (DIA) coupled with a high-field asymmetric waveform ion mobility spectrometry (FAIMS) interface, and an optimized library-free DIA database search strategy. Our systematic evaluation of FAIMS-DIA shows that a single compensation voltage (CV) of -35 V not only yields the deepest proteome coverage but also correlates best with DIA without FAIMS. Our in-depth comparison of direct-DIA database search engines shows that Spectronaut outperforms the others, providing the highest number of quantifiable proteins. Next, we apply three common DIA strategies to characterize human induced pluripotent stem cell (iPSC)-derived neurons and show that single-shot mass spectrometry (MS) using single-CV (-35 V) FAIMS-DIA results in >9,000 quantifiable proteins with <10% missing values, as well as superior reproducibility and accuracy compared with other existing DIA methods.
Affiliation(s)
- Luke Reilly
- Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Erika Lara
- Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Daniel Ramos
- Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Ziyi Li
- Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA; Data Tecnica International, LLC, Glen Echo, MD, USA
- Caroline B Pantazis
- Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Julia Stadler
- Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Marianita Santiana
- Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Jessica Roberts
- Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Faraz Faghri
- Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA; Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA; Data Tecnica International, LLC, Glen Echo, MD, USA
- Ying Hao
- Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Mike A Nalls
- Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA; Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA; Data Tecnica International, LLC, Glen Echo, MD, USA
- Priyanka Narayan
- Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA; Genetics and Biochemistry Branch, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20814, USA
- Yansheng Liu
- Department of Pharmacology, Yale University School of Medicine, New Haven, CT 06510, USA
- Andrew B Singleton
- Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA; Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
- Mark R Cookson
- Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA; Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
- Michael E Ward
- Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA; National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Yue A Qi
- Center for Alzheimer's and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA.
|
34
|
Searle BC, Chien A, Koller A, Hawke D, Herren AW, Kim Kim J, Lee KA, Leib RD, Nelson AJ, Patel P, Ren JM, Stemmer PM, Zhu Y, Neely BA, Patel B. A Multipathway Phosphopeptide Standard for Rapid Phosphoproteomics Assay Development. Mol Cell Proteomics 2023; 22:100639. [PMID: 37657519 PMCID: PMC10561125 DOI: 10.1016/j.mcpro.2023.100639] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/28/2023] [Revised: 08/22/2023] [Accepted: 08/24/2023] [Indexed: 09/03/2023] Open
Abstract
Recent advances in methodology have made phosphopeptide analysis a tractable problem for many proteomics researchers. There are now a wide variety of robust and accessible enrichment strategies to generate phosphoproteomes while free or inexpensive software tools for quantitation and site localization have simplified phosphoproteome analysis workflow tremendously. As a research group under the Association for Biomolecular Resource Facilities umbrella, the Proteomics Standards Research Group has worked to develop a multipathway phosphopeptide standard based on a mixture of heavy-labeled phosphopeptides designed to enable researchers to rapidly develop assays. This mixture contains 131 mass spectrometry vetted phosphopeptides specifically chosen to cover as many known biologically interesting phosphosites as possible from seven different signaling networks: AMPK signaling, death and apoptosis signaling, ErbB signaling, insulin/insulin-like growth factor-1 signaling, mTOR signaling, PI3K/AKT signaling, and stress (p38/SAPK/JNK) signaling. Here, we describe a characterization of this mixture spiked into a HeLa tryptic digest stimulated with both epidermal growth factor and insulin-like growth factor-1 to activate the MAPK and PI3K/AKT/mTOR pathways. We further demonstrate a comparison of phosphoproteomic profiling of HeLa performed independently in five labs using this phosphopeptide mixture with data-independent acquisition. Despite different experimental and instrumentation processes, we found that labs could produce reproducible, harmonized datasets by reporting measurements as ratios to the standard, while intensity measurements showed lower consistency between labs even after normalization. Our results suggest that widely available, biologically relevant phosphopeptide standards can act as a quantitative "yardstick" across laboratories and sample preparations enabling experimental designs larger than a single laboratory can perform. 
Raw data files are publicly available in the MassIVE dataset MSV000090564.
Affiliation(s)
- Brian C Searle
- Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio, USA; Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio, USA.
- Allis Chien
- Mass Spectrometry Center, Stanford University, Stanford, California, USA
- Anthony W Herren
- UC Davis Genome Center, Proteomics Core, University of California Davis, Davis California, USA
- Jenny Kim Kim
- Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, New York, New York, USA
- Kimberly A Lee
- Cell Signaling Technology, Inc, Danvers, Massachusetts, USA
- Ryan D Leib
- Mass Spectrometry Center, Stanford University, Stanford, California, USA
- Purvi Patel
- Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, New York, New York, USA
- Jian Min Ren
- Cell Signaling Technology, Inc, Danvers, Massachusetts, USA
- Paul M Stemmer
- Department of Pharmaceutical Sciences, Wayne State University, Detroit, Michigan, USA
- Yiying Zhu
- Cell Signaling Technology, Inc, Danvers, Massachusetts, USA
- Benjamin A Neely
- National Institute of Standards and Technology, Charleston, South Carolina, USA
- Bhavin Patel
- Thermo Fisher Scientific, Rockford, Illinois, USA
|
35
|
Buljan M, Banaei-Esfahani A, Blattmann P, Meier-Abt F, Shao W, Vitek O, Tang H, Aebersold R. A computational framework for the inference of protein complex remodeling from whole-proteome measurements. Nat Methods 2023; 20:1523-1529. [PMID: 37749212 PMCID: PMC10555833 DOI: 10.1038/s41592-023-02011-w] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/30/2020] [Accepted: 08/16/2023] [Indexed: 09/27/2023]
Abstract
Protein complexes are responsible for the enactment of most cellular functions. For the protein complex to form and function, its subunits often need to be present at defined quantitative ratios. Typically, global changes in protein complex composition are assessed with experimental approaches that tend to be time consuming. Here, we have developed a computational algorithm for the detection of altered protein complexes based on the systematic assessment of subunit ratios from quantitative proteomic measurements. We applied it to measurements from breast cancer cell lines and patient biopsies and were able to identify strong remodeling of HDAC2 epigenetic complexes in more aggressive forms of cancer. The presented algorithm is available as an R package and enables the inference of changes in protein complex states by extracting functionally relevant information from bottom-up proteomic datasets.
Affiliation(s)
- Marija Buljan
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland.
- EMPA, Swiss Federal Laboratories for Materials Science and Technology, St Gallen, Switzerland.
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.
- Amir Banaei-Esfahani
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Department of Pathology and Molecular Pathology, University Hospital Zurich, Zurich, Switzerland
- Peter Blattmann
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Idorsia Pharmaceuticals, Allschwil, Switzerland
- Fabienne Meier-Abt
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Department of Medical Oncology and Hematology, University and University Hospital Zurich, Zurich, Switzerland
- Institute of Medical Genetics, University of Zurich, Zurich, Switzerland
- Wenguang Shao
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- State Key Laboratory of Microbial Metabolism, School of Life Science & Biotechnology, and Joint International Research Laboratory of Metabolic & Developmental Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
- Olga Vitek
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Hua Tang
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Ruedi Aebersold
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland.
- Faculty of Science, University of Zurich, Zurich, Switzerland.
|
36
|
Yu Y, Zhang N, Mai Y, Ren L, Chen Q, Cao Z, Chen Q, Liu Y, Hou W, Yang J, Hong H, Xu J, Tong W, Dong L, Shi L, Fang X, Zheng Y. Correcting batch effects in large-scale multiomics studies using a reference-material-based ratio method. Genome Biol 2023; 24:201. [PMID: 37674217 PMCID: PMC10483871 DOI: 10.1186/s13059-023-03047-z] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 11/08/2022] [Accepted: 05/18/2023] [Indexed: 09/08/2023] Open
Abstract
BACKGROUND: Batch effects are notoriously common technical variations in multiomics data and may result in misleading outcomes if uncorrected or over-corrected. A plethora of batch-effect correction algorithms are proposed to facilitate data integration. However, their respective advantages and limitations are not adequately assessed in terms of omics types, the performance metrics, and the application scenarios.
RESULTS: As part of the Quartet Project for quality control and data integration of multiomics profiling, we comprehensively assess the performance of seven batch effect correction algorithms based on different performance metrics of clinical relevance, i.e., the accuracy of identifying differentially expressed features, the robustness of predictive models, and the ability of accurately clustering cross-batch samples into their own donors. The ratio-based method, i.e., by scaling absolute feature values of study samples relative to those of concurrently profiled reference material(s), is found to be much more effective and broadly applicable than others, especially when batch effects are completely confounded with biological factors of study interests. We further provide practical guidelines for implementing the ratio-based approach in increasingly large-scale multiomics studies.
CONCLUSIONS: Multiomics measurements are prone to batch effects, which can be effectively corrected using ratio-based scaling of the multiomics data. Our study lays the foundation for eliminating batch effects at a ratio scale.
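The ratio-based scaling described in this abstract divides each study sample's feature values by those of a reference material profiled in the same batch. A minimal sketch of that idea follows; it assumes one reference sample per batch and a log2 ratio scale, and the sample names, feature names, intensities, and the helper `ratio_scale` are hypothetical rather than taken from the Quartet Project.

```python
import math

def ratio_scale(batch, reference_id):
    """Return log2 ratios of each study sample to the batch's reference sample.

    `batch` maps sample id -> {feature: intensity}. Features that are absent
    from the reference or non-positive in either sample are skipped.
    """
    ref = batch[reference_id]
    scaled = {}
    for sample_id, values in batch.items():
        if sample_id == reference_id:
            continue
        scaled[sample_id] = {
            feat: math.log2(v / ref[feat])
            for feat, v in values.items()
            if feat in ref and v > 0 and ref[feat] > 0
        }
    return scaled

# Two hypothetical batches: batch 2 reads four times higher across the board.
batch1 = {"REF": {"P1": 100.0, "P2": 50.0}, "S1": {"P1": 200.0, "P2": 25.0}}
batch2 = {"REF": {"P1": 400.0, "P2": 200.0}, "S2": {"P1": 800.0, "P2": 100.0}}

r1 = ratio_scale(batch1, "REF")
r2 = ratio_scale(batch2, "REF")
print(r1["S1"], r2["S2"])
```

Because the batch-wide intensity factor appears in both the study sample and the reference, it cancels in the ratio, so the two samples become directly comparable even though their raw intensities differ fourfold.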
Affiliation(s)
- Ying Yu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Naixin Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Yuanbang Mai
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Luyao Ren
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Qiaochu Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Zehui Cao
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Qingwang Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Yaqing Liu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Wanwan Hou
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Jingcheng Yang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Greater Bay Area Institute of Precision Medicine, Guangzhou, Guangdong, China
- Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
- International Human Phenome Institutes, Shanghai, China.
- Xiang Fang
- National Institute of Metrology, Beijing, China.
- Yuanting Zheng
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
|
37
|
Abstract
Missing values are a notable challenge when analyzing mass spectrometry-based proteomics data. While the field is still actively debating the best practices, the challenge increased with the emergence of mass spectrometry-based single-cell proteomics and the dramatic increase in missing values. A popular approach to deal with missing values is to perform imputation. Imputation has several drawbacks for which alternatives exist, but currently, imputation is still a practical solution widely adopted in single-cell proteomics data analysis. This perspective discusses the advantages and drawbacks of imputation. We also highlight 5 main challenges linked to missing value management in single-cell proteomics. Future developments should aim to solve these challenges, whether it is through imputation or data modeling. The perspective concludes with recommendations for reporting missing values, for reporting methods that deal with missing values, and for proper encoding of missing values.
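The recommendations on reporting and encoding missing values can be illustrated with a small sketch. It assumes missing measurements are encoded as NaN rather than zero, so that they cannot be mistaken for a true zero intensity; the peptide names, the values, and the helpers `missing_rate` and `observed_mean` are made up for the example and do not come from the perspective itself.

```python
import math

NA = float("nan")  # explicit missing-value marker, distinct from a measured 0.0

def missing_rate(values):
    """Fraction of entries that are missing (NaN)."""
    return sum(1 for v in values if math.isnan(v)) / len(values)

def observed_mean(values):
    """Mean over observed entries only; NaN if nothing was observed."""
    observed = [v for v in values if not math.isnan(v)]
    return sum(observed) / len(observed) if observed else NA

# Hypothetical peptide-by-cell intensity matrix (rows: peptides, columns: cells).
matrix = {
    "pep1": [10.0, NA, 12.0, NA],
    "pep2": [5.0, 5.0, NA, 5.0],
    "pep3": [7.0, 8.0, 9.0, 8.0],
}

per_feature = {name: missing_rate(row) for name, row in matrix.items()}
overall = missing_rate([v for row in matrix.values() for v in row])

# Encoding missing values as zero would silently pull the mean down.
zero_filled_mean = sum(0.0 if math.isnan(v) else v for v in matrix["pep1"]) / 4

print(per_feature, overall)
print(observed_mean(matrix["pep1"]), zero_filled_mean)
```

Reporting the per-feature and overall missing rates alongside any imputation method makes it easier for readers to judge how much of a result rests on imputed rather than observed values.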
Affiliation(s)
- Christophe Vanderaa
- Computational Biology and Bioinformatics Unit (CBIO), de Duve Institute, UCLouvain, 1200 Brussels, Belgium
| | - Laurent Gatto
- Computational Biology and Bioinformatics Unit (CBIO), de Duve Institute, UCLouvain, 1200 Brussels, Belgium
| |
|
38
|
Birhanu AG. Mass spectrometry-based proteomics as an emerging tool in clinical laboratories. Clin Proteomics 2023; 20:32. [PMID: 37633929 PMCID: PMC10464495 DOI: 10.1186/s12014-023-09424-x] [Citation(s) in RCA: 46] [Impact Index Per Article: 23.0] [Received: 02/08/2023] [Accepted: 08/03/2023] [Indexed: 08/28/2023] Open
Abstract
Mass spectrometry (MS)-based proteomics has been increasingly implemented in various disciplines of laboratory medicine to identify and quantify biomolecules in a variety of biological specimens. MS-based proteomics is continuously expanding and widely applied in biomarker discovery for early detection, prognosis and markers for treatment response prediction and monitoring. Furthermore, making these advanced tests more accessible and affordable will have the greatest healthcare benefit. This review article highlights the new paradigms MS-based clinical proteomics has created in microbiology laboratories, cancer research and diagnosis of metabolic disorders. The technique is preferred over conventional methods in disease detection and therapy monitoring for its combined advantages in multiplexing capacity, remarkable analytical specificity and sensitivity and low turnaround time. Despite the achievements in the development and adoption of a number of MS-based clinical proteomics practices, more are expected to undergo transition from bench to bedside in the near future. The review provides insights from early trials and recent progress (mainly covering literature from the NCBI database) in the application of proteomics in clinical laboratories.
|
39
|
Midha MK, Kapil C, Maes M, Baxter DH, Morrone SR, Prokop TJ, Moritz RL. Vacuum Insulated Probe Heated Electrospray Ionization Source Enhances Microflow Rate Chromatography Signals in the Bruker timsTOF Mass Spectrometer. J Proteome Res 2023; 22:2525-2537. [PMID: 37294184 PMCID: PMC11060334 DOI: 10.1021/acs.jproteome.3c00305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
By far the largest contribution to ion detectability in liquid chromatography-driven mass spectrometry-based proteomics is the efficient generation of peptide molecular ions by the electrospray source. To maximize the transfer of peptides from the liquid to gaseous phase and allow molecular ions to enter the mass spectrometer at microspray flow rates, an efficient electrospray process is required. Here we describe the superior performance of a newly designed vacuum insulated probe heated electrospray ionization (VIP-HESI) source coupled to a Bruker timsTOF PRO mass spectrometer operated in microspray mode. VIP-HESI significantly improves chromatography signals in comparison to electrospray ionization (ESI) and nanospray ionization using the CaptiveSpray (CS) source and provides increased protein detection with higher quantitative precision, enhancing reproducibility of sample injection amounts. Protein quantitation of human K562 lymphoblast samples displayed excellent chromatographic retention time reproducibility (<10% coefficient of variation (CV)) with no signal degradation over extended periods of time, and a mouse plasma proteome analysis identified 12% more plasma protein groups allowing large-scale analysis to proceed with confidence (1,267 proteins at 0.4% CV). We show that the Slice-PASEF VIP-HESI mode is sensitive in identifying low amounts of peptide without losing quantitative precision. We demonstrate that VIP-HESI coupled with microflow rate chromatography achieves a higher depth of coverage and run-to-run reproducibility for a broad range of proteomic applications. Data and spectral libraries are available via ProteomeXchange (PXD040497).
Affiliation(s)
- Mukul K Midha
- Institute for Systems Biology, 401 Terry Avenue N, Seattle, Washington 98109, United States
| | - Charu Kapil
- Institute for Systems Biology, 401 Terry Avenue N, Seattle, Washington 98109, United States
| | - Michal Maes
- Institute for Systems Biology, 401 Terry Avenue N, Seattle, Washington 98109, United States
| | - David H Baxter
- Institute for Systems Biology, 401 Terry Avenue N, Seattle, Washington 98109, United States
| | - Seamus R Morrone
- Institute for Systems Biology, 401 Terry Avenue N, Seattle, Washington 98109, United States
| | - Timothy J Prokop
- Institute for Systems Biology, 401 Terry Avenue N, Seattle, Washington 98109, United States
| | - Robert L Moritz
- Institute for Systems Biology, 401 Terry Avenue N, Seattle, Washington 98109, United States
| |
|
40
|
Sun W, Lin Y, Huang Y, Chan J, Terrillon S, Rosenbaum AI, Contrepois K. Robust and High-Throughput Analytical Flow Proteomics Analysis of Cynomolgus Monkey and Human Matrices with Zeno SWATH Data Independent Acquisition. Mol Cell Proteomics 2023:100562. [PMID: 37142056 DOI: 10.1016/j.mcpro.2023.100562] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/22/2022] [Revised: 04/17/2023] [Accepted: 04/26/2023] [Indexed: 05/06/2023] Open
Abstract
Modern mass spectrometers routinely allow deep proteome coverage in a single experiment. These methods are typically operated at nano and micro flow regimes, but they often lack throughput and chromatographic robustness, which is critical for large-scale studies. In this context, we have developed, optimized and benchmarked LC-MS methods combining the robustness and throughput of analytical flow chromatography with the added sensitivity provided by the Zeno trap across a wide range of cynomolgus monkey and human matrices of interest for toxicological studies and clinical biomarker discovery. SWATH data independent acquisition (DIA) experiments with Zeno trap activated (Zeno SWATH DIA) provided a clear advantage over conventional SWATH DIA in all sample types tested with improved sensitivity, quantitative robustness and signal linearity as well as increased protein coverage by up to 9-fold. Using a 10-min gradient chromatography, up to 3,300 proteins were identified in tissues at 2 μg peptide load. Importantly, the performance gains with Zeno SWATH translated into better biological pathway representation and improved the ability to identify dysregulated proteins and pathways associated with two metabolic diseases in human plasma. Finally, we demonstrate that this method is highly stable over time with the acquisition of reliable data over the injection of 1,000+ samples (14.2 days of uninterrupted acquisition) without the need for human intervention or normalization. Altogether, Zeno SWATH DIA methodology allows fast, sensitive and robust proteomic workflows using analytical flow and is amenable to large-scale studies. This work provides detailed method performance assessment on a variety of relevant biological matrices and serves as a valuable resource for the proteomics community.
Affiliation(s)
- Weiwen Sun
- Integrated Bioanalysis, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, South San Francisco, CA 94080, USA
| | - Yuan Lin
- Integrated Bioanalysis, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, South San Francisco, CA 94080, USA
| | - Yue Huang
- Integrated Bioanalysis, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, South San Francisco, CA 94080, USA
| | - Josolyn Chan
- Integrated Bioanalysis, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, South San Francisco, CA 94080, USA
| | - Sonia Terrillon
- Integrated Bioanalysis, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, South San Francisco, CA 94080, USA
| | - Anton I Rosenbaum
- Integrated Bioanalysis, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, South San Francisco, CA 94080, USA.
| | - Kévin Contrepois
- Integrated Bioanalysis, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, South San Francisco, CA 94080, USA.
| |
|
41
|
Messner CB, Demichev V, Wang Z, Hartl J, Kustatscher G, Mülleder M, Ralser M. Mass spectrometry-based high-throughput proteomics and its role in biomedical studies and systems biology. Proteomics 2023; 23:e2200013. [PMID: 36349817 DOI: 10.1002/pmic.202200013] [Citation(s) in RCA: 42] [Impact Index Per Article: 21.0] [Received: 07/07/2022] [Revised: 10/13/2022] [Accepted: 10/13/2022] [Indexed: 11/11/2022]
Abstract
There are multiple reasons why the next generation of biological and medical studies require increasing numbers of samples. Biological systems are dynamic, and the effect of a perturbation depends on the genetic background and environment. As a consequence, many conditions need to be considered to reach generalizable conclusions. Moreover, human population and clinical studies only reach sufficient statistical power if conducted at scale and with precise measurement methods. Finally, many proteins remain without sufficient functional annotations, because they have not been systematically studied under a broad range of conditions. In this review, we discuss the latest technical developments in mass spectrometry (MS)-based proteomics that facilitate large-scale studies by fast and efficient chromatography, fast scanning mass spectrometers, data-independent acquisition (DIA), and new software. We further highlight recent studies which demonstrate how high-throughput (HT) proteomics can be applied to capture biological diversity, to annotate gene functions or to generate predictive and prognostic models for human diseases.
Affiliation(s)
- Christoph B Messner
- Precision Proteomics Center, Swiss Institute of Allergy and Asthma Research (SIAF), University of Zurich, Davos, Switzerland
| | - Vadim Demichev
- Institute of Biochemistry, Charité - Universitätsmedizin Berlin, Berlin, Germany
| | - Ziyue Wang
- Institute of Biochemistry, Charité - Universitätsmedizin Berlin, Berlin, Germany
| | - Johannes Hartl
- Institute of Biochemistry, Charité - Universitätsmedizin Berlin, Berlin, Germany
| | - Georg Kustatscher
- Wellcome Centre for Cell Biology, University of Edinburgh, Max Born Crescent, Edinburgh, Scotland, UK
| | - Michael Mülleder
- Core Facility High Throughput Mass Spectrometry, Charité - Universitätsmedizin Berlin, Berlin, Germany
| | - Markus Ralser
- Institute of Biochemistry, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
| |
|
42
|
Midha MK, Kapil C, Maes M, Baxter DH, Morrone SR, Prokop TJ, Moritz RL. Vacuum Insulated Probe Heated ElectroSpray Ionization source (VIP-HESI) enhances micro flow rate chromatography signals in the Bruker timsTOF mass spectrometer. bioRxiv 2023:2023.02.15.528699. [PMID: 36824828 PMCID: PMC9949110 DOI: 10.1101/2023.02.15.528699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2023]
Abstract
By far the largest contribution to ion detectability in liquid chromatography-driven mass spectrometry-based proteomics is the efficient generation of peptide ions by the electrospray source. To maximize the transfer of peptides from the liquid to the gaseous phase and allow molecular ions to enter the mass spectrometer at micro-spray flow rates, an efficient electrospray process is required. Here we describe the superior performance of a new Vacuum-Insulated-Probe-Heated-ElectroSpray-Ionization source (VIP-HESI) coupled with micro-spray flow rate chromatography and a Bruker timsTOF PRO mass spectrometer. VIP-HESI significantly improves chromatography signals in comparison to nano-spray ionization using the CaptiveSpray source and provides increased protein detection with higher quantitative precision, enhancing reproducibility of sample injection amounts. Protein quantitation of human K562 lymphoblast samples displayed excellent chromatographic retention time reproducibility (<10% coefficient-of-variation (CV)) with no signal degradation over extended periods of time, and a mouse plasma proteome analysis identified 12% more plasma protein groups allowing large-scale analysis to proceed with confidence (1,267 proteins at 0.4% CV). We show that Slice-PASEF mode with the VIP-HESI setup is sensitive in identifying low amounts of peptide without losing quantitative precision. We demonstrate that VIP-HESI coupled with micro-flow-rate chromatography achieves higher depth of coverage and run-to-run reproducibility for a broad range of proteomic applications.
|
43
|
Manuel JM, Guilloy N, Khatir I, Roucou X, Laurent B. Re-evaluating the impact of alternative RNA splicing on proteomic diversity. Front Genet 2023; 14:1089053. [PMID: 36845399 PMCID: PMC9947481 DOI: 10.3389/fgene.2023.1089053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Accepted: 01/23/2023] [Indexed: 02/11/2023] Open
Abstract
Alternative splicing (AS) constitutes a mechanism by which protein-coding genes and long non-coding RNA (lncRNA) genes produce more than a single mature transcript. From plants to humans, AS is a powerful process that increases transcriptome complexity. Importantly, splice variants produced from AS can potentially encode for distinct protein isoforms which can lose or gain specific domains and, hence, differ in their functional properties. Advances in proteomics have shown that the proteome is indeed diverse due to the presence of numerous protein isoforms. For the past decades, with the help of advanced high-throughput technologies, numerous alternatively spliced transcripts have been identified. However, the low detection rate of protein isoforms in proteomic studies raised debatable questions on whether AS contributes to proteomic diversity and on how many AS events are really functional. We propose here to assess and discuss the impact of AS on proteomic complexity in the light of the technological progress, updated genome annotation, and current scientific knowledge.
Affiliation(s)
- Jeru Manoj Manuel
- Research Center on Aging, Centre Intégré Universitaire de Santé et Services Sociaux de l’Estrie-Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, Canada; Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Noé Guilloy
- Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Inès Khatir
- Research Center on Aging, Centre Intégré Universitaire de Santé et Services Sociaux de l’Estrie-Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, Canada; Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Xavier Roucou
- Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada; Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC, Canada; Quebec Network for Research on Protein Function Structure and Engineering, PROTEO, Québec, QC, Canada
| | - Benoit Laurent
- Research Center on Aging, Centre Intégré Universitaire de Santé et Services Sociaux de l’Estrie-Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC, Canada; Department of Biochemistry and Functional Genomics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
- Correspondence: Benoit Laurent
| |
|
44
|
Sundararaman N, Bhat A, Venkatraman V, Binek A, Dwight Z, Ariyasinghe NR, Escopete S, Joung SY, Cheng S, Parker SJ, Fert-Bober J, Van Eyk JE. BIRCH: An Automated Workflow for Evaluation, Correction, and Visualization of Batch Effect in Bottom-Up Mass Spectrometry-Based Proteomics Data. J Proteome Res 2023; 22:471-481. [PMID: 36695565 DOI: 10.1021/acs.jproteome.2c00671] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 01/26/2023]
Abstract
Recent surges in large-scale mass spectrometry (MS)-based proteomics studies demand a concurrent rise in methods to facilitate reliable and reproducible data analysis. Quantification of proteins in MS analysis can be affected by variations in technical factors such as sample preparation and data acquisition conditions, leading to batch effects that add noise to the data set. This may in turn affect the effectiveness of any biological conclusions derived from the data. Here we present Batch-effect Identification, Representation, and Correction of Heterogeneous data (BIRCH), a workflow for analysis and correction of batch effect through an automated, versatile, and easy-to-use web-based tool with the goal of eliminating technical variation. BIRCH also supports diagnosis of the data to check for the presence of batch effects, feasibility of batch correction, and imputation to deal with missing values in the data set. To illustrate the relevance of the tool, we explore two case studies, including an iPSC-derived cell study and a Covid vaccine study to show different context-specific use cases. Ultimately this tool can be used as an extremely powerful approach for eliminating technical bias while retaining biological bias, toward understanding disease mechanisms and potential therapeutics.
Affiliation(s)
- Niveda Sundararaman
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Archana Bhat
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Vidya Venkatraman
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Aleksandra Binek
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Zachary Dwight
- Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Nethika R Ariyasinghe
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Sean Escopete
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Sandy Y Joung
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Susan Cheng
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Sarah J Parker
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Justyna Fert-Bober
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Jennifer E Van Eyk
- Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| |
|
45
|
He X, Liu X, Zuo F, Shi H, Jing J. Artificial intelligence-based multi-omics analysis fuels cancer precision medicine. Semin Cancer Biol 2023; 88:187-200. [PMID: 36596352 DOI: 10.1016/j.semcancer.2022.12.009] [Citation(s) in RCA: 109] [Impact Index Per Article: 54.5] [Received: 11/17/2022] [Revised: 12/16/2022] [Accepted: 12/29/2022] [Indexed: 01/02/2023]
Abstract
With biotechnological advancements, innovative omics technologies are constantly emerging that have enabled researchers to access multi-layer information from the genome, epigenome, transcriptome, proteome, metabolome, and more. A wealth of omics technologies, including bulk and single-cell omics approaches, have empowered researchers to characterize different molecular layers at unprecedented scale and resolution, providing a holistic view of tumor behavior. Multi-omics analysis allows systematic interrogation of various molecular information at each biological layer while posing tricky challenges regarding how to extract valuable insights from the exponentially increasing amount of multi-omics data. Therefore, efficient algorithms are needed to reduce the dimensionality of the data while simultaneously dissecting the mysteries behind the complex biological processes of cancer. Artificial intelligence has demonstrated the ability to analyze complementary multi-modal data streams within the oncology realm. The coincident development of multi-omics technologies and artificial intelligence algorithms has fuelled the development of cancer precision medicine. Here, we present state-of-the-art omics technologies and outline a roadmap of multi-omics integration analysis using an artificial intelligence strategy. The advances made using artificial intelligence-based multi-omics approaches are described, especially concerning early cancer screening, diagnosis, response assessment, and prognosis prediction. Finally, we discuss the challenges faced in multi-omics analysis, along with tentative future trends in this field. With the increasing application of artificial intelligence in multi-omics analysis, we anticipate a shifting paradigm in precision medicine becoming driven by artificial intelligence-based multi-omics technologies.
Affiliation(s)
- Xiujing He
- Laboratory of Integrative Medicine, Clinical Research Center for Breast, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, Sichuan, PR China
| | - Xiaowei Liu
- Laboratory of Integrative Medicine, Clinical Research Center for Breast, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, Sichuan, PR China
| | - Fengli Zuo
- Laboratory of Integrative Medicine, Clinical Research Center for Breast, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, Sichuan, PR China
| | - Hubing Shi
- Laboratory of Integrative Medicine, Clinical Research Center for Breast, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, Sichuan, PR China
| | - Jing Jing
- Laboratory of Integrative Medicine, Clinical Research Center for Breast, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, Sichuan, PR China.
| |
|
46
|
Connolly EA, Grimison PS, Horvath LG, Robinson PJ, Reddel RR. Quantitative proteomic studies addressing unmet clinical needs in sarcoma. Front Oncol 2023; 13:1126736. [PMID: 37197427 PMCID: PMC10183589 DOI: 10.3389/fonc.2023.1126736] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/18/2022] [Accepted: 03/31/2023] [Indexed: 05/19/2023] Open
Abstract
Sarcoma is a rare and complex disease comprising over 80 malignant subtypes that is frequently characterized by poor prognosis. Challenges in clinical management include uncertainties in diagnosis and disease classification, limited prognostic and predictive biomarkers, incompletely understood disease heterogeneity among and within subtypes, lack of effective treatment options, and limited progress in identifying new drug targets and novel therapeutics. Proteomics refers to the study of the entire complement of proteins expressed in specific cells or tissues. Advances in proteomics have included the development of quantitative mass spectrometry (MS)-based technologies which enable analysis of large numbers of proteins with relatively high throughput, enabling proteomics to be studied on a scale that has not previously been possible. Cellular function is determined by the levels of various proteins and their interactions, so proteomics offers the possibility of new insights into cancer biology. Sarcoma proteomics therefore has the potential to address some of the key current challenges described above, but it is still in its infancy. This review covers key quantitative proteomic sarcoma studies with findings that pertain to clinical utility. Proteomic methodologies that have been applied to human sarcoma research are briefly described, including recent advances in MS-based proteomic technology. We highlight studies that illustrate how proteomics may aid diagnosis and improve disease classification by distinguishing sarcoma histologies and identify distinct profiles within histological subtypes which may aid understanding of disease heterogeneity. We also review studies where proteomics has been applied to identify prognostic, predictive and therapeutic biomarkers. 
These studies traverse a range of histological subtypes including chordoma, Ewing sarcoma, gastrointestinal stromal tumors, leiomyosarcoma, liposarcoma, malignant peripheral nerve sheath tumors, myxofibrosarcoma, rhabdomyosarcoma, synovial sarcoma, osteosarcoma, and undifferentiated pleomorphic sarcoma. Critical questions and unmet needs in sarcoma which can potentially be addressed with proteomics are outlined.
Affiliation(s)
- Elizabeth A. Connolly
- ProCan, Children’s Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, NSW, Australia
- Department of Medical Oncology, Chris O’Brien Lifehouse, Sydney, NSW, Australia
- Correspondence: Elizabeth A. Connolly
| | - Peter S. Grimison
- Department of Medical Oncology, Chris O’Brien Lifehouse, Sydney, NSW, Australia
- Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia
| | - Lisa G. Horvath
- Department of Medical Oncology, Chris O’Brien Lifehouse, Sydney, NSW, Australia
- Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia
| | - Phillip J. Robinson
- ProCan, Children’s Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, NSW, Australia
| | - Roger R. Reddel
- ProCan, Children’s Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, NSW, Australia
| |
|
47
|
Vincent D, Bui A, Ezernieks V, Shahinfar S, Luke T, Ram D, Rigas N, Panozzo J, Rochfort S, Daetwyler H, Hayden M. A community resource to mass explore the wheat grain proteome and its application to the late-maturity alpha-amylase (LMA) problem. Gigascience 2022; 12:giad084. [PMID: 37919977 PMCID: PMC10627334 DOI: 10.1093/gigascience/giad084] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/24/2023] [Revised: 08/02/2023] [Accepted: 09/19/2023] [Indexed: 11/04/2023] Open
Abstract
BACKGROUND Late-maturity alpha-amylase (LMA) is a wheat genetic defect causing the synthesis of high isoelectric point alpha-amylase following a temperature shock during mid-grain development or prolonged cold throughout grain development, both leading to starch degradation. While the physiology is well understood, the biochemical mechanisms involved in grain LMA response remain unclear. We have applied high-throughput proteomics to 4,061 wheat flours displaying a range of LMA activities. Using an array of statistical analyses to select LMA-responsive biomarkers, we have mined them using a suite of tools applicable to wheat proteins. RESULTS We observed that LMA-affected grains activated their primary metabolisms such as glycolysis and gluconeogenesis; TCA cycle, along with DNA- and RNA- binding mechanisms; and protein translation. This logically transitioned to protein folding activities driven by chaperones and protein disulfide isomerase, as well as protein assembly via dimerisation and complexing. The secondary metabolism was also mobilized with the upregulation of phytohormones and chemical and defence responses. LMA further invoked cellular structures, including ribosomes, microtubules, and chromatin. Finally, and unsurprisingly, LMA expression greatly impacted grain storage proteins, as well as starch and other carbohydrates, with the upregulation of alpha-gliadins and starch metabolism, whereas LMW glutenin, stachyose, sucrose, UDP-galactose, and UDP-glucose were downregulated. CONCLUSIONS To our knowledge, this is not only the first proteomics study tackling the wheat LMA issue but also the largest plant-based proteomics study published to date. Logistics, technicalities, requirements, and bottlenecks of such an ambitious large-scale high-throughput proteomics experiment along with the challenges associated with big data analyses are discussed.
Affiliation(s)
- Delphine Vincent
- Agriculture Victoria Research, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia
| | - AnhDuyen Bui
- Agriculture Victoria Research, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia
| | - Vilnis Ezernieks
- Agriculture Victoria Research, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia
| | - Saleh Shahinfar
- Agriculture Victoria Research, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia
| | - Timothy Luke
- Agriculture Victoria Research, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia
| | - Doris Ram
- Agriculture Victoria Research, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia
| | - Nicholas Rigas
- Agriculture Victoria Research, Grains Innovation Park, Horsham, VIC 3400, Australia
| | - Joe Panozzo
- Agriculture Victoria Research, Grains Innovation Park, Horsham, VIC 3400, Australia
- Centre for Agricultural Innovation, University of Melbourne, Parkville, VIC 3010, Australia
| | - Simone Rochfort
- Agriculture Victoria Research, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia
- School of Applied Systems Biology, La Trobe University, Bundoora, VIC 3083, Australia
| | - Hans Daetwyler
- Agriculture Victoria Research, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia
- School of Applied Systems Biology, La Trobe University, Bundoora, VIC 3083, Australia
| | - Matthew Hayden
- Agriculture Victoria Research, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia
- School of Applied Systems Biology, La Trobe University, Bundoora, VIC 3083, Australia
| |
Collapse
|
48
|
Tear Proteome Revealed Association of S100A Family Proteins and Mesothelin with Thrombosis in Elderly Patients with Retinal Vein Occlusion. Int J Mol Sci 2022; 23:ijms232314653. [PMID: 36498980 PMCID: PMC9736253 DOI: 10.3390/ijms232314653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2022] [Revised: 11/03/2022] [Accepted: 11/21/2022] [Indexed: 11/25/2022] Open
Abstract
Tear samples collected from patients with central retinal vein occlusion (CRVO; n = 28) and healthy volunteers (n = 29) were analyzed using a proteomic label-free absolute quantitative approach. A large proportion (458 proteins with a frequency > 0.6) of tear proteomes was found to be shared between the study groups. Comparative proteomic analysis revealed 29 proteins that differed significantly (p < 0.05) between CRVO patients and the control group. Among them, S100A6 (log2(FC) = 1.11, p < 0.001), S100A8 (log2(FC) = 2.45, p < 0.001), S100A9 (log2(FC) = 2.08, p < 0.001), and mesothelin (log2(FC) = 0.82, p < 0.001) were the most abundantly represented upregulated proteins, and β2-microglobulin was the most downregulated protein (log2(FC) = −2.13, p < 0.001). The selected up- and downregulated proteins were gathered to customize a map of CRVO-related critical protein interactions with quantitative properties. The customized map (FDR < 0.01) revealed inflammation, impairment of retinal hemostasis, and immune response as the main set of processes associated with CRVO ischemic condition. The semantic analysis displayed the prevalence of core biological processes covering dysregulation of mitochondrial organization and utilization of improperly or topologically incorrectly folded proteins as a consequence of oxidative stress, and escalation of the ischemic condition caused by the local retinal hemostasis dysregulation. The most significantly different proteins (S100A6, S100A8, S100A9, MSLN, and β2-microglobulin) were applied for the ROC analysis, and their AUC varied from 0.772 to 0.952, suggesting probable association with the CRVO.
|
49
|
He B, Huang Z, Huang C, Nice EC. Clinical applications of plasma proteomics and peptidomics: Towards precision medicine. Proteomics Clin Appl 2022; 16:e2100097. [PMID: 35490333 DOI: 10.1002/prca.202100097] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 02/03/2022] [Revised: 04/16/2022] [Accepted: 04/28/2022] [Indexed: 02/05/2023]
Abstract
In the context of precision medicine, disease treatment requires individualized strategies based on the underlying molecular characteristics to overcome therapeutic challenges posed by heterogeneity. For this purpose, it is essential to develop new biomarkers to diagnose, stratify, or possibly prevent diseases. Plasma is a readily available source of biomarkers that greatly reflects the physiological and pathological conditions of the body. An increasing number of studies are focusing on proteins and peptides, including many involving the Human Proteome Project (HPP) of the Human Proteome Organization (HUPO), and proteomics and peptidomics techniques are emerging as critical tools for developing novel precision medicine preventative measures. Excitingly, the emerging plasma proteomics and peptidomics toolbox exhibits a huge potential for studying the pathogenesis of diseases (e.g., COVID-19 and cancer), identifying valuable biomarkers, and improving clinical management. However, the enormous complexity and wide dynamic range of plasma proteins make plasma proteome profiling challenging. Herein, we summarize the recent advances in plasma proteomics and peptidomics with a focus on their emerging roles in COVID-19 and cancer research, aiming to emphasize the significance of plasma proteomics and peptidomics in clinical applications and precision medicine.
Affiliation(s)
- Bo He
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, and West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University and Collaborative Innovation Center for Biotherapy, Chengdu, P. R. China
- Zhao Huang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, and West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University and Collaborative Innovation Center for Biotherapy, Chengdu, P. R. China
- Canhua Huang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, and West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University and Collaborative Innovation Center for Biotherapy, Chengdu, P. R. China
  - Department of Pharmacology, and Provincial Key Laboratory of Pathophysiology in Ningbo University School of Medicine, Ningbo, Zhejiang, China
- Edouard C Nice
  - Department of Biochemistry and Molecular Biology, Monash University, Clayton, Victoria, Australia
|
50
|
Sabirov D, Ogurcov S, Baichurina I, Blatt N, Rizvanov A, Mukhamedshina Y. Molecular diagnostics in neurotrauma: Are there reliable biomarkers and effective methods for their detection? Front Mol Biosci 2022; 9:1017916. [PMID: 36250009 PMCID: PMC9557129 DOI: 10.3389/fmolb.2022.1017916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Accepted: 09/12/2022] [Indexed: 12/05/2022] Open
Abstract
A large number of studies are currently being carried out in the field of neurotrauma. Researchers not only establish the molecular mechanisms of the course of the disorders, but are also involved in the search for effective biomarkers for early prediction of the outcome and therapeutic intervention. Particular attention is paid to traumatic brain injury and spinal cord injury, due to the complex cascade of reactions in primary and secondary injury that affect pathophysiological processes and regenerative potential of the central nervous system. Despite the wide range of methods available to study biomarkers that correlate with the severity and degree of recovery in traumatic brain injury and spinal cord injury, development of reliable test systems for clinical use continues. In this review, we evaluate the results of recent studies looking for various molecules acting as biomarkers in the abovementioned neurotrauma. We also summarize the current knowledge of new methods for studying biological molecules, analyzing their sensitivity and limitations, as well as reproducibility of results. Finally, we highlight the importance of developing reliable and reproducible protocols to identify diagnostic and prognostic biomolecules.
Affiliation(s)
- Davran Sabirov
  - Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, Russia
- Sergei Ogurcov
  - Neurosurgical Department No. 2, Republic Clinical Hospital, Kazan, Russia
- Irina Baichurina
  - Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, Russia
  - *Correspondence: Irina Baichurina,
- Nataliya Blatt
  - Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, Russia
- Albert Rizvanov
  - Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, Russia
- Yana Mukhamedshina
  - Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, Russia
  - Department of Histology, Cytology, and Embryology, Kazan State Medical University, Kazan, Russia
|