1
Raj A, Aggarwal S, Singh P, Yadav AK, Dash D. PgxSAVy: A tool for comprehensive evaluation of variant peptide quality in proteogenomics - catching the (un)usual suspects. Comput Struct Biotechnol J 2024; 23:711-722. [PMID: 38292474 PMCID: PMC10825656 DOI: 10.1016/j.csbj.2023.12.033] [Received: 10/04/2023] [Revised: 12/19/2023] [Accepted: 12/23/2023] [Indexed: 02/01/2024] Open
Abstract
Variant peptides resulting from single nucleotide polymorphisms (SNPs) can lead to aberrant protein functions and have translational potential for disease diagnosis and personalized therapy. Variant peptides detected by proteogenomics are fraught with a high number of false positives, but there is no uniform and comprehensive approach to assess variant quality across analysis pipelines. Despite class-specific FDR along with ad-hoc filters, the problem is far from solved. These protocols are typically manual and tedious, and thus not uniform across labs. We demonstrate that variant peptide rescoring, integrated with intensity, variant event information and search result features, allows better discrimination of correct variant peptides. Implemented in PgxSAVy, a tool for quality control of variant peptides, this method can tackle the high rate of false positives. PgxSAVy provides a rigorous framework for quality control and annotation of variant peptides on the basis of (i) variant quality, (ii) isobaric masses, and (iii) disease annotation. PgxSAVy identified true variants with 98.43% accuracy on simulated data. Large-scale proteogenomic reanalysis of ∼2.8 million spectra (PXD004010 and PXD001468) resulted in 12,705 variant peptide spectrum matches (PSMs), of which PgxSAVy evaluated 3028 (23.8%), 1409 (11.1%) and 8268 (65.1%) as confident, semi-confident and doubtful, respectively. PgxSAVy also annotates the variants based on their pathogenicity and provides support for assisted manual validation. The analysis of proteins carrying variants can provide fine granularity in discovering important pathways. PgxSAVy will advance personalized medicine by providing a comprehensive framework for quality control and prioritization of proteogenomics variants. PgxSAVy is freely available at https://pgxsavy.igib.res.in/ as a webserver and https://github.com/anuragraj/PgxSAVy as a stand-alone tool.
Affiliation(s)
- Anurag Raj
  - G. N. Ramachandran Knowledge Centre for Genomics Informatics, CSIR – Institute of Genomics and Integrative Biology, New Delhi, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Suruchi Aggarwal
  - Computational and Mathematical Biology Centre (CMBC), 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
  - Centre for Drug Discovery (CDD), 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
  - Centre for Microbial Research (CMR), Translational Health Science and Technology Institute, NCR Biotech Science Cluster, 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
- Prateek Singh
  - G. N. Ramachandran Knowledge Centre for Genomics Informatics, CSIR – Institute of Genomics and Integrative Biology, New Delhi, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Amit Kumar Yadav
  - Computational and Mathematical Biology Centre (CMBC), 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
  - Centre for Drug Discovery (CDD), 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
  - Centre for Microbial Research (CMR), Translational Health Science and Technology Institute, NCR Biotech Science Cluster, 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
- Debasis Dash
  - G. N. Ramachandran Knowledge Centre for Genomics Informatics, CSIR – Institute of Genomics and Integrative Biology, New Delhi, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
2
Deutsch EW, Kok LW, Mudge JM, Ruiz-Orera J, Fierro-Monti I, Sun Z, Abelin JG, Alba MM, Aspden JL, Bazzini AA, Bruford EA, Brunet MA, Calviello L, Carr SA, Carvunis AR, Chothani S, Clauwaert J, Dean K, Faridi P, Frankish A, Hubner N, Ingolia NT, Magrane M, Martin MJ, Martinez TF, Menschaert G, Ohler U, Orchard S, Rackham O, Roucou X, Slavoff SA, Valen E, Wacholder A, Weissman JS, Wu W, Xie Z, Choudhary J, Bassani-Sternberg M, Vizcaíno JA, Ternette N, Moritz RL, Prensner JR, van Heesch S. High-quality peptide evidence for annotating non-canonical open reading frames as human proteins. bioRxiv 2024:2024.09.09.612016. [PMID: 39314370 PMCID: PMC11419116 DOI: 10.1101/2024.09.09.612016] [Indexed: 09/25/2024]
Abstract
A major scientific drive is to characterize the protein-coding genome as it provides the primary basis for the study of human health. But the fundamental question remains: what has been missed in prior genomic analyses? Over the past decade, the translation of non-canonical open reading frames (ncORFs) has been observed across human cell types and disease states, with major implications for proteomics, genomics, and clinical science. However, the impact of ncORFs has been limited by the absence of a large-scale understanding of their contribution to the human proteome. Here, we report the collaborative efforts of stakeholders in proteomics, immunopeptidomics, Ribo-seq ORF discovery, and gene annotation, to produce a consensus landscape of protein-level evidence for ncORFs. We show that at least 25% of a set of 7,264 ncORFs give rise to translated gene products, yielding over 3,000 peptides in a pan-proteome analysis encompassing 3.8 billion mass spectra from 95,520 experiments. With these data, we developed an annotation framework for ncORFs and created public tools for researchers through GENCODE and PeptideAtlas. This work will provide a platform to advance ncORF-derived proteins in biomedical discovery and, beyond humans, diverse animals and plants where ncORFs are similarly observed.
3
van Wijk KJ, Leppert T, Sun Z, Guzchenko I, Debley E, Sauermann G, Routray P, Mendoza L, Sun Q, Deutsch EW. The Zea mays PeptideAtlas: A New Maize Community Resource. J Proteome Res 2024; 23:3984-4004. [PMID: 39101213 DOI: 10.1021/acs.jproteome.4c00320] [Indexed: 08/06/2024]
Abstract
This study presents the Maize PeptideAtlas resource (www.peptideatlas.org/builds/maize) to help solve questions about the maize proteome. Publicly available raw tandem mass spectrometry (MS/MS) data for maize collected from ProteomeXchange were reanalyzed through a uniform processing and metadata annotation pipeline. These data are from a wide range of genetic backgrounds and many sample types and experimental conditions. The protein search space included different maize genome annotations for the B73 inbred line from MaizeGDB, UniProtKB, NCBI RefSeq, and for the W22 inbred line. 445 million MS/MS spectra were searched, of which 120 million were matched to 0.37 million distinct peptides. Peptides were matched to 66.2% of proteins in the most recent B73 nuclear genome annotation. Furthermore, most conserved plastid- and mitochondrial-encoded proteins (NCBI RefSeq annotations) were identified. Peptides and proteins identified in the other B73 genome annotations will improve maize genome annotation. We also illustrate the high-confidence detection of unique W22 proteins. N-terminal acetylation, phosphorylation, ubiquitination, and three lysine acylations (K-acetyl, K-malonyl, and K-hydroxyisobutyryl) were identified and can be inspected through a PTM viewer in PeptideAtlas. All matched MS/MS-derived peptide data are linked to spectral, technical, and biological metadata. This new PeptideAtlas is integrated in MaizeGDB with a peptide track in JBrowse.
Affiliation(s)
- Klaas J van Wijk
  - Section of Plant Biology, School of Integrative Plant Sciences (SIPS), Cornell University, Ithaca, New York 14853, United States
- Tami Leppert
  - Institute for Systems Biology (ISB), Seattle, Washington 98109, United States
- Zhi Sun
  - Institute for Systems Biology (ISB), Seattle, Washington 98109, United States
- Isabell Guzchenko
  - Section of Plant Biology, School of Integrative Plant Sciences (SIPS), Cornell University, Ithaca, New York 14853, United States
- Erica Debley
  - Section of Plant Biology, School of Integrative Plant Sciences (SIPS), Cornell University, Ithaca, New York 14853, United States
- Georgia Sauermann
  - Section of Plant Biology, School of Integrative Plant Sciences (SIPS), Cornell University, Ithaca, New York 14853, United States
- Pratyush Routray
  - Section of Plant Biology, School of Integrative Plant Sciences (SIPS), Cornell University, Ithaca, New York 14853, United States
- Luis Mendoza
  - Institute for Systems Biology (ISB), Seattle, Washington 98109, United States
- Qi Sun
  - Computational Biology Service Unit, Cornell University, Ithaca, New York 14853, United States
- Eric W Deutsch
  - Institute for Systems Biology (ISB), Seattle, Washington 98109, United States
4
Hendricks NG, Bhosale SD, Keoseyan AJ, Ortiz J, Stotland A, Seyedmohammad S, Nguyen CDL, Bui JT, Moradian A, Mockus SM, Van Eyk JE. An Inflection Point in High-Throughput Proteomics with Orbitrap Astral: Analysis of Biofluids, Cells, and Tissues. J Proteome Res 2024; 23:4163-4169. [PMID: 39163279 PMCID: PMC11385373 DOI: 10.1021/acs.jproteome.4c00384] [Indexed: 08/22/2024]
Abstract
This Technical Note presents a comprehensive proteomics workflow for the new combination of Orbitrap and Astral mass analyzers across biofluids, cells, and tissues. Central to our workflow is the integration of Adaptive Focused Acoustics (AFA) technology for cell and tissue lysis to ensure robust and reproducible sample preparation in a high-throughput manner. Furthermore, we automated the detergent-compatible single-pot, solid-phase-enhanced sample preparation (SP3) method for protein digestion. The synergy of these advanced methodologies facilitates a robust and high-throughput approach for cell and tissue analysis, an important consideration in translational research. This work disseminates our platform workflow, analyzes its effectiveness, demonstrates the reproducibility of the results, and highlights the potential of these technologies in biomarker discovery and disease pathology. For proteomic analysis of cells and tissues (heart, liver, lung, and intestine) in data-independent acquisition mode, identifications exceeding 10,000 proteins can be achieved with a 24 min active gradient. In 200 ng injections of HeLa digest across multiple gradients, an average of more than 80% of proteins have a CV less than 20%, and a 45 min run covers ∼90% of the expressed proteome. This complete workflow allows for large swaths of the proteome to be identified and is compatible with diverse sample types.
Affiliation(s)
- Nathan G Hendricks
  - Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90211, United States
- Santosh D Bhosale
  - Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90211, United States
- Angel J Keoseyan
  - Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90211, United States
- Josselin Ortiz
  - Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90211, United States
- Aleksandr Stotland
  - Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Saeed Seyedmohammad
  - Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Chi D L Nguyen
  - Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90211, United States
- Jonathan T Bui
  - Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90211, United States
- Annie Moradian
  - Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90211, United States
- Susan M Mockus
  - Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90211, United States
- Jennifer E Van Eyk
  - Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90211, United States
  - Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
5
Salz R, Vorsteveld EE, van der Made CI, Kersten S, Stemerdink M, Riepe TV, Hsieh TH, Mhlanga M, Netea MG, Volders PJ, Hoischen A, ’t Hoen PA. Multi-omic profiling of pathogen-stimulated primary immune cells. iScience 2024; 27:110471. [PMID: 39091463 PMCID: PMC11293528 DOI: 10.1016/j.isci.2024.110471] [Received: 11/08/2023] [Revised: 04/23/2024] [Accepted: 07/04/2024] [Indexed: 08/04/2024] Open
Abstract
We performed long-read transcriptome and proteome profiling of pathogen-stimulated peripheral blood mononuclear cells (PBMCs) from healthy donors to discover new transcript and protein isoforms expressed during immune responses to diverse pathogens. Long-read transcriptome profiling reveals novel sequences and isoform switching induced upon pathogen stimulation, including transcripts that are difficult to detect using traditional short-read sequencing. Widespread loss of intron retention occurs as a common result of all pathogen stimulations. We highlight novel transcripts of NFKB1 and CASP1 that may indicate novel immunological mechanisms. RNA expression differences did not result in differences in the amounts of secreted proteins. Clustering analysis of secreted proteins revealed a correlation between chemokine (receptor) expression on the RNA and protein levels in C. albicans- and poly(I:C)-stimulated PBMCs. Isoform-aware long-read sequencing of pathogen-stimulated immune cells highlights the potential of these methods to identify novel transcripts, revealing a more complex transcriptome landscape than previously appreciated.
Affiliation(s)
- Renee Salz
  - Department of Medical BioSciences, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
  - RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Emil E. Vorsteveld
  - RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
  - Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Caspar I. van der Made
  - RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
  - Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
  - Department of Internal Medicine and Radboud Centre for Infectious Diseases (RCI), Radboud University Medical Centre, 6525 GA Nijmegen, the Netherlands
- Simone Kersten
  - RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
  - Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Merel Stemerdink
  - RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
  - Department of Otorhinolaryngology, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Tabea V. Riepe
  - Department of Medical BioSciences, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
  - RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
  - Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
- Tsung-han Hsieh
  - Department of Cell Biology, Radboud University, 6500 HB Nijmegen, the Netherlands
- Musa Mhlanga
  - RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
  - Department of Cell Biology, Radboud University, 6500 HB Nijmegen, the Netherlands
- Mihai G. Netea
  - Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
  - Department of Internal Medicine and Radboud Centre for Infectious Diseases (RCI), Radboud University Medical Centre, 6525 GA Nijmegen, the Netherlands
- Pieter-Jan Volders
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
  - Laboratory of Molecular Diagnostics, Department of Clinical Biology, Jessa Hospital, 3500 Hasselt, Belgium
- Alexander Hoischen
  - RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
  - Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
  - Department of Internal Medicine and Radboud Centre for Infectious Diseases (RCI), Radboud University Medical Centre, 6525 GA Nijmegen, the Netherlands
- Peter A.C. ’t Hoen
  - Department of Medical BioSciences, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
  - RadboudUMC Research Institute for Medical Innovation, Radboud University Medical Center, 6525 GA Nijmegen, the Netherlands
6
Marassi V, La Rocca G, Placci A, Muntiu A, Vincenzoni F, Vitali A, Desiderio C, Maraldi T, Beretti F, Russo E, Miceli V, Conaldi PG, Papait A, Romele P, Cargnoni A, Silini AR, Alviano F, Parolini O, Giordani S, Zattoni A, Reschiglian P, Roda B. Native characterization and QC profiling of human amniotic mesenchymal stromal cell vesicular fractions for secretome-based therapy. Talanta 2024; 276:126216. [PMID: 38761653 DOI: 10.1016/j.talanta.2024.126216] [Received: 12/18/2023] [Revised: 04/09/2024] [Accepted: 05/05/2024] [Indexed: 05/20/2024]
Abstract
Human amniotic mesenchymal stromal cells (hAMSCs) have unique immunomodulatory properties, making them attractive candidates for regenerative applications in inflammatory diseases. Most of their beneficial properties are mediated through their secretome. The bioactive factors contributing to its therapeutic activity are still unknown. Evidence suggests synergy between the two main components of the secretome, soluble factors and vesicular fractions, pivotal in shifting inflammation and promoting self-healing. Biological variability and the absence of quality control (QC) protocols hinder the translation of secretome-based therapy to clinical applications. Moreover, the vesicular secretome contains a multitude of particles with varying sizes, cargos and functions, whose complexity hinders full characterization and comprehension. This study achieved a significant advancement in secretome characterization by utilizing native, FFF-based separation and characterizing extracellular vesicles derived from hAMSCs. This was accomplished by obtaining dimensionally homogeneous fractions, which were then characterized based on their protein content, potentially enabling the identification of subpopulations with diverse functionalities. This method proved to be successful as an independent technique for secretome profiling, with the potential to contribute to the standardization of a qualitative method. Additionally, it served as a preparative separation tool, streamlining populations before ELISA and LC-MS characterization. This approach facilitated the categorization of distinctive and recurring proteins, along with the identification of clusters associated with vesicle activity and functions. However, the presence of proteins unique to each fraction obtained through the FFF separation tool presents a challenge for further analysis of the protein content within these cargoes.
Affiliation(s)
- Valentina Marassi
  - Department of Chemistry G. Ciamician, University of Bologna, Italy; byFlow srl, Bologna, Italy
- Giampiero La Rocca
  - Department of Biomedicine, Neurosciences and Advanced Diagnostics, University of Palermo, 90127, Palermo, Italy
- Anna Placci
  - Department of Chemistry G. Ciamician, University of Bologna, Italy
- Alexandra Muntiu
  - Istituto di Scienze e Tecnologie Chimiche "Giulio Natta", Consiglio Nazionale delle Ricerche, 00168, Rome, Italy
- Federica Vincenzoni
  - Dipartimento di Scienze Biotecnologiche di Base, Cliniche Intensivologiche e Perioperatorie, Università Cattolica del Sacro Cuore, 00168, Rome, Italy; Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168, Rome, Italy
- Alberto Vitali
  - Istituto di Scienze e Tecnologie Chimiche "Giulio Natta", Consiglio Nazionale delle Ricerche, 00168, Rome, Italy
- Claudia Desiderio
  - Istituto di Scienze e Tecnologie Chimiche "Giulio Natta", Consiglio Nazionale delle Ricerche, 00168, Rome, Italy
- Tullia Maraldi
  - Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, 41125, Modena, Italy
- Francesca Beretti
  - Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, 41125, Modena, Italy
- Eleonora Russo
  - Department of Biomedicine, Neurosciences and Advanced Diagnostics, University of Palermo, 90127, Palermo, Italy
- Vitale Miceli
  - Research Department, IRCCS ISMETT (Istituto Mediterraneo per i Trapianti e Terapie ad alta Specializzazione), 90127, Palermo, Italy
- Pier Giulio Conaldi
  - Research Department, IRCCS ISMETT (Istituto Mediterraneo per i Trapianti e Terapie ad alta Specializzazione), 90127, Palermo, Italy
- Andrea Papait
  - Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168, Rome, Italy; Department of Life Science and Public Health, Università Cattolica del Sacro Cuore, 00168, Rome, Italy
- Pietro Romele
  - Centro di Ricerca E. Menni, Fondazione Poliambulanza Istituto Ospedaliero, 25124, Brescia, Italy
- Anna Cargnoni
  - Centro di Ricerca E. Menni, Fondazione Poliambulanza Istituto Ospedaliero, 25124, Brescia, Italy
- Antonietta Rosa Silini
  - Centro di Ricerca E. Menni, Fondazione Poliambulanza Istituto Ospedaliero, 25124, Brescia, Italy
- Francesco Alviano
  - Department of Biomedical and Neuromotor Science, University of Bologna, Bologna, Italy
- Ornella Parolini
  - Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168, Rome, Italy; Department of Life Science and Public Health, Università Cattolica del Sacro Cuore, 00168, Rome, Italy
- Stefano Giordani
  - Department of Chemistry G. Ciamician, University of Bologna, Italy
- Andrea Zattoni
  - Department of Chemistry G. Ciamician, University of Bologna, Italy; byFlow srl, Bologna, Italy
- Pierluigi Reschiglian
  - Department of Chemistry G. Ciamician, University of Bologna, Italy; byFlow srl, Bologna, Italy
- Barbara Roda
  - Department of Chemistry G. Ciamician, University of Bologna, Italy; byFlow srl, Bologna, Italy
7
Korchak JA, Jeffery ED, Bandyopadhyay S, Jordan BT, Lehe MD, Watts EF, Fenix A, Wilhelm M, Sheynkman GM. IS-PRM-Based Peptide Targeting Informed by Long-Read Sequencing for Alternative Proteome Detection. J Am Soc Mass Spectrom 2024. [PMID: 39012054 DOI: 10.1021/jasms.4c00119] [Indexed: 07/17/2024]
Abstract
Alternative splicing is a major contributor to transcriptomic complexity, but the extent to which transcript isoforms are translated into stable, functional protein isoforms is unclear. Furthermore, detection of relatively scarce isoform-specific peptides is challenging, with many protein isoforms remaining uncharted due to technical limitations. Recently, a family of advanced targeted MS strategies, termed internal standard parallel reaction monitoring (IS-PRM), has demonstrated multiplexed, sensitive detection of predefined peptides of interest. Such approaches have not yet been used to confirm the existence of novel peptides. Here, we present a targeted proteogenomic approach that leverages sample-matched long-read RNA sequencing (lrRNA-seq) data to predict potential protein isoforms with prior transcript evidence. Predicted tryptic isoform-specific peptides, which are specific to individual gene product isoforms, serve as "triggers" and "targets" in the IS-PRM method, Tomahto. Using the model human stem cell line WTC11, lrRNA-seq data were generated and used to inform the generation of synthetic standards for 192 isoform-specific peptides (114 isoforms from 55 genes). These synthetic "trigger" peptides were labeled with super heavy tandem mass tags (TMT) and spiked into TMT-labeled WTC11 tryptic digest, predicted to contain corresponding endogenous "target" peptides. Compared to DDA mode, Tomahto increased detectability of isoforms by 3.6-fold, resulting in the identification of five previously unannotated isoforms. Our method detected protein isoform expression for 43 out of 55 genes corresponding to 54 resolved isoforms. This lrRNA-seq-informed Tomahto targeted approach is a new modality for generating protein-level evidence of alternative isoforms, a critical first step in designing functional studies and eventually clinical assays.
Affiliation(s)
- Jennifer A Korchak
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
- Erin D Jeffery
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
- Saikat Bandyopadhyay
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
  - Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia 22903, United States
- Ben T Jordan
  - Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21701, United States
- Micah D Lehe
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
- Emily F Watts
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
- Aidan Fenix
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, Washington 98195, United States
- Mathias Wilhelm
  - Computational Mass Spectrometry, Technical University of Munich (TUM), D-85354 Freising, Germany
- Gloria M Sheynkman
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22903, United States
  - Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, Virginia 22903, United States
  - UVA Comprehensive Cancer Center, University of Virginia, Charlottesville, Virginia 22903, United States
8
Wei Q, Li J, He QY, Chen Y, Zhang G. Identifying PE2 and PE5 Proteins from Existing Mass Spectrometry Data Using pFind. J Proteome Res 2024; 23:2323-2331. [PMID: 38865581 DOI: 10.1021/acs.jproteome.3c00674] [Indexed: 06/14/2024]
Abstract
The Chromosome-Centric Human Proteome Project (C-HPP) aims to identify all proteins encoded by the human genome. Currently, the human proteome still contains approximately 2000 PE2-PE5 proteins, referring to annotated coding genes that lack sufficient protein-level evidence. During the past 10 years, it has been increasingly difficult to identify PE2-PE5 proteins in C-HPP approaches due to their limited occurrence. Therefore, we proposed that reanalyzing massive MS data sets in repositories with newly developed algorithms may increase the occurrence of the peptides of these proteins. In this study, we downloaded 1000 MS data sets via the ProteomeXchange database. Using pFind software, we identified peptides referring to 1788 PE2-PE5 proteins. Among them, 11 PE2 and 16 PE5 proteins were identified with at least 2 peptides, and 12 of them were identified using 2 peptides in a single data set, following the criteria of the HPP guidelines. We found translation evidence for 16 of these 27 proteins (11 PE2 and 16 PE5) in our RNC-seq data, supporting their existence. The properties of the PE2 and PE5 proteins were similar to those of the PE1 proteins. Our approach demonstrated that mining PE2 and PE5 proteins in massive data repositories is still worthwhile, and multidata set peptide identifications may support the presence of PE2 and PE5 proteins or at least prompt additional studies for validation. Extremely high throughput could be a solution to finding more PE2 and PE5 proteins.
Affiliation(s)
- Qianzhou Wei
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou 510632, China
- Jiamin Li
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou 510632, China
- Qing-Yu He
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou 510632, China
- Yang Chen
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou 510632, China
- Gong Zhang
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou 510632, China
9
Prisby R, Luchini A, Liotta LA, Solazzo C. Wheat-Based Glues in Conservation and Cultural Heritage: (Dis)solving the Proteome of Flour and Starch Pastes and Their Adhering Properties. J Proteome Res 2024; 23:1649-1665. [PMID: 38574199 PMCID: PMC11077587 DOI: 10.1021/acs.jproteome.3c00804] [Received: 11/20/2023] [Revised: 02/24/2024] [Accepted: 03/22/2024] [Indexed: 04/06/2024]
Abstract
Plant-based adhesives, such as those made from wheat, have been prominently used for books and paper-based objects and are also used as conservation adhesives. Starch paste originates from starch granules, whereas flour paste encompasses the entire wheat endosperm proteome, offering strong adhesive properties due to gluten proteins. From a conservation perspective, understanding the precise nature of the adhesive is vital as the longevity, resilience, and reaction to environmental changes can differ substantially between starch- and flour-based pastes. We devised a proteomics method to discern the protein content of these pastes. Protocols involved extracting soluble proteins using 0.5 M NaCl and 30 mM Tris-HCl solutions and then targeting insoluble proteins, such as gliadins and glutenins, with a buffer containing 7 M urea, 2 M thiourea, 4% CHAPS, 40 mM Tris, and 75 mM DTT. Flour paste's proteome is diverse (1942 proteins across 759 groups), contrasting with starch paste's predominant starch-associated protein makeup (218 proteins in 58 groups). Transformation into pastes reduces the proteomes' complexity. Testing on historical bookbindings confirmed the use of flour-based glue, which is rich in gluten and serpins. High levels of deamidation were detected, particularly for glutamine residues, which can impact the solubility and stability of the glue over time. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the MassIVE partner repository with the data set identifier MSV000093372 (ftp://MSV000093372@massive.ucsd.edu).
Affiliation(s)
- Rocio Prisby
- Center for Applied Proteomics and Molecular Medicine, George Mason University, 10920 George Mason Circle, MSN 1A9, Manassas, Virginia 20110, United States
| | - Alessandra Luchini
- Center for Applied Proteomics and Molecular Medicine, George Mason University, 10920 George Mason Circle, MSN 1A9, Manassas, Virginia 20110, United States
| | - Lance A. Liotta
- Center for Applied Proteomics and Molecular Medicine, George Mason University, 10920 George Mason Circle, MSN 1A9, Manassas, Virginia 20110, United States
| | - Caroline Solazzo
- Independent Researcher for Museum Conservation Institute, Smithsonian Institution, 4210 Silver Hill Road, Suitland, Maryland 20746, United States
|
10
|
Hendricks NG, Bhosale SD, Keoseyan AJ, Ortiz J, Stotland A, Seyedmohammad S, Nguyen CDL, Bui J, Moradian A, Mockus SM, Van Eyk JE. An inflection point in high-throughput proteomics with Orbitrap Astral: analysis of biofluids, cells, and tissues. bioRxiv 2024:2024.04.26.591396. [PMID: 38712179 PMCID: PMC11071456 DOI: 10.1101/2024.04.26.591396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024]
Abstract
This technical note presents a comprehensive proteomics workflow for the new combination of Orbitrap and Astral mass analyzers across biofluids, cells, and tissues. Central to our workflow is the integration of Adaptive Focused Acoustics (AFA) technology for cell and tissue lysis, to ensure robust and reproducible sample preparation in a high-throughput manner. Furthermore, we automated the detergent-compatible single-pot, solid-phase-enhanced sample preparation (SP3) method for protein digestion, a technique that streamlines the process by combining purification and digestion steps, thereby reducing sample loss and improving efficiency. The synergy of these advanced methodologies facilitates a robust and high-throughput approach for cell and tissue analysis, an important consideration in translational research. This work disseminates our platform workflow, analyzes its effectiveness, demonstrates reproducibility of the results, and highlights the potential of these technologies in biomarker discovery and disease pathology. For proteomics analysis of cells and tissues (heart, liver, lung, and intestine) in data-independent acquisition mode, identifications exceeding 10,000 proteins can be achieved with a 24-minute active gradient. In 200 ng injections of HeLa digest across multiple gradients, an average of more than 80% of proteins have a CV less than 20%, and a 45-minute run covers ~90% of the expressed proteome. In plasma samples including naive, depleted, perchloric acid precipitated, and Seer nanoparticle captured, all with a 24-minute gradient length, we identified 87, 108, 96 and 137 out of 216 FDA approved circulating protein biomarkers, respectively. This complete workflow allows for large swaths of the proteome to be identified and is compatible across diverse sample types.
Affiliation(s)
- Nathan G. Hendricks
- Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90210, United States
| | - Santosh D. Bhosale
- Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90210, United States
| | - Angel J. Keoseyan
- Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90210, United States
| | - Josselin Ortiz
- Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90210, United States
| | - Aleksandr Stotland
- Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Saeed Seyedmohammad
- Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
| | - Chi D. L. Nguyen
- Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90210, United States
| | - Jonathan Bui
- Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90210, United States
| | - Annie Moradian
- Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90210, United States
| | - Susan M. Mockus
- Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90210, United States
| | - Jennifer E Van Eyk
- Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Beverly Hills, California 90210, United States
- Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
|
11
|
Omenn GS, Lane L, Overall CM, Lindskog C, Pineau C, Packer NH, Cristea IM, Weintraub ST, Orchard S, Roehrl MHA, Nice E, Guo T, Van Eyk JE, Liu S, Bandeira N, Aebersold R, Moritz RL, Deutsch EW. The 2023 Report on the Proteome from the HUPO Human Proteome Project. J Proteome Res 2024; 23:532-549. [PMID: 38232391 PMCID: PMC11026053 DOI: 10.1021/acs.jproteome.3c00591] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2024]
Abstract
Since 2010, the Human Proteome Project (HPP), the flagship initiative of the Human Proteome Organization (HUPO), has pursued two goals: (1) to credibly identify the protein parts list and (2) to make proteomics an integral part of multiomics studies of human health and disease. The HPP relies on international collaboration, data sharing, standardized reanalysis of MS data sets by PeptideAtlas and MassIVE-KB using HPP Guidelines for quality assurance, integration and curation of MS and non-MS protein data by neXtProt, plus extensive use of antibody profiling carried out by the Human Protein Atlas. According to the neXtProt release 2023-04-18, protein expression has now been credibly detected (PE1) for 18,397 of the 19,778 neXtProt predicted proteins coded in the human genome (93%). Of these PE1 proteins, 17,453 were detected with mass spectrometry (MS) in accordance with HPP Guidelines and 944 by a variety of non-MS methods. The number of neXtProt PE2, PE3, and PE4 missing proteins now stands at 1381. Achieving the unambiguous identification of 93% of predicted proteins encoded from across all chromosomes represents remarkable experimental progress on the Human Proteome parts list. Meanwhile, there are several categories of predicted proteins that have proved resistant to detection regardless of protein-based methods used. Additionally there are some PE1-4 proteins that probably should be reclassified to PE5, specifically 21 LINC entries and ∼30 HERV entries; these are being addressed in the present year. Applying proteomics in a wide array of biological and clinical studies ensures integration with other omics platforms as reported by the Biology and Disease-driven HPP teams and the antibody and pathology resource pillars. Current progress has positioned the HPP to transition to its Grand Challenge Project focused on determining the primary function(s) of every protein itself and in networks and pathways within the context of human health and disease.
Affiliation(s)
- Gilbert S. Omenn
- University of Michigan, Ann Arbor, Michigan 48109, United States
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Lydie Lane
- CALIPHO Group, SIB Swiss Institute of Bioinformatics and University of Geneva, 1015 Lausanne, Switzerland
| | - Christopher M. Overall
- University of British Columbia, Vancouver, BC V6T 1Z4, Canada; Yonsei University, Republic of Korea
| | | | - Charles Pineau
- University Rennes, Inserm U1085, Irset, 35042 Rennes, France
| | | | | | - Susan T. Weintraub
- University of Texas Health Science Center-San Antonio, San Antonio, Texas 78229-3900, United States
| | | | - Michael H. A. Roehrl
- Department of Pathology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, United States
| | | | - Tiannan Guo
- Westlake Center for Intelligent Proteomics, Westlake Laboratory, Westlake University, Hangzhou 310024, Zhejiang Province, China
| | - Jennifer E. Van Eyk
- Advanced Clinical Biosystems Research Institute, Smidt Heart Institute, Cedars-Sinai Medical Center, 127 South San Vicente Boulevard, Pavilion, 9th Floor, Los Angeles, CA, 90048, United States
| | - Siqi Liu
- BGI Group, Shenzhen 518083, China
| | - Nuno Bandeira
- University of California, San Diego, La Jolla, CA, 92093, United States
| | - Ruedi Aebersold
- Institute of Molecular Systems Biology in ETH Zurich, 8092 Zurich, Switzerland
- University of Zurich, 8092 Zurich, Switzerland
| | - Robert L. Moritz
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Eric W. Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
|
12
|
Cao X, Sun S, Xing J. A Massive Proteogenomic Screen Identifies Thousands of Novel Peptides From the Human "Dark" Proteome. Mol Cell Proteomics 2024; 23:100719. [PMID: 38242438 PMCID: PMC10867589 DOI: 10.1016/j.mcpro.2024.100719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Revised: 01/01/2024] [Accepted: 01/16/2024] [Indexed: 01/21/2024] Open
Abstract
Although the human gene annotation has been continuously improved over the past 2 decades, numerous studies demonstrated the existence of a "dark proteome", consisting of proteins that were critical for biological processes but not included in widely used gene catalogs. The Genotype-Tissue Expression project generated more than 15,000 RNA-seq datasets from multiple tissues, which modeled 30 million transcripts in the human genome. To provide a resource of high-confidence novel proteins from the dark proteome, we screened 50,000 mass spectrometry runs from over 900 projects to identify proteins translated from the Genotype-Tissue Expression transcript model with proteomic support. We also integrated 3.8 million common genetic variants from the gnomAD database to improve peptide identification. As a result, we identified 170,529 novel peptides with proteomic evidence, of which 6048 passed the strictest standard we defined and were supported by PepQuery. We provided a user-friendly website (https://ncorf.genes.fun/) for researchers to check the evidence of novel peptides from their studies. The findings will improve our understanding of coding genes and facilitate genomic data interpretation in biomedical research.
Affiliation(s)
- Xiaolong Cao
- Department of Anesthesiology, Zhujiang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Siqi Sun
- Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Jinchuan Xing
- Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA.
|
13
|
Provencher N, Leblanc S, Jacques JF, Roucou X. Exploring the Alternative Proteome with OpenProt and Mass Spectrometry. Methods Mol Biol 2024; 2836:3-17. [PMID: 38995532 DOI: 10.1007/978-1-0716-4007-4_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/13/2024]
Abstract
Proteogenomics has revealed the translation of unannotated open reading frames (ORFs) present in mRNAs and in noncoding RNAs (ncRNAs). OpenProt annotates all ORFs with a minimum of 30 codons in the transcriptome of several species and displays many functional features associated with the corresponding proteins. Two types of proteins are annotated: reference or canonical proteins which are proteins already annotated in UniProt, RefSeq, or Ensembl and noncanonical proteins. Noncanonical proteins form two groups: predicted novel isoforms that display a significant level of homology with a reference protein and alternative proteins that are new proteins with no significant homology to known proteins. This chapter describes how to check whether a gene and/or transcript contains multiple open reading frames and how to use OpenProt databases for the detection of alternative proteins and novel isoforms by mass spectrometry-based proteomics.
Affiliation(s)
- Nicolas Provencher
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Sébastien Leblanc
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Jean-François Jacques
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC, Canada
| | - Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC, Canada.
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke (CRCHUS), Sherbrooke, QC, Canada.
|
14
|
Gordeeva AI, Valueva AA, Rybakova EE, Ershova MO, Shumov ID, Kozlov AF, Ziborov VS, Kozlova AS, Zgoda VG, Ivanov YD, Ilgisonis EV, Kiseleva OI, Ponomarenko EA, Lisitsa AV, Archakov AI, Pleshakova TO. MS Identification of Blood Plasma Proteins Concentrated on a Photocrosslinker-Modified Surface. Int J Mol Sci 2023; 25:409. [PMID: 38203578 PMCID: PMC10778900 DOI: 10.3390/ijms25010409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Revised: 12/14/2023] [Accepted: 12/22/2023] [Indexed: 01/12/2024] Open
Abstract
This work demonstrates the use of a modified mica to concentrate proteins, which is required for proteomic profiling of blood plasma by mass spectrometry (MS). The surface of mica substrates, which are routinely used in atomic force microscopy (AFM), was modified with a photocrosslinker to allow "irreversible" binding of proteins via covalent bond formation. This modified substrate was called the AFM chip. This study aimed to determine the role of the surface and crosslinker in the efficient concentration of various types of proteins in plasma over a wide concentration range. The substrate surface was modified with a 4-benzoylbenzoic acid N-succinimidyl ester (SuccBB) photocrosslinker, activated by UV irradiation. AFM chips were incubated with plasma samples from a healthy volunteer at various dilution ratios (10²X, 10⁴X, and 10⁶X). Control experiments were performed without UV irradiation to evaluate the contribution of physical protein adsorption to the concentration efficiency. AFM imaging confirmed the presence of protein layers on the chip surface after incubation with the samples. MS analysis of different samples indicated that the proteomic profile of the AFM-visualized layers contained common and unique proteins. In the working series of experiments, 228 proteins were identified on the chip surface for all samples, and 21 proteins were not identified in the control series. In the control series, a total of 220 proteins were identified on the chip surface, seven of which were not found in the working series. In plasma samples at various dilution ratios, a total of 146 proteins were identified without the concentration step, while 17 proteins were not detected in the series using AFM chips. The introduction of a concentration step using AFM chips allowed us to identify more proteins than in plasma samples without this step.
We found that AFM chips with a modified surface facilitate the efficient concentration of proteins owing to the adsorption factor and the formation of covalent bonds between the proteins and the chip surface. The results of our study can be applied in the development of highly sensitive analytical systems for determining the complete composition of the plasma proteome.
Affiliation(s)
- Ivan D. Shumov
- Institute of Biomedical Chemistry (IBMC), 119121 Moscow, Russia; (A.I.G.); (A.A.V.); (E.E.R.); (M.O.E.); (A.F.K.); (V.S.Z.); (A.S.K.); (V.G.Z.); (Y.D.I.); (E.V.I.); (O.I.K.); (E.A.P.); (A.V.L.); (A.I.A.); (T.O.P.)
|
15
|
Wacholder A, Carvunis AR. Biological factors and statistical limitations prevent detection of most noncanonical proteins by mass spectrometry. PLoS Biol 2023; 21:e3002409. [PMID: 38048358 PMCID: PMC10721188 DOI: 10.1371/journal.pbio.3002409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2023] [Revised: 12/14/2023] [Accepted: 10/30/2023] [Indexed: 12/06/2023] Open
Abstract
Ribosome profiling experiments indicate pervasive translation of short open reading frames (ORFs) outside of annotated protein-coding genes. However, shotgun mass spectrometry (MS) experiments typically detect only a small fraction of the predicted protein products of this noncanonical translation. The rarity of detection could indicate that most predicted noncanonical proteins are rapidly degraded and not present in the cell; alternatively, it could reflect technical limitations. Here, we leveraged recent advances in ribosome profiling and MS to investigate the factors limiting detection of noncanonical proteins in yeast. We show that the low detection rate of noncanonical ORF products can largely be explained by small size and low translation levels and does not indicate that they are unstable or biologically insignificant. In particular, proteins encoded by evolutionarily young genes, including those with well-characterized biological roles, are too short and too lowly expressed to be detected by shotgun MS at current detection sensitivities. Additionally, we find that decoy biases can give misleading estimates of noncanonical protein false discovery rates, potentially leading to false detections. After accounting for these issues, we found strong evidence for 4 noncanonical proteins in MS data, which were also supported by evolution and translation data. These results illustrate the power of MS to validate unannotated genes predicted by ribosome profiling, but also its substantial limitations in finding many biologically relevant lowly expressed proteins.
Affiliation(s)
- Aaron Wacholder
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - Anne-Ruxandra Carvunis
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
|
16
|
Muntiu A, Papait A, Vincenzoni F, Vitali A, Lattanzi W, Romele P, Cargnoni A, Silini A, Parolini O, Desiderio C. Disclosing the molecular profile of the human amniotic mesenchymal stromal cell secretome by filter-aided sample preparation proteomic characterization. Stem Cell Res Ther 2023; 14:339. [PMID: 38012707 PMCID: PMC10683150 DOI: 10.1186/s13287-023-03557-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Accepted: 10/30/2023] [Indexed: 11/29/2023] Open
Abstract
BACKGROUND: The secretome of mesenchymal stromal cells isolated from the amniotic membrane (hAMSCs) has been extensively studied for its in vitro immunomodulatory activity as well as for the treatment of several preclinical models of immune-related disorders. The bioactive molecules within the hAMSCs secretome are capable of modulating the immune response and thus contribute to stimulating regenerative processes. At present, only a few studies have attempted to define the composition of the secretome, and several approaches, including multi-omics, are underway in an attempt to precisely define its composition and possibly identify key factors responsible for the therapeutic effect. METHODS: In this study, we characterized the protein composition of the hAMSCs secretome by a filter-aided sample preparation (FASP) digestion and liquid chromatography-high resolution mass spectrometry (LC-MS) approach. Data were processed for gene ontology classification and functional protein interaction analysis by bioinformatics tools. RESULTS: Proteomic analysis of the hAMSCs secretome resulted in the identification of 1521 total proteins, including 662 unique elements. A total of 157 elements, corresponding to 23.7%, were found to repeatedly characterize the hAMSCs secretome, and those that were significantly over-represented were involved in immunomodulation, hemostasis, development and remodeling of the extracellular matrix molecular pathways. CONCLUSIONS: Overall, our characterization enriches the landscape of hAMSCs with new information that could enable a better understanding of the mechanisms of action underlying the therapeutic efficacy of the hAMSCs secretome while also providing a basis for its therapeutic translation.
Affiliation(s)
- Alexandra Muntiu
- Istituto di Scienze e Tecnologie Chimiche (SCITEC) "Giulio Natta", Consiglio Nazionale delle Ricerche, Rome, Italy
| | - Andrea Papait
- Department of Life Science and Public Health, Università Cattolica del Sacro Cuore, Rome, Italy
- Fondazione Policlinico Universitario "Agostino Gemelli" Istituto di Ricovero e Cura a Carattere Scientifico, IRCCS, Rome, Italy
| | - Federica Vincenzoni
- Fondazione Policlinico Universitario "Agostino Gemelli" Istituto di Ricovero e Cura a Carattere Scientifico, IRCCS, Rome, Italy
- Dipartimento di Scienze Biotecnologiche di Base, Cliniche Intensivologiche e Perioperatorie, Università Cattolica del Sacro Cuore, Rome, Italy
| | - Alberto Vitali
- Istituto di Scienze e Tecnologie Chimiche (SCITEC) "Giulio Natta", Consiglio Nazionale delle Ricerche, Rome, Italy
| | - Wanda Lattanzi
- Department of Life Science and Public Health, Università Cattolica del Sacro Cuore, Rome, Italy
- Fondazione Policlinico Universitario "Agostino Gemelli" Istituto di Ricovero e Cura a Carattere Scientifico, IRCCS, Rome, Italy
| | - Pietro Romele
- Centro di Ricerca E. Menni, Fondazione Poliambulanza Istituto Ospedaliero, Brescia, Italy
| | - Anna Cargnoni
- Centro di Ricerca E. Menni, Fondazione Poliambulanza Istituto Ospedaliero, Brescia, Italy
| | - Antonietta Silini
- Centro di Ricerca E. Menni, Fondazione Poliambulanza Istituto Ospedaliero, Brescia, Italy
| | - Ornella Parolini
- Department of Life Science and Public Health, Università Cattolica del Sacro Cuore, Rome, Italy.
- Fondazione Policlinico Universitario "Agostino Gemelli" Istituto di Ricovero e Cura a Carattere Scientifico, IRCCS, Rome, Italy.
| | - Claudia Desiderio
- Istituto di Scienze e Tecnologie Chimiche (SCITEC) "Giulio Natta", Consiglio Nazionale delle Ricerche, Rome, Italy.
|
17
|
Wu L, Hoque A, Lam H. Spectroscape enables real-time query and visualization of a spectral archive in proteomics. Nat Commun 2023; 14:6267. [PMID: 37805652 PMCID: PMC10560257 DOI: 10.1038/s41467-023-42006-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 09/26/2023] [Indexed: 10/09/2023] Open
Abstract
In proteomics, spectral archives organize the enormous amounts of publicly available peptide tandem mass spectra by similarity, offering opportunities for error correction and novel discoveries. Here we adapt an indexing algorithm developed by Facebook for organizing online multimedia resources to tandem mass spectra and achieve practically instantaneous retrieval and clustering of approximate nearest neighbors in a large spectral archive. An interactive web-based graphical user interface enables the user to view a query spectrum in its clustered neighborhood, which facilitates contextual validation of peptide identifications and exploration of the dark proteome.
Affiliation(s)
- Long Wu
- Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
- Department of Electrical and Computer Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
| | - Ayman Hoque
- Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
| | - Henry Lam
- Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong.
|
18
|
Luo J, Deng M, Zhang X, Sun X. ESICCC as a systematic computational framework for evaluation, selection, and integration of cell-cell communication inference methods. Genome Res 2023; 33:1788-1805. [PMID: 37827697 PMCID: PMC10691505 DOI: 10.1101/gr.278001.123] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 04/22/2023] [Accepted: 09/21/2023] [Indexed: 10/14/2023]
Abstract
Cell-cell communication (CCC) is critical for determining cell fates and functions in multicellular organisms. With the advent of single-cell RNA-sequencing (scRNA-seq) and spatial transcriptomics (ST), an increasing number of CCC inference methods have been developed. Nevertheless, a thorough comparison of their performances is yet to be conducted. To fill this gap, we developed a systematic benchmark framework called ESICCC to evaluate 18 ligand-receptor (LR) inference methods and five ligand/receptor-target inference methods using a total of 116 data sets, including 15 ST data sets, 15 sets of cell line perturbation data, two sets of cell type-specific expression/proteomics data, and 84 sets of sampled or unsampled scRNA-seq data. We evaluated and compared the agreement, accuracy, robustness, and usability of these methods. Regarding accuracy evaluation, RNAMagnet, CellChat, and scSeqComm emerge as the three best-performing methods for intercellular ligand-receptor inference based on scRNA-seq data, whereas stMLnet and HoloNet are the best methods for predicting ligand/receptor-target regulation using ST data. To facilitate practical applications, we provide a decision-tree-style guideline for users to easily choose the best tools for their specific research concerns in CCC inference, and develop an ensemble pipeline, CCCbank, that enables versatile combinations of methods and databases. Moreover, our comparative results also uncover several critical influential factors for CCC inference, such as prior interaction information, ligand-receptor scoring algorithm, intracellular signaling complexity, and spatial relationship, which may be considered in future studies to advance the development of new methodologies.
Affiliation(s)
- Jiaxin Luo
- Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
- School of Mathematics, Sun Yat-sen University, Guangzhou 510275, China
| | - Minghua Deng
- School of Mathematical Sciences, Peking University, Beijing, 100871, China
| | - Xuegong Zhang
- Bioinformatics Division of BNRIST and Department of Automation, MOE Key Lab of Bioinformatics, Tsinghua University, Beijing, 100084, China
| | - Xiaoqiang Sun
- School of Mathematics, Sun Yat-sen University, Guangzhou 510275, China
|
19
|
Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Moritz RL, Deutsch EW, van Heesch S. What Can Ribo-Seq, Immunopeptidomics, and Proteomics Tell Us About the Noncanonical Proteome? Mol Cell Proteomics 2023; 22:100631. [PMID: 37572790 PMCID: PMC10506109 DOI: 10.1016/j.mcpro.2023.100631] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 05/14/2023] [Revised: 07/21/2023] [Accepted: 08/08/2023] [Indexed: 08/14/2023] Open
Abstract
Ribosome profiling (Ribo-Seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of noncanonical sites of ribosome translation outside the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7000 noncanonical ORFs are translated, which, at first glance, has the potential to expand the number of human protein CDSs by 30%, from ∼19,500 annotated CDSs to over 26,000 annotated CDSs. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to conventional understanding of the term. Adding further complication is the fact that published estimates of noncanonical ORFs vary widely by around 30-fold, from several thousand to several hundred thousand. The summation of this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome but searching for guidance on how to proceed. Here, we discuss the current state of noncanonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein coding."
Affiliation(s)
- John R Prensner
- Division of Pediatric Hematology/Oncology, Department of Pediatrics, University of Michigan Medical School, Ann Arbor, Michigan, USA; Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, Michigan, USA
- Leron W Kok
- Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
- Karl R Clauser
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Jonathan M Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
- Jorge Ruiz-Orera
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
- Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, Agora Center Bugnon 25A, University of Lausanne, Lausanne, Switzerland; Department of Oncology, Centre Hospitalier Universitaire Vaudois (CHUV), Lausanne, Switzerland; Agora Cancer Research Centre, Lausanne, Switzerland
- Robert L Moritz
- Institute for Systems Biology (ISB), Seattle, Washington, USA
- Eric W Deutsch
- Institute for Systems Biology (ISB), Seattle, Washington, USA
20
Notari S, Gambardella G, Vincenzoni F, Desiderio C, Castagnola M, Bocedi A, Ricci G. The unusual properties of lactoferrin during its nascent phase. Sci Rep 2023; 13:14113. [PMID: 37644064 PMCID: PMC10465537 DOI: 10.1038/s41598-023-41064-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/18/2023] [Accepted: 08/21/2023] [Indexed: 08/31/2023] Open
Abstract
Lactoferrin, a multifunctional iron-binding protein containing 16 disulfides, is actively studied for its antibacterial and anti-carcinogenic properties. However, little information is currently available about its oxidative folding starting from the reduced and unfolded state. This study reveals unusual properties when this protein is examined in its reduced molten globule-like conformation. Using kinetic, CD and fluorescence analyses together with mass spectrometry, we found that a few cysteines display astonishing hyper-reactivity toward different thiol reagents. In detail, four cysteines (i.e. 668, 64, 512 and 424) display thousands of times higher reactivity toward GSSG but normal reactivity toward other natural disulfides. The formation of these four mixed-disulfides with glutathione probably represents the first step of its folding in vivo. A widespread low pKa decreases the reactivity of other 14 cysteines toward GSSG, limiting their involvement in the early phase of the oxidative folding. This hyper-reactivity originated from a transient lactoferrin-GSSG complex, as supported by fluorescence experiments. Lactoferrin represents another disulfide-containing protein, in addition to albumin, lysozyme, ribonuclease, chymotrypsinogen, and trypsinogen, which shows cysteines with an extraordinary and specific hyper-reactivity toward GSSG, confirming the discovery of a fascinating new feature of proteins in their nascent phase.
Affiliation(s)
- Sara Notari
- Dipartimento di Scienze e Tecnologie Chimiche, Università di Roma "Tor Vergata", Rome, Italy
- Giorgia Gambardella
- Dipartimento di Scienze e Tecnologie Chimiche, Università di Roma "Tor Vergata", Rome, Italy
- Federica Vincenzoni
- Dipartimento di Scienze biotecnologiche di Base, cliniche intensivologiche e perioperatorie, Università Cattolica del Sacro Cuore, Rome, Italy
- Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Claudia Desiderio
- Istituto di Scienze e Tecnologie Chimiche "Giulio Natta", Consiglio Nazionale delle Ricerche, Rome, Italy
- Massimo Castagnola
- Laboratorio di Proteomica, Centro Europeo di Ricerca sul Cervello, IRCCS Fondazione Santa Lucia, Rome, Italy
- Alessio Bocedi
- Dipartimento di Scienze e Tecnologie Chimiche, Università di Roma "Tor Vergata", Rome, Italy
- Giorgio Ricci
- Dipartimento di Scienze e Tecnologie Chimiche, Università di Roma "Tor Vergata", Rome, Italy
21
Bowler-Barnett EH, Fan J, Luo J, Magrane M, Martin MJ, Orchard S. UniProt and Mass Spectrometry-Based Proteomics-A 2-Way Working Relationship. Mol Cell Proteomics 2023; 22:100591. [PMID: 37301379 PMCID: PMC10404557 DOI: 10.1016/j.mcpro.2023.100591] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/30/2023] [Revised: 05/20/2023] [Accepted: 06/07/2023] [Indexed: 06/12/2023] Open
Abstract
The human proteome comprises all of the proteins produced by the sequences translated from the human genome, with additional modifications in both sequence and function caused by nonsynonymous variants and posttranslational modifications, including cleavage of the initial transcript into smaller peptides and polypeptides. The UniProtKB database (www.uniprot.org) is the world's leading high-quality, comprehensive and freely accessible resource of protein sequence and functional information and presents a summary of experimentally verified, or computationally predicted, functional information added by our expert biocuration team for each protein in the proteome. Researchers in the field of mass spectrometry-based proteomics both consume and add to the body of data available in UniProtKB, and this review highlights the information we provide to this community and the knowledge we in turn obtain from groups via deposition of large-scale datasets in public domain databases.
Affiliation(s)
- E H Bowler-Barnett
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
- J Fan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
- J Luo
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
- M Magrane
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
- M J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
- S Orchard
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
22
Chen Z, Yang K, Zhang J, Ren S, Chen H, Guo J, Cui Y, Wang T, Wang M. Systems crosstalk between antiviral response and cancerous pathways via extracellular vesicles in HIV-1-associated colorectal cancer. Comput Struct Biotechnol J 2023; 21:3369-3382. [PMID: 37389186 PMCID: PMC10300105 DOI: 10.1016/j.csbj.2023.06.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Revised: 05/30/2023] [Accepted: 06/10/2023] [Indexed: 07/01/2023] Open
Abstract
HIV-1-associated colorectal cancer (HA-CRC) is one of the most understudied non-AIDS-defining cancers. In this study, we analyzed the proteome of HA-CRC and the paired remote tissues (HA-RT) through data-independent acquisition mass spectrometry (MS). The quantified proteins could differentiate the HA-CRC and HA-RT groups per PCA or cluster analyses. As a background comparison, we reanalyzed the MS data of non-HIV-1-infected CRC (non-HA-CRC) published by CPTAC. According to the GSEA results, we found that HA-CRC and non-HA-CRC shared similarly over-represented KEGG pathways. Hallmark analysis suggested that terms of antiviral response were only significantly enriched in HA-CRC. The network and molecular system analysis centered on the crosstalk of IFN-associated antiviral response and cancerous pathways, which was favored by significant up-regulation of ISGylated proteins as detected in the HA-CRC tissues. We further proved that defective HIV-1 reservoir cells, as represented by the 8E5 cells, could activate the IFN pathway in human macrophages via horizontal transfer of cell-associated HIV-1 RNA (CA-HIV RNA) carried by extracellular vesicles (EVs). In conclusion, EVs that are secreted by HIV-1 reservoir cells and contain CA-HIV RNA can induce IFN pathway activation in macrophages, which offers one mechanistic explanation of the systems crosstalk between antiviral response and cancerous pathways in HA-CRC.
Affiliation(s)
- Zimei Chen
- The First Affiliated Hospital, MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, Guangdong 510632, China
- Department of Infectious Diseases, Institute of HIV/AIDS, The First Hospital of Changsha, Changsha, Hunan 410005, China
- Ke Yang
- Department of Infectious Diseases, Institute of HIV/AIDS, The First Hospital of Changsha, Changsha, Hunan 410005, China
- Jiayi Zhang
- The First Affiliated Hospital, MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, Guangdong 510632, China
- Shufan Ren
- The First Affiliated Hospital, MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, Guangdong 510632, China
- Hui Chen
- Department of Infectious Diseases, Institute of HIV/AIDS, The First Hospital of Changsha, Changsha, Hunan 410005, China
- Jiahui Guo
- The First Affiliated Hospital, MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, Guangdong 510632, China
- Yizhi Cui
- The First Affiliated Hospital, MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, Guangdong 510632, China
- Tong Wang
- The First Affiliated Hospital, MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, Guangdong 510632, China
- Department of Infectious Diseases, Institute of HIV/AIDS, The First Hospital of Changsha, Changsha, Hunan 410005, China
- Min Wang
- Department of Infectious Diseases, Institute of HIV/AIDS, The First Hospital of Changsha, Changsha, Hunan 410005, China
23
Leblanc S, Brunet MA, Jacques JF, Lekehal AM, Duclos A, Tremblay A, Bruggeman-Gascon A, Samandi S, Brunelle M, Cohen AA, Scott MS, Roucou X. Newfound Coding Potential of Transcripts Unveils Missing Members of Human Protein Communities. Genomics Proteomics Bioinformatics 2023; 21:515-534. [PMID: 36183975 PMCID: PMC10787177 DOI: 10.1016/j.gpb.2022.09.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/01/2021] [Revised: 08/10/2022] [Accepted: 09/26/2022] [Indexed: 06/16/2023]
Abstract
Recent proteogenomic approaches have led to the discovery that regions of the transcriptome previously annotated as non-coding regions [i.e., untranslated regions (UTRs), open reading frames overlapping annotated coding sequences in a different reading frame, and non-coding RNAs] frequently encode proteins, termed alternative proteins (altProts). This suggests that previously identified protein-protein interaction (PPI) networks are partially incomplete because altProts are not present in conventional protein databases. Here, we used the proteogenomic resource OpenProt and a combined spectrum- and peptide-centric analysis for the re-analysis of a high-throughput human network proteomics dataset, thereby revealing the presence of 261 altProts in the network. We found 19 genes encoding both an annotated (reference) and an alternative protein interacting with each other. Of the 117 altProts encoded by pseudogenes, 38 are direct interactors of reference proteins encoded by their respective parental genes. Finally, we experimentally validate several interactions involving altProts. These data improve the blueprints of the human PPI network and suggest functional roles for hundreds of altProts.
Affiliation(s)
- Sébastien Leblanc
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Marie A Brunet
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Jean-François Jacques
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Amina M Lekehal
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Andréa Duclos
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Alexia Tremblay
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Alexis Bruggeman-Gascon
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Sondos Samandi
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Mylène Brunelle
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
- Alan A Cohen
- Department of Family Medicine, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Michelle S Scott
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada; PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Quebec City, QC G1V 0A6, Canada
24
Prensner JR, Abelin JG, Kok LW, Clauser KR, Mudge JM, Ruiz-Orera J, Bassani-Sternberg M, Deutsch EW, van Heesch S. What can Ribo-seq and proteomics tell us about the non-canonical proteome? bioRxiv 2023:2023.05.16.541049. [PMID: 37292611 PMCID: PMC10245706 DOI: 10.1101/2023.05.16.541049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Ribosome profiling (Ribo-seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of non-canonical sites of ribosome translation outside of the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7,000 non-canonical open reading frames (ORFs) are translated, which, at first glance, has the potential to expand the number of human protein-coding sequences by 30%, from ∼19,500 annotated CDSs to over 26,000. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to conventional understanding of the term. Adding further complication is the fact that published estimates of non-canonical ORFs vary widely by around 30-fold, from several thousand to several hundred thousand. The summation of this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome, but searching for guidance on how to proceed. Here, we discuss the current state of non-canonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein-coding".
In brief
The human genome encodes thousands of non-canonical open reading frames (ORFs) in addition to protein-coding genes. As a nascent field, many questions remain regarding non-canonical ORFs. How many exist? Do they encode proteins? What level of evidence is needed for their verification? Central to these debates has been the advent of ribosome profiling (Ribo-seq) as a method to discern genome-wide ribosome occupancy, and immunopeptidomics as a method to detect peptides that are processed and presented by MHC molecules and not observed in traditional proteomics experiments. This article provides a synthesis of the current state of non-canonical ORF research and proposes standards for their future investigation and reporting.
Highlights
- Combined use of Ribo-seq and proteomics-based methods enables optimal confidence in detecting non-canonical ORFs and their protein products.
- Ribo-seq can provide more sensitive detection of non-canonical ORFs, but data quality and analytical pipelines will impact results.
- Non-canonical ORF catalogs are diverse and span both high-stringency and low-stringency ORF nominations.
- A framework for standardized non-canonical ORF evidence will advance the research field.
Affiliation(s)
- John R. Prensner
- Department of Pediatrics, Division of Pediatric Hematology/Oncology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Leron W. Kok
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS, Utrecht, the Netherlands
- Karl R. Clauser
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Jonathan M. Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jorge Ruiz-Orera
- Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, University of Lausanne, Agora Center Bugnon 25A, 1005 Lausanne, Switzerland
- Department of Oncology, Centre hospitalier universitaire vaudois (CHUV), Rue du Bugnon 46, 1005 Lausanne, Switzerland
- Agora Cancer Research Centre, 1011 Lausanne, Switzerland
- Eric W. Deutsch
- Institute for Systems Biology (ISB), Seattle, Washington 98109, USA
- Sebastiaan van Heesch
- Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS, Utrecht, the Netherlands
25
Wen B, Zhang B. PepQuery2 democratizes public MS proteomics data for rapid peptide searching. Nat Commun 2023; 14:2213. [PMID: 37072382 PMCID: PMC10113256 DOI: 10.1038/s41467-023-37462-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 07/24/2022] [Accepted: 03/17/2023] [Indexed: 04/20/2023] Open
Abstract
We present PepQuery2, which leverages a new tandem mass spectrometry (MS/MS) data indexing approach to enable ultrafast, targeted identification of novel and known peptides in any local or publicly available MS proteomics datasets. The stand-alone version of PepQuery2 allows directly searching more than one billion indexed MS/MS spectra in the PepQueryDB or any public datasets from PRIDE, MassIVE, iProX, or jPOSTrepo, whereas the web version enables users to search datasets in PepQueryDB with a user-friendly interface. We demonstrate the utilities of PepQuery2 in a wide range of applications including detecting proteomic evidence for genomically predicted novel peptides, validating novel and known peptides identified using spectrum-centric database searching, prioritizing tumor-specific antigens, identifying missing proteins, and selecting proteotypic peptides for targeted proteomics experiments. By putting public MS proteomics data directly into the hands of scientists, PepQuery2 opens many new ways to transform these data into useful information for the broad research community.
Affiliation(s)
- Bo Wen
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA
- Bing Zhang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
26
Omenn GS, Lane L, Overall CM, Pineau C, Packer NH, Cristea IM, Lindskog C, Weintraub ST, Orchard S, Roehrl MH, Nice E, Liu S, Bandeira N, Chen YJ, Guo T, Aebersold R, Moritz RL, Deutsch EW. The 2022 Report on the Human Proteome from the HUPO Human Proteome Project. J Proteome Res 2023; 22:1024-1042. [PMID: 36318223 PMCID: PMC10081950 DOI: 10.1021/acs.jproteome.2c00498] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Indexed: 11/07/2022]
Abstract
The 2022 Metrics of the Human Proteome from the HUPO Human Proteome Project (HPP) show that protein expression has now been credibly detected (neXtProt PE1 level) for 18 407 (93.2%) of the 19 750 predicted proteins coded in the human genome, a net gain of 50 since 2021 from data sets generated around the world and reanalyzed by the HPP. Conversely, the number of neXtProt PE2, PE3, and PE4 missing proteins has been reduced by 78 from 1421 to 1343. This represents continuing experimental progress on the human proteome parts list across all the chromosomes, as well as significant reclassifications. Meanwhile, applying proteomics in a vast array of biological and clinical studies continues to yield significant findings and growing integration with other omics platforms. We present highlights from the Chromosome-Centric HPP, Biology and Disease-driven HPP, and HPP Resource Pillars, compare features of mass spectrometry and Olink and Somalogic platforms, note the emergence of translation products from ribosome profiling of small open reading frames, and discuss the launch of the initial HPP Grand Challenge Project, "A Function for Each Protein".
Affiliation(s)
- Gilbert S. Omenn
- University of Michigan, Ann Arbor, Michigan 48109, United States
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Lydie Lane
- CALIPHO Group, SIB Swiss Institute of Bioinformatics and University of Geneva, 1015 Lausanne, Switzerland
- Charles Pineau
- French Institute of Health and Medical Research, 35042 RENNES Cedex, France
- Nicolle H. Packer
- Macquarie University, Sydney, NSW 2109, Australia
- Griffith University’s Institute for Glycomics, Sydney, NSW 2109, Australia
- Susan T. Weintraub
- University of Texas Health Science Center-San Antonio, San Antonio, Texas 78229-3900, United States
- Sandra Orchard
- EMBL-EBI, Hinxton, Cambridgeshire, CB10 1SD, United Kingdom
- Michael H.A. Roehrl
- Memorial Sloan Kettering Cancer Center, New York, New York, 10065, United States
- Siqi Liu
- BGI Group, Shenzhen 518083, China
- Nuno Bandeira
- University of California, San Diego, La Jolla, California 92093, United States
- Yu-Ju Chen
- National Taiwan University, Academia Sinica, Nankang, Taipei 11529, Taiwan
- Tiannan Guo
- Westlake University Guomics Laboratory of Big Proteomic Data, Hangzhou 310024, Zhejiang Province, China
- Ruedi Aebersold
- Institute of Molecular Systems Biology in ETH Zurich, 8092 Zurich, Switzerland
- Robert L. Moritz
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Eric W. Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
27
Evolution and implications of de novo genes in humans. Nat Ecol Evol 2023:10.1038/s41559-023-02014-y. [PMID: 36928843 DOI: 10.1038/s41559-023-02014-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 07/24/2022] [Accepted: 02/06/2023] [Indexed: 03/18/2023]
Abstract
Genes and translated open reading frames (ORFs) that emerged de novo from previously non-coding sequences provide species with opportunities for adaptation. When aberrantly activated, some human-specific de novo genes and ORFs have disease-promoting properties-for instance, driving tumour growth. Thousands of putative de novo coding sequences have been described in humans, but we still do not know what fraction of those ORFs has readily acquired a function. Here, we discuss the challenges and controversies surrounding the detection, mechanisms of origin, annotation, validation and characterization of de novo genes and ORFs. Through manual curation of literature and databases, we provide a thorough table with most de novo genes reported for humans to date. We re-evaluate each locus by tracing the enabling mutations and list proposed disease associations, protein characteristics and supporting evidence for translation and protein detection. This work will support future explorations of de novo genes and ORFs in humans.
28
Letunica N, McCafferty C, Swaney E, Cai T, Monagle P, Ignjatovic V, Attard C. Proteomic Applications and Considerations: From Research to Patient Care. Methods Mol Biol 2023; 2628:181-192. [PMID: 36781786 DOI: 10.1007/978-1-0716-2978-9_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2023]
Abstract
Despite technological advancements in the field of proteomics, the rate at which serum and plasma biomarkers identified using proteomic approaches are translated into clinical use remains extremely low. In this chapter, we describe recent technological advancements and analytical strategies in proteomic methods. We also describe the progress of proteomic blood-based biomarkers to date and discuss what the future of proteomics might entail with the use of multi-omic approaches and implementing machine learning on large proteomic datasets. Lastly, we provide several key considerations for biomarker studies, ranging from sample type to the use of reference samples, in order to achieve progress from bench to bedside, ultimately improving patient diagnosis, disease, and/or therapeutic monitoring and care.
Affiliation(s)
- Natasha Letunica
- Haematology Research, Murdoch Children's Research Institute, Melbourne, VIC, Australia
- Conor McCafferty
- Haematology Research, Murdoch Children's Research Institute, Melbourne, VIC, Australia; Department of Paediatrics, The University of Melbourne, Melbourne, VIC, Australia
- Ella Swaney
- Haematology Research, Murdoch Children's Research Institute, Melbourne, VIC, Australia; Department of Paediatrics, The University of Melbourne, Melbourne, VIC, Australia
- Tengyi Cai
- Haematology Research, Murdoch Children's Research Institute, Melbourne, VIC, Australia; Department of Paediatrics, The University of Melbourne, Melbourne, VIC, Australia
- Paul Monagle
- Haematology Research, Murdoch Children's Research Institute, Melbourne, VIC, Australia; Department of Paediatrics, The University of Melbourne, Melbourne, VIC, Australia; Department of Clinical Haematology, Royal Children's Hospital, Melbourne, VIC, Australia; Kids Cancer Centre, Sydney Children's Hospital, Randwick, NSW, Australia
- Vera Ignjatovic
- Department of Paediatrics, The University of Melbourne, Melbourne, VIC, Australia; Institute for Clinical and Translational Research, Johns Hopkins All Children's Hospital, St. Petersburg, USA; Department of Pediatrics, Johns Hopkins University, Baltimore, USA
- Chantal Attard
- Haematology Research, Murdoch Children's Research Institute, Melbourne, VIC, Australia; Department of Paediatrics, The University of Melbourne, Melbourne, VIC, Australia; The Royal Children's Hospital, Parkville, VIC, Australia
29
Consolidation of metabolomic, proteomic, and GWAS data in connective model of schizophrenia. Sci Rep 2023; 13:2139. [PMID: 36747015 PMCID: PMC9901842 DOI: 10.1038/s41598-023-29117-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/01/2022] [Accepted: 01/31/2023] [Indexed: 02/08/2023] Open
Abstract
Despite multiple systematic studies of schizophrenia based on proteomics, metabolomics, and genome-wide significant loci, reconstructing the underlying mechanism remains a challenging task. Combining advanced data from quantitative proteomics, metabolomics, and genome-wide association studies (GWAS) can enhance current fundamental knowledge about the molecular pathogenesis of schizophrenia. In this study, we utilized quantitative proteomic and metabolomic assays and high-throughput genotyping for the GWAS. We identified 20 differentially expressed proteins that were validated on an independent cohort of patients with schizophrenia, including ALS, A1AG1, PEDF, VTDB, CERU, APOB, APOH, FASN, GPX3, etc., almost half of which are new for schizophrenia. The metabolomic survey revealed 18 group-specific compounds, most of which were part of the transformation of tyrosine and steroids, with a prevalence of androgens (androsterone sulfate, thyroliberin, thyroxine, dihydrotestosterone, androstenedione, cholesterol sulfate, metanephrine, dopaquinone, etc.). The GWAS assay mostly failed to reveal significantly associated loci; therefore, 52 loci with the smoothened p < 10^-5 were fractionally integrated into the proteome-metabolome data. We integrated three omics layers and powered them by quantitative analysis to propose a map of molecular events associated with schizophrenia psychopathology. The resulting interplay between different molecular layers emphasizes a strict implication of lipid transport, oxidative stress, imbalance in steroidogenesis and associated impairments of thyroid hormones as key interconnected nodes essential for understanding how the regulation of distinct metabolic axes is achieved and what happens in the conditioned proteome and metabolome to produce a schizophrenia-specific pattern.
30
Deutsch EW, Mendoza L, Shteynberg DD, Hoopmann MR, Sun Z, Eng JK, Moritz RL. Trans-Proteomic Pipeline: Robust Mass Spectrometry-Based Proteomics Data Analysis Suite. J Proteome Res 2023; 22:615-624. [PMID: 36648445 PMCID: PMC10166710 DOI: 10.1021/acs.jproteome.2c00624] [Citation(s) in RCA: 32] [Impact Index Per Article: 32.0] [Indexed: 01/18/2023]
Abstract
The Trans-Proteomic Pipeline (TPP) mass spectrometry data analysis suite has been in continual development and refinement since its first tools, PeptideProphet and ProteinProphet, were published 20 years ago. The current release provides a large complement of tools for spectrum processing, spectrum searching, search validation, abundance computation, protein inference, and more. Many of the tools include machine-learning modeling to extract the most information from data sets and build robust statistical models to compute the probabilities that derived information is correct. Here we present the latest information on the many TPP tools, and how TPP can be deployed on various platforms from personal Windows laptops to Linux clusters and expansive cloud computing environments. We describe tutorials on how to use TPP in a variety of ways and describe synergistic projects that leverage TPP. We conclude with plans for continued development of TPP.
Affiliation(s)
- Eric W Deutsch: Institute for Systems Biology, Seattle, Washington 98109, United States
- Luis Mendoza: Institute for Systems Biology, Seattle, Washington 98109, United States
- Zhi Sun: Institute for Systems Biology, Seattle, Washington 98109, United States
- Jimmy K Eng: Proteomics Resource, University of Washington, Seattle, Washington 98195, United States
- Robert L Moritz: Institute for Systems Biology, Seattle, Washington 98109, United States
31
Deutsch EW, Vizcaíno JA, Jones AR, Binz PA, Lam H, Klein J, Bittremieux W, Perez-Riverol Y, Tabb DL, Walzer M, Ricard-Blum S, Hermjakob H, Neumann S, Mak TD, Kawano S, Mendoza L, Van Den Bossche T, Gabriels R, Bandeira N, Carver J, Pullman B, Sun Z, Hoffmann N, Shofstahl J, Zhu Y, Licata L, Quaglia F, Tosatto SCE, Orchard SE. Proteomics Standards Initiative at Twenty Years: Current Activities and Future Work. J Proteome Res 2023; 22:287-301. [PMID: 36626722 PMCID: PMC9903322 DOI: 10.1021/acs.jproteome.2c00637] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 10/07/2022] [Indexed: 01/11/2023]
Abstract
The Human Proteome Organization (HUPO) Proteomics Standards Initiative (PSI) has been successfully developing guidelines, data formats, and controlled vocabularies (CVs) for the proteomics community and other fields supported by mass spectrometry since its inception 20 years ago. Here we describe the general operation of the PSI, including its leadership, working groups, yearly workshops, and the document process by which proposals are thoroughly and publicly reviewed in order to be ratified as PSI standards. We briefly describe the current state of the many existing PSI standards, some of which remain the same as when originally developed, some of which have undergone subsequent revisions, and some of which have become obsolete. Then the set of proposals currently being developed is described, with an open call to the community for participation in the forging of the next generation of standards. Finally, we describe some synergies and collaborations with other organizations and look to the future in how the PSI will continue to promote the open sharing of data and thus accelerate the progress of the field of proteomics.
Affiliation(s)
- Eric W. Deutsch: Institute for Systems Biology, Seattle, Washington 98109, United States
- Juan Antonio Vizcaíno: European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Andrew R. Jones: Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Pierre-Alain Binz: Clinical Chemistry Service, Lausanne University Hospital, 1011 976 Lausanne, Switzerland
- Henry Lam: Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong 999077, P. R. China
- Joshua Klein: Program for Bioinformatics, Boston University, Boston, Massachusetts 02215, United States
- Wout Bittremieux: Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States; Department of Computer Science, University of Antwerp, 2020 Antwerpen, Belgium
- Yasset Perez-Riverol: European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- David L. Tabb: SA MRC Centre for TB Research, DST/NRF Centre of Excellence for Biomedical TB Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town 7602, South Africa
- Mathias Walzer: European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Sylvie Ricard-Blum: Univ. Lyon, Université Lyon 1, ICBMS, UMR 5246, 69622 Villeurbanne, France
- Henning Hermjakob: European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Steffen Neumann: Bioinformatics and Scientific Data, Leibniz Institute of Plant Biochemistry, 06120 Halle, Germany; German Centre for Integrative Biodiversity Research (iDiv), 04103 Halle-Jena-Leipzig, Germany
- Tytus D. Mak: Mass Spectrometry Data Center, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
- Shin Kawano: Database Center for Life Science, Joint Support Center for Data Science Research, Research Organization of Information and Systems, Chiba 277-0871, Japan; Faculty of Contemporary Society, Toyama University of International Studies, Toyama 930-1292, Japan; School of Frontier Engineering, Kitasato University, Sagamihara 252-0373, Japan
- Luis Mendoza: Institute for Systems Biology, Seattle, Washington 98109, United States
- Tim Van Den Bossche: VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9052 Ghent, Belgium
- Ralf Gabriels: VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9052 Ghent, Belgium
- Nuno Bandeira: Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States; Center for Computational Mass Spectrometry, Department of Computer Science and Engineering, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego 92093-0404, United States
- Jeremy Carver: Center for Computational Mass Spectrometry, Department of Computer Science and Engineering, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego 92093-0404, United States
- Benjamin Pullman: Center for Computational Mass Spectrometry, Department of Computer Science and Engineering, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego 92093-0404, United States
- Zhi Sun: Institute for Systems Biology, Seattle, Washington 98109, United States
- Nils Hoffmann: Institute for Bio- and Geosciences (IBG-5), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany
- Jim Shofstahl: Thermo Fisher Scientific, 355 River Oaks Parkway, San Jose, California 95134, United States
- Yunping Zhu: National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, #38, Life Science Park, Changping District, Beijing 102206, China
- Luana Licata: Fondazione Human Technopole, 20157 Milan, Italy; Department of Biology, University of Rome Tor Vergata, 00133 Rome, Italy
- Federica Quaglia: Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), 70126 Bari, Italy; Department of Biomedical Sciences, University of Padova, 35131 Padova, Italy
- Sandra E. Orchard: European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
32
UniProt: the Universal Protein Knowledgebase in 2023. Nucleic Acids Res 2023; 51:D523-D531. [PMID: 36408920 PMCID: PMC9825514 DOI: 10.1093/nar/gkac1052] [Citation(s) in RCA: 1615] [Impact Index Per Article: 1615.0] [Received: 09/16/2022] [Revised: 10/05/2022] [Accepted: 10/25/2022] [Indexed: 11/22/2022] Open
Abstract
The aim of the UniProt Knowledgebase is to provide users with a comprehensive, high-quality and freely accessible set of protein sequences annotated with functional information. In this publication we describe enhancements made to our data processing pipeline and to our website to adapt to an ever-increasing information content. The number of sequences in UniProtKB has risen to over 227 million and we are working towards including a reference proteome for each taxonomic group. We continue to extract detailed annotations from the literature to update or create reviewed entries, while unreviewed entries are supplemented with annotations provided by automated systems using a variety of machine-learning techniques. In addition, the scientific community continues their contributions of publications and annotations to UniProt entries of their interest. Finally, we describe our new website (https://www.uniprot.org/), designed to enhance our users' experience and make our data easily accessible to the research community. This interface includes access to AlphaFold structures for more than 85% of all entries as well as improved visualisations for subcellular localisation of proteins.
33
Martinez TF, Lyons-Abbott S, Bookout AL, De Souza EV, Donaldson C, Vaughan JM, Lau C, Abramov A, Baquero AF, Baquero K, Friedrich D, Huard J, Davis R, Kim B, Koch T, Mercer AJ, Misquith A, Murray SA, Perry S, Pino LK, Sanford C, Simon A, Zhang Y, Zipp G, Bizarro CV, Shokhirev MN, Whittle AJ, Searle BC, MacCoss MJ, Saghatelian A, Barnes CA. Profiling mouse brown and white adipocytes to identify metabolically relevant small ORFs and functional microproteins. Cell Metab 2023; 35:166-183.e11. [PMID: 36599300 PMCID: PMC9889109 DOI: 10.1016/j.cmet.2022.12.004] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Received: 03/15/2022] [Revised: 09/19/2022] [Accepted: 12/06/2022] [Indexed: 01/05/2023]
Abstract
Microproteins (MPs) are a potentially rich source of uncharacterized metabolic regulators. Here, we use ribosome profiling (Ribo-seq) to curate 3,877 unannotated MP-encoding small ORFs (smORFs) in primary brown, white, and beige mouse adipocytes. Of these, we validated 85 MPs by proteomics, including 33 circulating MPs in mouse plasma. Analyses of MP-encoding mRNAs under different physiological conditions (high-fat diet) revealed that numerous MPs are regulated in adipose tissue in vivo and are co-expressed with established metabolic genes. Furthermore, Ribo-seq provided evidence for the translation of Gm8773, which encodes a secreted MP that is homologous to human and chicken FAM237B. Gm8773 is highly expressed in the arcuate nucleus of the hypothalamus, and intracerebroventricular administration of recombinant mFAM237B showed orexigenic activity in obese mice. Together, these data highlight the value of this adipocyte MP database in identifying MPs with roles in fundamental metabolic and physiological processes such as feeding.
Affiliation(s)
- Thomas F Martinez: Department of Pharmaceutical Sciences, Department of Biological Chemistry, Chao Family Comprehensive Cancer Center, University of California, Irvine, Irvine, CA, USA
- Angie L Bookout: Novo Nordisk Research Center Seattle, Inc., Seattle, WA, USA
- Eduardo V De Souza: Centro de Pesquisas em Biologia Molecular e Funcional (CPBMF) and Instituto Nacional de Ciência e Tecnologia em Tuberculose (INCT-TB), Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), Porto Alegre, Brazil; Programa de Pós-Graduação em Biologia Celular e Molecular, Pontifícia Universidade Católica do Rio Grande do Sul, Porto Alegre, Rio Grande do Sul 90616-900, Brazil; Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
- Cynthia Donaldson: Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
- Joan M Vaughan: Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
- Calvin Lau: Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
- Ariel Abramov: Novo Nordisk Research Center Seattle, Inc., Seattle, WA, USA
- Arian F Baquero: Novo Nordisk Research Center Seattle, Inc., Seattle, WA, USA
- Karalee Baquero: Novo Nordisk Research Center Seattle, Inc., Seattle, WA, USA
- Dave Friedrich: Novo Nordisk Research Center Seattle, Inc., Seattle, WA, USA
- Justin Huard: Novo Nordisk Research Center Seattle, Inc., Seattle, WA, USA
- Ray Davis: Novo Nordisk Research Center Seattle, Inc., Seattle, WA, USA
- Bong Kim: Novo Nordisk Research Center Seattle, Inc., Seattle, WA, USA
- Ty Koch: Novo Nordisk Research Center Seattle, Inc., Seattle, WA, USA
- Aaron J Mercer: Novo Nordisk Research Center Seattle, Inc., Seattle, WA, USA
- Ayesha Misquith: Novo Nordisk Research Center Seattle, Inc., Seattle, WA, USA
- Sara A Murray: Novo Nordisk Research Center Seattle, Inc., Seattle, WA, USA
- Sakara Perry: Novo Nordisk Research Center Seattle, Inc., Seattle, WA, USA
- Lindsay K Pino: Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Alex Simon: Novo Nordisk Research Center Seattle, Inc., Seattle, WA, USA
- Yu Zhang: Novo Nordisk Research Center Seattle, Inc., Seattle, WA, USA
- Garrett Zipp: Novo Nordisk Research Center Seattle, Inc., Seattle, WA, USA
- Cristiano V Bizarro: Centro de Pesquisas em Biologia Molecular e Funcional (CPBMF) and Instituto Nacional de Ciência e Tecnologia em Tuberculose (INCT-TB), Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), Porto Alegre, Brazil; Programa de Pós-Graduação em Biologia Celular e Molecular, Pontifícia Universidade Católica do Rio Grande do Sul, Porto Alegre, Rio Grande do Sul 90616-900, Brazil
- Maxim N Shokhirev: Razavi Newman Integrative Genomics and Bioinformatics Core, Salk Institute for Biological Studies, La Jolla, CA, USA
- Brian C Searle: Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Michael J MacCoss: Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Alan Saghatelian: Clayton Foundation Laboratories for Peptide Biology, Salk Institute for Biological Studies, La Jolla, CA, USA
- Christopher A Barnes: Novo Nordisk Research Center Seattle, Inc., Seattle, WA, USA; Velia Therapeutics, Inc., San Diego, CA, USA
34
Li Z, Li K, Xu B, Chen J, Zhang Y, Guo L, Xie J. Identification evidence unraveled by strict proteomics rules toward forensic samples. Electrophoresis 2023; 44:337-348. [PMID: 35906925 DOI: 10.1002/elps.202200051] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/28/2022] [Revised: 06/18/2022] [Accepted: 07/14/2022] [Indexed: 02/01/2023]
Abstract
Snake venom is a complex mixture of proteins and peptides secreted by venomous snakes from their poison glands. Although proteomics for snake venom composition, interspecific differences, and developmental evolution has been developed for a decade, current diagnosis or identification techniques for snake venom in clinical intoxication and forensic science applications depend mainly on morphology and immunoassay. Proteomics techniques could be expected to offer great help directly. This work applied a bottom-up proteomics method to identify protein types and species attribution in suspected snake venom samples using ultrahigh-performance liquid chromatography-quadrupole-electrostatic field Orbitrap tandem mass spectrometry, and a cytotoxicity assay was added to provide direct evidence of toxicity. For the suspicious samples seized at a security control, the sample pretreatment modes (in-solution and in-gel digestion) and data acquisition modes (nontargeted and targeted screening) complemented and validated each other. We implemented two consecutive approaches to identify the species source of proteins in the samples from the viewpoints of venom proteomics and strict forensic identification. First, we completed a workflow consisting of a proteomics database match against the entire SWISS-PROT database (dated 2018-11-22) and a result-directed specific taxonomy database. The latter was a helpful hint for comparing master protein kinds and revealed the insufficiency of specific venom proteomics characterization rules. Second, we suggested strict rules for protein identification to meet the requirements of forensic science for improved identification correctness, that is, (1) peptide spectrum match confidence, peptide confidence, and protein confidence were all high (with a false-discovery rate of less than 1%); (2) the number of unique peptides in one protein was at least two; and (3) within unique peptides, at least 75% of the ∆m/z values of the matched y and b ions were less than 5 ppm. We identified these samples as cobra venom containing 10 highly abundant proteins (P00597, P82463, P60770, Q9YGI4, P62375, P49123, P80245, P60302, P01442, and P60304) from two snake venom protein families (acid phospholipase A2 and three-finger toxins), and the most abundant proteins were cytotoxins.
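The three identification rules listed in this abstract can be read as a simple filter. The following is only a minimal Python sketch of that reading, not the authors' implementation: the `Peptide` class, the function names, and the choice to apply rule (3) to each unique peptide separately are assumptions made for illustration, and rule (1) is simplified to a peptide-level and a protein-level FDR value.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Peptide:
    # Hypothetical record; field names are illustrative, not from the paper.
    sequence: str
    is_unique: bool                    # maps to exactly one protein
    fdr: float                         # peptide-level FDR as a fraction
    fragment_errors_ppm: List[float]   # mass errors of matched b/y ions


def fragment_rule_ok(pep: Peptide, tol_ppm: float = 5.0,
                     min_fraction: float = 0.75) -> bool:
    """Rule 3 (assumed per peptide): >= 75% of matched b/y ions within 5 ppm."""
    errors = pep.fragment_errors_ppm
    if not errors:
        return False
    within = sum(1 for e in errors if abs(e) < tol_ppm)
    return within / len(errors) >= min_fraction


def protein_passes(peptides: List[Peptide], protein_fdr: float,
                   max_fdr: float = 0.01, min_unique: int = 2) -> bool:
    """Apply the three rules to one protein's peptide evidence."""
    # Rule 1 (simplified): protein and peptide confidence below 1% FDR.
    if protein_fdr >= max_fdr:
        return False
    unique = [p for p in peptides if p.is_unique and p.fdr < max_fdr]
    # Rule 2: at least two unique peptides per protein.
    if len(unique) < min_unique:
        return False
    # Rule 3: every retained unique peptide meets the fragment criterion.
    return all(fragment_rule_ok(p) for p in unique)
```

Keeping the thresholds as keyword arguments makes it easy to see how sensitive an identification is to each rule; the defaults mirror the values quoted in the abstract.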
Affiliation(s)
- Zehua Li: State Key Laboratory of Toxicology and Medical Countermeasures, and Laboratory of Toxicant Analysis, Institute of Pharmacology and Toxicology, Academy of Military Medical Sciences, Beijing, P. R. China
- Kexin Li: State Key Laboratory of Toxicology and Medical Countermeasures, and Laboratory of Toxicant Analysis, Institute of Pharmacology and Toxicology, Academy of Military Medical Sciences, Beijing, P. R. China
- Bin Xu: State Key Laboratory of Toxicology and Medical Countermeasures, and Laboratory of Toxicant Analysis, Institute of Pharmacology and Toxicology, Academy of Military Medical Sciences, Beijing, P. R. China
- Jia Chen: State Key Laboratory of Toxicology and Medical Countermeasures, and Laboratory of Toxicant Analysis, Institute of Pharmacology and Toxicology, Academy of Military Medical Sciences, Beijing, P. R. China
- Ying Zhang: Forensic Science Service of Beijing Public Security Bureau, Key Laboratory of Forensic Toxicology, Ministry of Public Security, Beijing, P. R. China
- Lei Guo: State Key Laboratory of Toxicology and Medical Countermeasures, and Laboratory of Toxicant Analysis, Institute of Pharmacology and Toxicology, Academy of Military Medical Sciences, Beijing, P. R. China
- Jianwei Xie: State Key Laboratory of Toxicology and Medical Countermeasures, and Laboratory of Toxicant Analysis, Institute of Pharmacology and Toxicology, Academy of Military Medical Sciences, Beijing, P. R. China
35
Girard O, Lavigne R, Chevolleau S, Onfray C, Com E, Schmit PO, Chapelle M, Fréour T, Lane L, David L, Pineau C. Naive Pluripotent and Trophoblastic Stem Cell Lines as a Model for Detecting Missing Proteins in the Context of the Chromosome-Centric Human Proteome Project. J Proteome Res 2022; 22:1148-1158. [PMID: 36445260 DOI: 10.1021/acs.jproteome.2c00496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022]
Abstract
The Chromosome-centric Human Proteome Project (C-HPP) aims at identifying the proteins as gene products encoded by the human genome, characterizing their isoforms and functions. The existence of products has now been confirmed for 93.2% of the genes at the protein level. The remainder mostly correspond to proteins that are of low abundance or difficult to access. Over the past years, we have significantly contributed to the identification of missing proteins in human spermatozoa. We pursue our search in the reproductive sphere with a focus on early human embryonic development. Pluripotent cells, developing into the fetus, and trophoblast cells, giving rise to the placenta, emerge during the first weeks. This emergence is a focus of scientists working in the fields of reproduction, placentation and regenerative medicine. Most knowledge has been obtained by transcriptomic analysis. Interestingly, some genes are uniquely expressed in those cells, giving the opportunity to uncover new proteins that might play a crucial role in setting up the molecular events underlying early embryonic development. Here, we analyzed naive pluripotent and trophoblastic stem cells and discovered 4 new missing proteins, thus contributing to the C-HPP. The mass spectrometry proteomics data was deposited on ProteomeXchange under the data set identifier PXD035768.
Affiliation(s)
- Océane Girard: Nantes Université, CHU Nantes, Inserm, CR2TI, UMR 1064, F-44000 Nantes, France
- Régis Lavigne: Univ Rennes, Inserm, EHESP, Irset (Institut de Recherche en Santé, Environnement et Travail) - UMR_S 1085, F-35000 Rennes, France; Univ Rennes, CNRS, Inserm, Biosit UAR 3480 US_S 018, Protim Core Facility, F-35000 Rennes, France
- Simon Chevolleau: Nantes Université, CHU Nantes, Inserm, CR2TI, UMR 1064, F-44000 Nantes, France
- Constance Onfray: Nantes Université, CHU Nantes, Inserm, CR2TI, UMR 1064, F-44000 Nantes, France
- Emmanuelle Com: Univ Rennes, Inserm, EHESP, Irset (Institut de Recherche en Santé, Environnement et Travail) - UMR_S 1085, F-35000 Rennes, France; Univ Rennes, CNRS, Inserm, Biosit UAR 3480 US_S 018, Protim Core Facility, F-35000 Rennes, France
- Manuel Chapelle: Bruker Daltonique SA, 34 rue de l'Industrie, F-67166 Wissembourg cedex, France
- Thomas Fréour: Nantes Université, CHU Nantes, Inserm, CR2TI, UMR 1064, F-44000 Nantes, France; CHU Nantes, Service de Biologie de la Reproduction, F-44000 Nantes, France; Department of Obstetrics, Gynecology and Reproductive Medicine, Dexeus University Hospital, 08028 Barcelona, Spain
- Lydie Lane: CALIPHO Group, SIB Swiss Institute of Bioinformatics and University of Geneva, CH-1211 Geneva, Switzerland
- Laurent David: Nantes Université, CHU Nantes, Inserm, CR2TI, UMR 1064, F-44000 Nantes, France; Nantes Université, CHU Nantes, Inserm, CNRS, BioCore, F-44000 Nantes, France
- Charles Pineau: Univ Rennes, Inserm, EHESP, Irset (Institut de Recherche en Santé, Environnement et Travail) - UMR_S 1085, F-35000 Rennes, France; Univ Rennes, CNRS, Inserm, Biosit UAR 3480 US_S 018, Protim Core Facility, F-35000 Rennes, France
36
Tear Proteome Revealed Association of S100A Family Proteins and Mesothelin with Thrombosis in Elderly Patients with Retinal Vein Occlusion. Int J Mol Sci 2022; 23:ijms232314653. [PMID: 36498980 PMCID: PMC9736253 DOI: 10.3390/ijms232314653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2022] [Revised: 11/03/2022] [Accepted: 11/21/2022] [Indexed: 11/25/2022] Open
Abstract
Tear samples collected from patients with central retinal vein occlusion (CRVO; n = 28) and healthy volunteers (n = 29) were analyzed using a proteomic label-free absolute quantitative approach. A large proportion (458 proteins with a frequency > 0.6) of tear proteomes was found to be shared between the study groups. Comparative proteomic analysis revealed 29 proteins (p < 0.05) that differed significantly between CRVO patients and the control group. Among them, S100A6 (log2(FC) = 1.11, p < 0.001), S100A8 (log2(FC) = 2.45, p < 0.001), S100A9 (log2(FC) = 2.08, p < 0.001), and mesothelin (log2(FC) = 0.82, p < 0.001) were the most abundantly represented upregulated proteins, and β2-microglobulin was the most downregulated protein (log2(FC) = −2.13, p < 0.001). The selected up- and downregulated proteins were gathered to customize a map of CRVO-related critical protein interactions with quantitative properties. The customized map (FDR < 0.01) revealed inflammation, impairment of retinal hemostasis, and immune response as the main set of processes associated with the CRVO ischemic condition. The semantic analysis displayed the prevalence of core biological processes covering dysregulation of mitochondrial organization and utilization of improperly or topologically incorrectly folded proteins as a consequence of oxidative stress, and escalation of the ischemic condition caused by the local retinal hemostasis dysregulation. The most significantly different proteins (S100A6, S100A8, S100A9, MSLN, and β2-microglobulin) were used for the ROC analysis, and their AUC varied from 0.772 to 0.952, suggesting a probable association with CRVO.
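As a reading aid for the log2 fold changes quoted in this abstract, a log2(FC) value converts to a linear fold change as FC = 2^log2(FC). The short Python sketch below applies that conversion to the reported values; the function name and the protein labels used as dictionary keys are illustrative choices, and the interpretation of FC as the CRVO-to-control ratio is an assumption based on the abstract's description of "upregulated" proteins.

```python
def fold_change(log2_fc: float) -> float:
    """Convert a log2 fold change into a linear fold change."""
    return 2.0 ** log2_fc


# log2(FC) values as quoted in the abstract above.
reported = {
    "S100A6": 1.11,
    "S100A8": 2.45,
    "S100A9": 2.08,
    "MSLN": 0.82,
    "B2M": -2.13,
}

for protein, l2fc in reported.items():
    print(f"{protein}: log2(FC) = {l2fc:+.2f} -> about {fold_change(l2fc):.2f}-fold")
```

A negative log2(FC) yields a ratio below 1, so β2-microglobulin appears as a fraction of its control level rather than as a negative number.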
37
Cui M, Cheng C, Zhang L. High-throughput proteomics: a methodological mini-review. Lab Invest 2022; 102:1170-1181. [PMID: 36775443 PMCID: PMC9362039 DOI: 10.1038/s41374-022-00830-7] [Citation(s) in RCA: 90] [Impact Index Per Article: 45.0] [Received: 01/30/2022] [Revised: 07/06/2022] [Accepted: 07/10/2022] [Indexed: 11/15/2022] Open
Abstract
Proteomics plays a vital role in biomedical research in the post-genomic era. With the technological revolution and emerging computational and statistical models, proteomic methodology has evolved rapidly in the past decade and shed light on solving complicated biomedical problems. Here, we summarize scientific research and clinical practice of existing and emerging high-throughput proteomics approaches, including mass spectrometry, protein pathway array, next-generation tissue microarrays, single-cell proteomics, single-molecule proteomics, Luminex, Simoa and Olink Proteomics. We also discuss important computational methods and statistical algorithms that can maximize the mining of proteomic data with clinical and/or other 'omics data. Various principles and precautions are provided for better utilization of these tools. In summary, the advances in high-throughput proteomics will not only help to better understand the molecular mechanisms of pathogenesis, but also help to identify the signature signaling networks of specific diseases. Thus, modern proteomics has a range of potential applications in basic research, prognostic oncology, precision medicine, and drug discovery.
Affiliation(s)
- Miao Cui: Department of Pathology, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Pathology, Mount Sinai West, New York, NY, USA
- Chao Cheng: Department of Medicine, Section of Epidemiology and Population Sciences, Baylor College of Medicine, Houston, TX, USA; Department of Medicine, Baylor College of Medicine, Houston, TX, USA; Dan L Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, USA
- Lanjing Zhang: Department of Biological Sciences, Rutgers University, Newark, NJ, USA; Department of Pathology, Princeton Medical Center, Plainsboro, NJ, USA; Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, USA; Department of Chemical Biology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, NJ, USA
38
Perez-Riverol Y. Proteomic repository data submission, dissemination, and reuse: key messages. Expert Rev Proteomics 2022; 19:297-310. [PMID: 36529941 PMCID: PMC7614296 DOI: 10.1080/14789450.2022.2160324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Accepted: 12/07/2022] [Indexed: 12/23/2022]
Abstract
INTRODUCTION: The creation of ProteomeXchange data workflows in 2012 transformed the field of proteomics by standardizing data submission and dissemination and enabling the widespread reanalysis of public MS proteomics data worldwide. ProteomeXchange has triggered a growing trend toward public dissemination of proteomics data, facilitating the assessment, reuse, comparative analyses, and extraction of new findings from public datasets. As of 2022, the consortium comprises PRIDE, PeptideAtlas, MassIVE, jPOST, iProX, and Panorama Public.
AREAS COVERED: Here, we review and discuss the current ecosystem of resources, guidelines, and file formats for proteomics data dissemination and reanalysis. Special attention is drawn to new exciting quantitative and post-translational modification-oriented resources. The challenges and future directions for data deposition, including the lack of metadata and of cloud-based and high-performance software solutions for fast and reproducible reanalysis of the available data, are discussed.
EXPERT OPINION: The success of ProteomeXchange and the amount of proteomics data available in the public domain have triggered the creation and/or growth of other protein knowledgebase resources. Data reuse is a leading, active, and evolving field, supporting the creation of new formats, tools, and workflows to rediscover and reshape the public proteomics data.
Affiliation(s)
- Yasset Perez-Riverol: European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
39
Kalogeropoulos K, Savickas S, Haack AM, Larsen CA, Mikosiński J, Schoof EM, Smola H, Bundgaard L, Auf dem Keller U. WITHDRAWN: High-throughput and high-sensitivity biomarker monitoring in body fluid by FAIMS-enhanced fast LC SureQuant™ IS targeted quantitation. Mol Cell Proteomics 2022:100251. [PMID: 35644345 DOI: 10.1016/j.mcpro.2022.100251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Revised: 05/20/2022] [Accepted: 05/23/2022] [Indexed: 10/18/2022] Open
Affiliation(s)
- Konstantinos Kalogeropoulos: Department of Biotechnology and Biomedicine, Technical University of Denmark, Søltofts Plads, Kgs. Lyngby, 2800, Denmark
- Simonas Savickas: Department of Biotechnology and Biomedicine, Technical University of Denmark, Søltofts Plads, Kgs. Lyngby, 2800, Denmark
- Aleksander M Haack: Department of Biotechnology and Biomedicine, Technical University of Denmark, Søltofts Plads, Kgs. Lyngby, 2800, Denmark
- Cathrine A Larsen: Department of Biotechnology and Biomedicine, Technical University of Denmark, Søltofts Plads, Kgs. Lyngby, 2800, Denmark
- Jacek Mikosiński: Poradnia Chorób Naczyń Obwodowych "MIKOMED", Ul. Pługowa 51/53, 94-238 Łódź, Poland
- Erwin M Schoof: Department of Biotechnology and Biomedicine, Technical University of Denmark, Søltofts Plads, Kgs. Lyngby, 2800, Denmark
- Hans Smola: Paul Hartmann AG, Paul-Hartmann-Straße 12, 89522 Heidenheim an der Brenz, Germany
- Louise Bundgaard: Department of Biotechnology and Biomedicine, Technical University of Denmark, Søltofts Plads, Kgs. Lyngby, 2800, Denmark
- Ulrich Auf dem Keller: Department of Biotechnology and Biomedicine, Technical University of Denmark, Søltofts Plads, Kgs. Lyngby, 2800, Denmark
|
40
|
Xu W, Liu C, Deng B, Lin P, Sun Z, Liu A, Xuan J, Li Y, Zhou K, Zhang X, Huang Q, Zhou H, He Q, Li B, Qu L, Yang J. TP53-inducible putative long noncoding RNAs encode functional polypeptides that suppress cell proliferation. Genome Res 2022; 32:1026-1041. [PMID: 35609991 DOI: 10.1101/gr.275831.121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2021] [Accepted: 05/06/2022] [Indexed: 01/10/2023]
Abstract
Polypeptides encoded by long non-coding RNAs (lncRNAs) are a novel class of functional molecules. However, whether these hidden polypeptides participate in the TP53 pathway and play a significant biological role is still unclear. Here, we discover that TP53-regulated lncRNAs encode peptides, two of which are functional in various human cell lines. Using ribosome profiling and RNA-seq approaches in HepG2 cells, we systematically identified more than 300 novel TP53-regulated lncRNAs and further confirmed that fifteen of these TP53-regulated lncRNAs encode peptides. Furthermore, several peptides were validated by multiple mass spectrometry measures. Ten of the novel translational lncRNAs were directly inducible by TP53 in response to DNA damage. Notably, we showed that the TP53-inducible peptides TP53LC02 and TP53LC04, but not their lncRNAs, could suppress cell proliferation. The TP53LC04 peptide also had a function associated with cell proliferation, regulating the cell cycle in response to DNA damage. This study demonstrates that TP53-inducible lncRNAs encode new functional peptides, expanding the known components of the TP53 tumor suppressor network and providing novel potential targets for cancer therapy.
Affiliation(s)
- Wenli Xu
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, The Third Affiliated Hospital, Sun Yat-sen University
| | - Chang Liu
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, Sun Yat-sen University
| | - Bing Deng
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, Sun Yat-sen University
| | - Penghui Lin
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, Sun Yat-sen University
| | - Zhenghua Sun
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, Jinan University
| | - Anrui Liu
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, Sun Yat-sen University
| | - Jiajia Xuan
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, Sun Yat-sen University
| | - Yuying Li
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, Jinan University
| | - Keren Zhou
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, Sun Yat-sen University
| | | | - Qiaojuan Huang
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, Sun Yat-sen University
| | - Hui Zhou
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, Sun Yat-sen University
| | - Qingyu He
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, Jinan University
| | - Bin Li
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, Sun Yat-sen University
| | - Lianghu Qu
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, Sun Yat-sen University
| | - Jianhua Yang
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, The Fifth Affiliated Hospital, Sun Yat-sen University
| |
|
41
|
Choong WK, Sung TY. Multiaspect Examinations of Possible Alternative Mappings of Identified Variant Peptides: A Case Study on the HEK293 Cell Line. ACS Omega 2022; 7:16454-16467. [PMID: 35601313 PMCID: PMC9118379 DOI: 10.1021/acsomega.2c00466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2022] [Accepted: 04/20/2022] [Indexed: 06/15/2023]
Abstract
Adopting a proteogenomics approach to validate single nucleotide variation events by identifying the corresponding single amino acid variant peptides from mass spectrometry (MS)-based proteomics data facilitates translational and clinical research. Although variant peptides are usually identified from MS data with a stringent false discovery rate (FDR), FDR control can fail to eliminate dubious results caused by several issues; thus, postexamination to eliminate dubious results is required. However, comprehensive postexaminations of identification results are still lacking. Therefore, we propose a framework of three bottom-up levels (peptide-spectrum match, peptide, and variant event) that consists of rigorous 11-aspect examinations from the MS perspective to further confirm the reliability of variant events. As a proof of concept and to show feasibility, we demonstrate the 11 examinations on the variant peptides identified from an HEK293 cell line data set, where various database search strategies were applied to maximize the number of identified variant PSMs with an FDR <1% for postexaminations. The results showed that an FDR criterion alone is insufficient to validate identified variant peptides and that the 11 postexaminations can reveal low-confidence variant events detected by shotgun proteomics experiments. Therefore, we suggest that postexaminations of identified variant events based on the proposed framework are necessary for proteogenomics studies.
|
42
|
Ilgisonis EV, Pogodin PV, Kiseleva OI, Tarbeeva SN, Ponomarenko EA. Evolution of Protein Functional Annotation: Text Mining Study. J Pers Med 2022; 12:jpm12030479. [PMID: 35330478 PMCID: PMC8952229 DOI: 10.3390/jpm12030479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Revised: 03/07/2022] [Accepted: 03/08/2022] [Indexed: 11/23/2022] Open
Abstract
Within the Human Proteome Project initiative framework for creating functional annotations of uPE1 proteins, the neXt-CP50 Challenge was launched in 2018. In analogy with the missing-protein challenge, each team deciphers the functional features of the proteins in the chromosome-centric mode. However, the neXt-CP50 Challenge is more complicated than the missing-protein challenge: the approaches and methods for solving the problem are clear, but neither the concept of protein function nor specific experimental and/or bioinformatics protocols have been standardized to address it. We proposed using a retrospective analysis of the key HPP repository, the neXtProt database, to identify the most frequently used experimental and bioinformatic methods for analyzing protein functions, and the dynamics of accumulation of functional annotations. It has been shown that the number of proteins with known functions is growing faster than the experimental confirmation of the existence of questionable proteins in the framework of the missing-protein challenge. At the same time, the functional annotation is based on the guilt-by-association postulate, according to which, based on large-scale experiments on API-MS and Y2H, proteins with unknown functions are most likely mapped through “handshakes” to biochemical processes.
|
43
|
Miller RM, Jordan BT, Mehlferber MM, Jeffery ED, Chatzipantsiou C, Kaur S, Millikin RJ, Dai Y, Tiberi S, Castaldi PJ, Shortreed MR, Luckey CJ, Conesa A, Smith LM, Deslattes Mays A, Sheynkman GM. Enhanced protein isoform characterization through long-read proteogenomics. Genome Biol 2022; 23:69. [PMID: 35241129 PMCID: PMC8892804 DOI: 10.1186/s13059-022-02624-y] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 07/13/2021] [Accepted: 02/02/2022] [Indexed: 02/04/2023] Open
Abstract
BACKGROUND The detection of physiologically relevant protein isoforms encoded by the human genome is critical to biomedicine. Mass spectrometry (MS)-based proteomics is the preeminent method for protein detection, but isoform-resolved proteomic analysis relies on accurate reference databases that match the sample; neither a subset nor a superset database is ideal. Long-read RNA sequencing (e.g., PacBio or Oxford Nanopore) provides full-length transcripts which can be used to predict full-length protein isoforms. RESULTS We describe here a long-read proteogenomics approach for integrating sample-matched long-read RNA-seq and MS-based proteomics data to enhance isoform characterization. We introduce a classification scheme for protein isoforms, discover novel protein isoforms, and present the first protein inference algorithm for the direct incorporation of long-read transcriptome data to enable detection of protein isoforms previously intractable to MS-based detection. We have released an open-source Nextflow pipeline that integrates long-read sequencing in a proteomic workflow for isoform-resolved analysis. CONCLUSIONS Our work suggests that the incorporation of long-read sequencing and proteomic data can facilitate improved characterization of human protein isoform diversity. Our first-generation pipeline provides a strong foundation for future development of long-read proteogenomics and its adoption for both basic and translational research.
Affiliation(s)
- Rachel M Miller
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - Ben T Jordan
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
| | - Madison M Mehlferber
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
- Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA, USA
| | - Erin D Jeffery
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
| | | | - Simi Kaur
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - Robert J Millikin
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - Yunxiang Dai
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - Simone Tiberi
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
| | - Peter J Castaldi
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Division of General Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, USA
| | | | - Chance John Luckey
- Department of Pathology, University of Virginia, Charlottesville, VA, USA
| | - Ana Conesa
- Institute for Integrative Systems Biology, Spanish National Research Council (CSIC), Paterna, Spain
- Microbiology and Cell Science Department, Institute for Food and Agricultural Sciences, University of Florida, Gainesville, FL, USA
| | - Lloyd M Smith
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
| | - Anne Deslattes Mays
- Office of Data Science and Sharing, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Rockville, MD, USA
| | - Gloria M Sheynkman
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA.
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA.
- UVA Cancer Center, University of Virginia, Charlottesville, VA, USA.
| |
|
44
|
Rajczewski AT, Jagtap PD, Griffin TJ. An overview of technologies for MS-based proteomics-centric multi-omics. Expert Rev Proteomics 2022; 19:165-181. [PMID: 35466851 PMCID: PMC9613604 DOI: 10.1080/14789450.2022.2070476] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 01/07/2023]
Abstract
INTRODUCTION Mass spectrometry-based proteomics reveals dynamic molecular signatures underlying phenotypes reflecting normal and perturbed conditions in living systems. Although valuable on its own, the proteome represents only one level of molecular information, with the genome, epigenome, transcriptome, and metabolome all providing complementary information. Multi-omic analysis integrating information from one or more of these other domains with proteomic information provides a more complete picture of molecular contributors to dynamic biological systems. AREAS COVERED Here, we discuss the improvements to mass spectrometry-based technologies, focused on peptide-based, bottom-up approaches, that have enabled deep, quantitative characterization of complex proteomes. These advances are facilitating the integration of proteomics data with other 'omic information, providing a more complete picture of living systems. We also describe the current state of bioinformatics software and approaches for integrating proteomics and other 'omics data, critical for enabling new discoveries driven by multi-omics. EXPERT COMMENTARY Multi-omics, centered on the integration of proteomics information with other 'omic information, has tremendous promise for biological and biomedical studies. Continued advances in approaches for generating deep, reliable proteomic data and bioinformatics tools aimed at integrating data across 'omic domains will ensure the discoveries offered by these multi-omic studies continue to increase.
Affiliation(s)
- Andrew T. Rajczewski
- Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
| | - Pratik D. Jagtap
- Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
| | - Timothy J. Griffin
- Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
| |
|
45
|
Moresi F, Rossetti DV, Vincenzoni F, Simboli GA, La Rocca G, Olivi A, Urbani A, Sabatino G, Desiderio C. Investigating Glioblastoma Multiforme Sub-Proteomes: A Computational Study of CUSA Fluid Proteomic Data. Int J Mol Sci 2022; 23:ijms23042058. [PMID: 35216175 PMCID: PMC8879425 DOI: 10.3390/ijms23042058] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/23/2021] [Revised: 02/07/2022] [Accepted: 02/10/2022] [Indexed: 02/04/2023] Open
Abstract
Building on our previous proteomic study of Cavitating Ultrasound Aspirator (CUSA) fluid pools from the tumor core and periphery of Newly Diagnosed (ND) and Recurrent (R) glioblastomas (GBMs), as defined by 5-aminolevulinic acid (5-ALA) metabolite fluorescence, this work applies a bioinformatic approach to investigate three specific sub-proteomes, i.e., Not Detected in Brain (NB), Cancer Related (CR) and Extracellular Vesicles (EVs) proteins, classified according to selected databases. The study of these as yet unexplored datasets aims to understand the high infiltration capability and relapse rate that characterize this aggressive brain cancer. Out of the 587 proteins identified with high confidence in GBM CUSA pools, 53 proteins were classified as NB. Their gene ontology (GO) analysis showed the over-representation of blood coagulation and plasminogen activating cascade pathways, possibly compatible with Blood Brain Barrier damage in tumor disease and surgical bleeding. However, the NB group also included non-blood proteins and, specifically, histones correlated with oncogenesis. Concerning CR proteins, 159 proteins were found in the characterized GBM proteome. Their GO analysis highlighted the over-representation of many pathways, primarily glycolysis. Interestingly, whereas CR proteins in ND-GBM were identified exclusively in the tumor zones (fluorescence-positive core and periphery zones), as expected, in R-GBM they were unexpectedly found predominantly in the healthy zone (fluorescence-negative tumor periphery). For the EV protein classification, 60 proteins were found. EVs are over-released in tumor disease and are important in the transport of biological macromolecules. Furthermore, the presence of EVs in numerous body fluids makes them a possible minimally invasive source of brain tumor biomarkers to be investigated.
These results offer new hints about the molecular features of GBM that may help explain its aggressive behavior, and they open the way to more in-depth investigations to uncover potential disease biomarkers.
Affiliation(s)
- Fabiana Moresi
- Department of Neurosurgery, Mater Olbia Hospital, 07026 Olbia, Italy; (F.M.); (G.L.R.); (G.S.)
- Dipartimento di Scienze Biotecnologiche di Base, Cliniche Intensivologiche e Perioperatorie, Università Cattolica del Sacro Cuore, 00168 Rome, Italy; (F.V.); (A.U.)
| | - Diana Valeria Rossetti
- Istituto di Scienze e Tecnologie Chimiche “Giulio Natta”, Consiglio Nazionale delle Ricerche, 00168 Rome, Italy;
| | - Federica Vincenzoni
- Dipartimento di Scienze Biotecnologiche di Base, Cliniche Intensivologiche e Perioperatorie, Università Cattolica del Sacro Cuore, 00168 Rome, Italy; (F.V.); (A.U.)
- Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168 Rome, Italy; (G.A.S.); (A.O.)
| | - Giorgia Antonia Simboli
- Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168 Rome, Italy; (G.A.S.); (A.O.)
- Institute of Neurosurgery, Fondazione Policlinico Universitario A. Gemelli IRCCS, Catholic University, 00168 Rome, Italy
| | - Giuseppe La Rocca
- Department of Neurosurgery, Mater Olbia Hospital, 07026 Olbia, Italy; (F.M.); (G.L.R.); (G.S.)
- Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168 Rome, Italy; (G.A.S.); (A.O.)
- Institute of Neurosurgery, Fondazione Policlinico Universitario A. Gemelli IRCCS, Catholic University, 00168 Rome, Italy
| | - Alessandro Olivi
- Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168 Rome, Italy; (G.A.S.); (A.O.)
- Institute of Neurosurgery, Fondazione Policlinico Universitario A. Gemelli IRCCS, Catholic University, 00168 Rome, Italy
| | - Andrea Urbani
- Dipartimento di Scienze Biotecnologiche di Base, Cliniche Intensivologiche e Perioperatorie, Università Cattolica del Sacro Cuore, 00168 Rome, Italy; (F.V.); (A.U.)
- Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168 Rome, Italy; (G.A.S.); (A.O.)
| | - Giovanni Sabatino
- Department of Neurosurgery, Mater Olbia Hospital, 07026 Olbia, Italy; (F.M.); (G.L.R.); (G.S.)
- Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168 Rome, Italy; (G.A.S.); (A.O.)
- Institute of Neurosurgery, Fondazione Policlinico Universitario A. Gemelli IRCCS, Catholic University, 00168 Rome, Italy
| | - Claudia Desiderio
- Istituto di Scienze e Tecnologie Chimiche “Giulio Natta”, Consiglio Nazionale delle Ricerche, 00168 Rome, Italy;
- Correspondence:
| |
|
46
|
A Bioinformatics Approach to Mine the Microbial Proteomic Profile of COVID-19 Mass Spectrometry Data. Appl Microbiol 2022. [DOI: 10.3390/applmicrobiol2010010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022]
Abstract
Mass spectrometry (MS) is one of the key technologies used in proteomics. The majority of studies carried out using proteomics have focused on identifying proteins in biological samples such as human plasma to pin down prognostic or diagnostic biomarkers associated with particular conditions or diseases. This study aims to quantify microbial (viral and bacterial) proteins in healthy human plasma. MS data of healthy human plasma were searched against the complete proteomes of all available viruses and bacteria. With this baseline established, the same strategy was applied to characterize the metaproteomic profile of different SARS-CoV-2 disease stages in the plasma of patients. Two SARS-CoV-2 proteins were detected with a high confidence and could serve as the early markers of SARS-CoV-2 infection. The complete bacterial and viral protein content in SARS-CoV-2 samples was compared for the different disease stages. The number of viral proteins was found to increase significantly with the progression of the infection, at the expense of bacterial proteins. This strategy can be extended to aid in the development of early diagnostic tests for other infectious diseases based on the presence of microbial biomarkers in human plasma samples.
|
47
|
Zheng W, Yang P, Sun C, Zhang Y. Comprehensive comparison of sample preparation workflows for proteomics. Mol Omics 2022; 18:555-567. [DOI: 10.1039/d2mo00076h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
Mass spectrometry-based proteomics experiments can be subject to a large variability, which forms an obstacle to obtaining deep and accurate protein identification. Here, to obtain an optimal sample preparation workflow...
|
48
|
Bundgaard L, Åhrman E, Malmström J, Auf dem Keller U, Walters M, Jacobsen S. Effective protein extraction combined with data independent acquisition analysis reveals a comprehensive and quantifiable insight into the proteomes of articular cartilage and subchondral bone. Osteoarthritis Cartilage 2022; 30:137-146. [PMID: 34547431 DOI: 10.1016/j.joca.2021.09.006] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/25/2021] [Revised: 08/31/2021] [Accepted: 09/13/2021] [Indexed: 02/02/2023]
Abstract
OBJECTIVE The objective of this study was to establish a sensitive and reproducible method to map the cartilage and subchondral bone proteomes in quantitative terms, and to mine the proteomes for proteins of particular interest in the pathogenesis of osteoarthritis (OA). The horse was used as a model animal. DESIGN Protein was extracted from articular cartilage and subchondral bone samples from three horses in triplicate by pressure cycling technology or ultrasonication. Digested proteins were analysed by data independent acquisition-based mass spectrometry. Data were processed using a pre-established spectral library as reference database (FDR 1%). RESULTS To our knowledge, we identified the most comprehensive quantitative cartilage (1758 proteins) and subchondral bone (1482 proteins) proteomes presented to date in any species. Both extraction methods were sensitive and reproducible, and the high consistency of the identified proteomes (>97% overlap) indicated that both methods preserved the diversity among the extracted proteins. Proteome mining revealed a substantial number of quantifiable cartilage and bone matrix proteins and proteins involved in osteogenesis and bone remodeling, including ACAN, BGN, PRELP, FMOD, COMP, ACP5, BMP3, BMP6, BGLAP, TGFB1, IGF1, ALP, MMP3, and collagens. A number of proteins, including COMP and TNN, were identified in different protein isoforms with potentially unique biological roles. CONCLUSION We have successfully developed two sensitive and reproducible non-species-specific workflows enabling a comprehensive quantitative insight into the proteomes of cartilage and subchondral bone. This opens the prospect of investigating the molecular events at the osteochondral unit in the pathogenesis of OA in future projects.
Affiliation(s)
- L Bundgaard
- Section of Medicine and Surgery, Department of Veterinary Clinical Sciences, University of Copenhagen, 2630 Taastrup, Denmark. Section for Protein Science and Biotherapeutics, DTU Bioengineering, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark.
| | - E Åhrman
- Division of Infection Medicine Proteomics, Department of Clinical Sciences, Lund University, Lund 221 84, Sweden.
| | - J Malmström
- Division of Infection Medicine Proteomics, Department of Clinical Sciences, Lund University, Lund 221 84, Sweden.
| | - U Auf dem Keller
- Section for Protein Science and Biotherapeutics, DTU Bioengineering, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark.
| | - M Walters
- Section of Medicine and Surgery, Department of Veterinary Clinical Sciences, University of Copenhagen, 2630 Taastrup, Denmark.
| | - S Jacobsen
- Section of Medicine and Surgery, Department of Veterinary Clinical Sciences, University of Copenhagen, 2630 Taastrup, Denmark.
| |
|
49
|
Halder A, Verma A, Biswas D, Srivastava S. Recent advances in mass-spectrometry based proteomics software, tools and databases. Drug Discov Today Technol 2021; 39:69-79. [PMID: 34906327 DOI: 10.1016/j.ddtec.2021.06.007] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 02/10/2021] [Revised: 05/08/2021] [Accepted: 06/21/2021] [Indexed: 01/12/2023]
Abstract
The field of proteomics depends heavily on data generation and data analysis, both of which are supported by software and databases. Massive advances in mass spectrometry-based proteomics over the last 10 years have compelled the scientific community to upgrade or develop algorithms, tools, and repository databases. Several standalone software packages and comprehensive databases have aided the establishment of integrated omics pipelines and meta-analysis workflows, which have contributed to understanding disease pathobiology, discovering biomarkers, and predicting new therapeutic modalities. For shotgun proteomics, where Data Dependent Acquisition is performed, several user-friendly software packages have been developed that can analyse the pre-processed data to provide mechanistic insights into disease. Likewise, for Data Independent Acquisition, pipelines have emerged that can accomplish every task from building the spectral library to identifying therapeutic targets. Furthermore, in the age of big data analysis, machine learning and cloud computing are adding robustness, speed, and depth to proteomics data analysis. The current review discusses recent advances in and development of software, tools, and databases in the field of mass spectrometry-based proteomics.
Affiliation(s)
- Ankit Halder
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
| | - Ayushi Verma
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
| | - Deeptarup Biswas
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
| | - Sanjeeva Srivastava
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India.
| |
|
50
|
Bu F, Cheng Q, Zhang Y, Zhang X, Yan K, Liu F, Li Z, Lu X, Ren Y, Liu S. Discovery of Missing Proteins from an Aneuploidy Cell Line Using a Proteogenomic Approach. J Proteome Res 2021; 20:5329-5339. [PMID: 34748338 DOI: 10.1021/acs.jproteome.1c00772] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Abstract
With the steady development of proteomic technology, the number of missing proteins (MPs) has been continuously shrinking, with approximately 1470 MPs yet to be explored. As a result, the discovery of MPs has become increasingly difficult and elusive. To face this challenge, we hypothesized that a stable aneuploid cell line with additional chromosomes could serve as a useful material for assisting MP exploration. The Ker-CT cell line, with trisomy of chromosomes 5 and 20, was selected for this purpose. With a combined strategy of RNA-Seq and LC-MS/MS, a total of 22 178 transcripts and 8846 proteins were identified in Ker-CT. Although transcripts corresponding to 15 MP genes on chromosome 5 and 15 MP genes on chromosome 20 were detected, none of these MPs were found in Ker-CT. Surprisingly, 3 MPs containing at least two unique non-nested peptides of length ≥9 amino acids were identified in Ker-CT, whose genes are located on chromosomes 3 and 10. Furthermore, the 3 MPs were verified using parallel reaction monitoring (PRM). These results suggest that the abnormal status of chromosomes may not only impact the expression of the corresponding genes on the trisomic chromosomes but also influence that of genes on other chromosomes, which benefits MP discovery. The data obtained in this study are available via ProteomeXchange (PXD028647) and PeptideAtlas (PASS01700).
Affiliation(s)
- Fanyu Bu
- BGI-Shenzhen, Beishan Industrial Zone 11th Building, Yantian District, Shenzhen, Guangdong 518083, China; Department of BGI Education, School of Life Sciences, University of Chinese Academy of Sciences, Shenzhen, Guangdong 518083, China
| | - Qingqiu Cheng
- Clinical Laboratory Center of Dongguan Eighth People's Hospital, Dongguan 523325, China
| | - Yuxing Zhang
- BGI-Shenzhen, Beishan Industrial Zone 11th Building, Yantian District, Shenzhen, Guangdong 518083, China; Department of BGI Education, School of Life Sciences, University of Chinese Academy of Sciences, Shenzhen, Guangdong 518083, China
| | - Xia Zhang
- BGI-Shenzhen, Beishan Industrial Zone 11th Building, Yantian District, Shenzhen, Guangdong 518083, China; Department of BGI Education, School of Life Sciences, University of Chinese Academy of Sciences, Shenzhen, Guangdong 518083, China
| | - Keqiang Yan
- BGI-Shenzhen, Beishan Industrial Zone 11th Building, Yantian District, Shenzhen, Guangdong 518083, China; Department of BGI Education, School of Life Sciences, University of Chinese Academy of Sciences, Shenzhen, Guangdong 518083, China
| | - Frank Liu
- BGI-Shenzhen, Beishan Industrial Zone 11th Building, Yantian District, Shenzhen, Guangdong 518083, China
| | - Zelong Li
- Biological Resource Center of Plants, Animals and Microorganisms, China National Gene Bank, BGI-Shenzhen, Guangdong 518120, China
| | - Xiaomei Lu
- Clinical Laboratory Center of Dongguan Eighth People's Hospital, Dongguan 523325, China
| | - Yan Ren
- BGI-Shenzhen, Beishan Industrial Zone 11th Building, Yantian District, Shenzhen, Guangdong 518083, China
| | - Siqi Liu
- BGI-Shenzhen, Beishan Industrial Zone 11th Building, Yantian District, Shenzhen, Guangdong 518083, China; Department of BGI Education, School of Life Sciences, University of Chinese Academy of Sciences, Shenzhen, Guangdong 518083, China
| |
|