1
Gomez-Artiguez L, de la Cámara-Fuentes S, Sun Z, Hernáez ML, Borrajo A, Pitarch A, Molero G, Monteoliva L, Moritz RL, Deutsch EW, Gil C. Candida albicans: A Comprehensive View of the Proteome. J Proteome Res 2025; 24:1636-1648. [PMID: 40084908 DOI: 10.1021/acs.jproteome.4c01020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/16/2025]
Abstract
We describe a new release of the Candida albicans PeptideAtlas proteomics spectral resource (build 2024-03), providing a sequence coverage of 79.5% at the canonical protein level, matched mass spectrometry spectra, and experimental evidence identifying 3382 and 536 phosphorylated serine and threonine sites with false localization rates of 1% and 5.3%, respectively. We provide a tutorial on how to use the PeptideAtlas and associated tools to access this information. The C. albicans PeptideAtlas summary web page provides "Build overview", "PTM coverage", "Experiment contribution", and "Data set contribution" information. The protein and peptide information can also be accessed via the Candida Genome Database via hyperlinks on each protein page. This allows users to peruse identified peptides, protein coverage, post-translational modifications (PTMs), and experiments that identify each protein. Given the value of understanding the PTM landscape in the sequence of each protein, a more detailed explanation of how to interpret and analyze PTM results is provided in the PeptideAtlas of this important pathogen. Candida albicans PeptideAtlas web page: https://db.systemsbiology.net/sbeams/cgi/PeptideAtlas/buildDetails?atlas_build_id=578.
Affiliation(s)
- Leticia Gomez-Artiguez
- Microbiology and Parasitology Department, Faculty of Pharmacy, Complutense University of Madrid, 28040 Madrid, Spain
- Zhi Sun
- Institute for Systems Biology, 401 Terry Ave North, Seattle, Washington 98109, United States
- María Luisa Hernáez
- Proteomics Unit, Faculty of Pharmacy, Complutense University of Madrid, 28040 Madrid, Spain
- Ana Borrajo
- Microbiology and Parasitology Department, Faculty of Pharmacy, Complutense University of Madrid, 28040 Madrid, Spain
- Aída Pitarch
- Microbiology and Parasitology Department, Faculty of Pharmacy, Complutense University of Madrid, 28040 Madrid, Spain
- Gloria Molero
- Microbiology and Parasitology Department, Faculty of Pharmacy, Complutense University of Madrid, 28040 Madrid, Spain
- Lucía Monteoliva
- Microbiology and Parasitology Department, Faculty of Pharmacy, Complutense University of Madrid, 28040 Madrid, Spain
- Robert L Moritz
- Institute for Systems Biology, 401 Terry Ave North, Seattle, Washington 98109, United States
- Eric W Deutsch
- Institute for Systems Biology, 401 Terry Ave North, Seattle, Washington 98109, United States
- Concha Gil
- Microbiology and Parasitology Department, Faculty of Pharmacy, Complutense University of Madrid, 28040 Madrid, Spain
- Proteomics Unit, Faculty of Pharmacy, Complutense University of Madrid, 28040 Madrid, Spain
2
Gomez-Artiguez L, de la Cámara-Fuentes S, Sun Z, Hernáez ML, Borrajo A, Pitarch A, Molero G, Monteoliva L, Moritz RL, Deutsch EW, Gil C. Candida albicans: a comprehensive view of the proteome. bioRxiv 2025:2024.12.20.629377. [PMID: 39763837 PMCID: PMC11702768 DOI: 10.1101/2024.12.20.629377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/15/2025]
Abstract
We describe a new release of the Candida albicans PeptideAtlas proteomics spectral resource (build 2024-03), providing a sequence coverage of 79.5% at the canonical protein level, matched mass spectrometry spectra, and experimental evidence identifying 3382 and 536 phosphorylated serine and threonine sites with false localization rates of 1% and 5.3%, respectively. We provide a tutorial on how to use the PeptideAtlas and associated tools to access this information. The C. albicans PeptideAtlas summary web page provides "Build overview", "PTM coverage", "Experiment contribution", and "Dataset contribution" information. The protein and peptide information can also be accessed via the Candida Genome Database via hyperlinks on each protein page. This allows users to peruse identified peptides, protein coverage, post-translational modifications (PTMs), and experiments identifying each protein. Given the value of understanding the PTM landscape in the sequence of each protein, a more detailed explanation of how to interpret and analyse PTM results is provided in the PeptideAtlas of this important pathogen. Candida albicans PeptideAtlas web page: https://db.systemsbiology.net/sbeams/cgi/PeptideAtlas/buildDetails?atlas_build_id=578.
Affiliation(s)
- Leticia Gomez-Artiguez
- Microbiology and Parasitology Department, Faculty of Pharmacy, Complutense University of Madrid, 28040 Madrid
- Zhi Sun
- Institute for Systems Biology, 401 Terry Ave North, Seattle, WA 98109, USA
- María Luisa Hernáez
- Proteomics Unit, Faculty of Pharmacy, Complutense University of Madrid, 28040 Madrid
- Ana Borrajo
- Microbiology and Parasitology Department, Faculty of Pharmacy, Complutense University of Madrid, 28040 Madrid
- Aída Pitarch
- Microbiology and Parasitology Department, Faculty of Pharmacy, Complutense University of Madrid, 28040 Madrid
- Gloria Molero
- Microbiology and Parasitology Department, Faculty of Pharmacy, Complutense University of Madrid, 28040 Madrid
- Lucía Monteoliva
- Microbiology and Parasitology Department, Faculty of Pharmacy, Complutense University of Madrid, 28040 Madrid
- Robert L. Moritz
- Institute for Systems Biology, 401 Terry Ave North, Seattle, WA 98109, USA
- Eric W. Deutsch
- Institute for Systems Biology, 401 Terry Ave North, Seattle, WA 98109, USA
- Concha Gil
- Microbiology and Parasitology Department, Faculty of Pharmacy, Complutense University of Madrid, 28040 Madrid
- Proteomics Unit, Faculty of Pharmacy, Complutense University of Madrid, 28040 Madrid
3
Reddy PJ, Sun Z, Wippel HH, Baxter DH, Swearingen K, Shteynberg DD, Midha MK, Caimano MJ, Strle K, Choi Y, Chan AP, Schork NJ, Varela-Stokes AS, Moritz RL. Borrelia PeptideAtlas: A proteome resource of common Borrelia burgdorferi isolates for Lyme research. Sci Data 2024; 11:1313. [PMID: 39622905 PMCID: PMC11612207 DOI: 10.1038/s41597-024-04047-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Accepted: 10/28/2024] [Indexed: 12/06/2024] Open
Abstract
Lyme disease is caused by an infection with the spirochete Borrelia burgdorferi, and is the most common vector-borne disease in North America. B. burgdorferi isolates harbor extensive genomic and proteomic variability and further comparison of isolates is key to understanding the infectivity of the spirochetes and biological impacts of identified sequence variants. Here, we applied both transcriptome analysis and mass spectrometry-based proteomics to assemble peptide datasets of B. burgdorferi laboratory isolates B31, MM1, and the infective isolate B31-5A4, to provide a publicly available Borrelia PeptideAtlas. Included are total proteome, secretome, and membrane proteome identifications of the individual isolates. Proteomic data collected from 35 different experiment datasets, totaling 386 mass spectrometry runs, have identified 81,967 distinct peptides, which map to 1,113 proteins. The Borrelia PeptideAtlas covers 86% of the total B31 proteome of 1,291 protein sequences. The Borrelia PeptideAtlas is an extensible comprehensive peptide repository with proteomic information from B. burgdorferi isolates useful for Lyme disease research.
Affiliation(s)
- Panga J Reddy
- Institute for Systems Biology, Seattle, Washington, USA
- Zhi Sun
- Institute for Systems Biology, Seattle, Washington, USA
- Mukul K Midha
- Institute for Systems Biology, Seattle, Washington, USA
- Klemen Strle
- Department of Molecular Biology and Microbiology, Tufts University School of Medicine, Boston, Massachusetts, USA
- Yongwook Choi
- Translational Genomics Research Institute, Phoenix, Arizona, USA
- Agnes P Chan
- Translational Genomics Research Institute, Phoenix, Arizona, USA
- Andrea S Varela-Stokes
- Tufts University Cummings School of Veterinary Medicine, Department of Comparative Pathobiology, Grafton, MA, 01536, USA
4
van Wijk KJ, Leppert T, Sun Z, Kearly A, Li M, Mendoza L, Guzchenko I, Debley E, Sauermann G, Routray P, Malhotra S, Nelson A, Sun Q, Deutsch EW. Detection of the Arabidopsis Proteome and Its Post-translational Modifications and the Nature of the Unobserved (Dark) Proteome in PeptideAtlas. J Proteome Res 2024; 23:185-214. [PMID: 38104260 DOI: 10.1021/acs.jproteome.3c00536] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Indexed: 12/19/2023]
Abstract
This study describes a new release of the Arabidopsis thaliana PeptideAtlas proteomics resource (build 2023-10) providing protein sequence coverage, matched mass spectrometry (MS) spectra, selected post-translational modifications (PTMs), and metadata. 70 million MS/MS spectra were matched to the Araport11 annotation, identifying ∼0.6 million unique peptides and 18,267 proteins at the highest confidence level and 3396 lower confidence proteins, together representing 78.6% of the predicted proteome. Additional identified proteins not predicted in Araport11 should be considered for the next Arabidopsis genome annotation. This release identified 5198 phosphorylated proteins, 668 ubiquitinated proteins, 3050 N-terminally acetylated proteins, and 864 lysine-acetylated proteins and mapped their PTM sites. MS support was lacking for 21.4% (5896 proteins) of the predicted Araport11 proteome: the "dark" proteome. This dark proteome is highly enriched for E3 ligases, transcription factors, and for certain (e.g., CLE, IDA, PSY) but not other (e.g., THIONIN, CAP) signaling peptides families. A machine learning model trained on RNA expression data and protein properties predicts the probability that proteins will be detected. The model aids in discovery of proteins with short half-life (e.g., SIG1,3 and ERF-VII TFs) and for developing strategies to identify the missing proteins. PeptideAtlas is linked to TAIR, tracks in JBrowse, and several other community proteomics resources.
Affiliation(s)
- Klaas J van Wijk
- Section of Plant Biology, School of Integrative Plant Sciences (SIPS), Cornell University, Ithaca, New York 14853, United States
- Tami Leppert
- Institute for Systems Biology (ISB), Seattle, Washington 98109, United States
- Zhi Sun
- Institute for Systems Biology (ISB), Seattle, Washington 98109, United States
- Alyssa Kearly
- Boyce Thompson Institute, Ithaca, New York 14853, United States
- Margaret Li
- Institute for Systems Biology (ISB), Seattle, Washington 98109, United States
- Luis Mendoza
- Institute for Systems Biology (ISB), Seattle, Washington 98109, United States
- Isabell Guzchenko
- Section of Plant Biology, School of Integrative Plant Sciences (SIPS), Cornell University, Ithaca, New York 14853, United States
- Erica Debley
- Section of Plant Biology, School of Integrative Plant Sciences (SIPS), Cornell University, Ithaca, New York 14853, United States
- Georgia Sauermann
- Section of Plant Biology, School of Integrative Plant Sciences (SIPS), Cornell University, Ithaca, New York 14853, United States
- Pratyush Routray
- Section of Plant Biology, School of Integrative Plant Sciences (SIPS), Cornell University, Ithaca, New York 14853, United States
- Sagunya Malhotra
- Institute for Systems Biology (ISB), Seattle, Washington 98109, United States
- Andrew Nelson
- Boyce Thompson Institute, Ithaca, New York 14853, United States
- Qi Sun
- Computational Biology Service Unit, Cornell University, Ithaca, New York 14853, United States
- Eric W Deutsch
- Institute for Systems Biology (ISB), Seattle, Washington 98109, United States
5
Reddy PJ, Sun Z, Wippel HH, Baxter D, Swearingen K, Shteynberg DD, Midha MK, Caimano MJ, Strle K, Choi Y, Chan AP, Schork NJ, Moritz RL. Borrelia PeptideAtlas: A proteome resource of common Borrelia burgdorferi isolates for Lyme research. bioRxiv 2023:2023.06.16.545244. [PMID: 37398146 PMCID: PMC10312716 DOI: 10.1101/2023.06.16.545244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Lyme disease, caused by an infection with the spirochete Borrelia burgdorferi, is the most common vector-borne disease in North America. B. burgdorferi strains harbor extensive genomic and proteomic variability, and further comparison is key to understanding the spirochetes' infectivity and the biological impacts of identified sequence variants. To achieve this goal, both transcript analysis and mass spectrometry (MS)-based proteomics were applied to assemble peptide datasets of laboratory strains B31, MM1, B31-ML23, infective isolates B31-5A4, B31-A3, and 297, and other public datasets, to provide a publicly available Borrelia PeptideAtlas (http://www.peptideatlas.org/builds/borrelia/). Included is information on the total proteome, secretome, and membrane proteome of these B. burgdorferi strains. Proteomic data collected from 35 different experiment datasets, with a total of 855 mass spectrometry runs, identified 76,936 distinct peptides at a 0.1% peptide false-discovery rate, which map to 1,221 canonical proteins (924 core canonical and 297 noncore canonical) and cover 86% of the total base B31 proteome. The diverse proteomic information from multiple isolates with credible data presented by the Borrelia PeptideAtlas can be useful to pinpoint potential protein targets that are common to infective isolates and may be key in the infection process.
Affiliation(s)
- Zhi Sun
- Institute for Systems Biology, Seattle, Washington, USA
- David Baxter
- Institute for Systems Biology, Seattle, Washington, USA
- Klemen Strle
- Department of Molecular Biology and Microbiology, Tufts University School of Medicine, Boston, Massachusetts, USA
- Yongwook Choi
- Translational Genomics Research Institute, Phoenix, Arizona, USA
- Agnes P. Chan
- Translational Genomics Research Institute, Phoenix, Arizona, USA
6
van Wijk KJ, Leppert T, Sun Z, Kearly A, Li M, Mendoza L, Guzchenko I, Debley E, Sauermann G, Routray P, Malhotra S, Nelson A, Sun Q, Deutsch EW. Mapping the Arabidopsis thaliana proteome in PeptideAtlas and the nature of the unobserved (dark) proteome; strategies towards a complete proteome. bioRxiv 2023:2023.06.01.543322. [PMID: 37333403 PMCID: PMC10274743 DOI: 10.1101/2023.06.01.543322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2023]
Abstract
This study describes a new release of the Arabidopsis thaliana PeptideAtlas proteomics resource providing protein sequence coverage, matched mass spectrometry (MS) spectra, selected PTMs, and metadata. 70 million MS/MS spectra were matched to the Araport11 annotation, identifying ∼0.6 million unique peptides and 18,267 proteins at the highest confidence level and 3396 lower confidence proteins, together representing 78.6% of the predicted proteome. Additional identified proteins not predicted in Araport11 should be considered for building the next Arabidopsis genome annotation. This release identified 5198 phosphorylated proteins, 668 ubiquitinated proteins, 3050 N-terminally acetylated proteins and 864 lysine-acetylated proteins and mapped their PTM sites. MS support was lacking for 21.4% (5896 proteins) of the predicted Araport11 proteome - the 'dark' proteome. This dark proteome is highly enriched for certain (e.g. CLE, CEP, IDA, PSY) but not other (e.g. THIONIN, CAP) signaling peptides families, E3 ligases, TFs, and other proteins with unfavorable physicochemical properties. A machine learning model trained on RNA expression data and protein properties predicts the probability for proteins to be detected. The model aids in discovery of proteins with short half-life (e.g. SIG1,3 and ERF-VII TFs) and completing the proteome. PeptideAtlas is linked to TAIR, JBrowse, PPDB, SUBA, UniProtKB and Plant PTM Viewer.
7
van Wijk KJ, Leppert T, Sun Q, Boguraev SS, Sun Z, Mendoza L, Deutsch EW. The Arabidopsis PeptideAtlas: Harnessing worldwide proteomics data to create a comprehensive community proteomics resource. Plant Cell 2021; 33:3421-3453. [PMID: 34411258 PMCID: PMC8566204 DOI: 10.1093/plcell/koab211] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 05/03/2021] [Accepted: 08/13/2021] [Indexed: 05/02/2023]
Abstract
We developed a resource, the Arabidopsis PeptideAtlas (www.peptideatlas.org/builds/arabidopsis/), to solve central questions about the Arabidopsis thaliana proteome, such as the significance of protein splice forms and post-translational modifications (PTMs), or simply to obtain reliable information about specific proteins. PeptideAtlas is based on published mass spectrometry (MS) data collected through ProteomeXchange and reanalyzed through a uniform processing and metadata annotation pipeline. All matched MS-derived peptide data are linked to spectral, technical, and biological metadata. Nearly 40 million out of ∼143 million MS/MS (tandem MS) spectra were matched to the reference genome Araport11, identifying ∼0.5 million unique peptides and 17,858 uniquely identified proteins (only isoform per gene) at the highest confidence level (false discovery rate 0.0004; 2 non-nested peptides ≥9 amino acid each), assigned canonical proteins, and 3,543 lower-confidence proteins. Physicochemical protein properties were evaluated for targeted identification of unobserved proteins. Additional proteins and isoforms currently not in Araport11 were identified that were generated from pseudogenes, alternative start, stops, and/or splice variants, and small Open Reading Frames; these features should be considered when updating the Arabidopsis genome. Phosphorylation can be inspected through a sophisticated PTM viewer. PeptideAtlas is integrated with community resources including TAIR, tracks in JBrowse, PPDB, and UniProtKB. Subsequent PeptideAtlas builds will incorporate millions more MS/MS data.
Affiliation(s)
- Klaas J van Wijk
- Section of Plant Biology, School of Integrative Plant Sciences (SIPS), Cornell University, Ithaca, New York 14853, USA
- Authors for correspondence: (K.J.V.W.), (E.W.D.)
- Tami Leppert
- Institute for Systems Biology (ISB), Seattle, Washington 98109, USA
- Qi Sun
- Computational Biology Service Unit, Cornell University, Ithaca, New York 14853, USA
- Sascha S Boguraev
- Section of Plant Biology, School of Integrative Plant Sciences (SIPS), Cornell University, Ithaca, New York 14853, USA
- Zhi Sun
- Institute for Systems Biology (ISB), Seattle, Washington 98109, USA
- Luis Mendoza
- Institute for Systems Biology (ISB), Seattle, Washington 98109, USA
- Eric W Deutsch
- Institute for Systems Biology (ISB), Seattle, Washington 98109, USA
- Authors for correspondence: (K.J.V.W.), (E.W.D.)
8
Tholey A, Taylor NL, Heazlewood JL, Bendixen E. We Are Not Alone: The iMOP Initiative and Its Roles in a Biology- and Disease-Driven Human Proteome Project. J Proteome Res 2017; 16:4273-4280. [DOI: 10.1021/acs.jproteome.7b00408] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 02/07/2023]
Affiliation(s)
- Andreas Tholey
- Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, 24105 Kiel, Germany
- Nicolas L. Taylor
- Australian Research Council Centre of Excellence in Plant Energy Biology, School of Molecular Sciences and Institute of Agriculture, The University of Western Australia, Crawley, Western Australia 6009, Australia
- Joshua L. Heazlewood
- School of BioSciences, The University of Melbourne, Parkville, Victoria 3010, Australia
- Emøke Bendixen
- Department of Molecular Biology and Genetics, Faculty of Science and Technology, Aarhus University, 8000 Aarhus, Denmark