1
Perez-Riverol Y, Bandla C, Kundu D, Kamatchinathan S, Bai J, Hewapathirana S, John N, Prakash A, Walzer M, Wang S, Vizcaíno J. The PRIDE database at 20 years: 2025 update. Nucleic Acids Res 2025; 53:D543-D553. [PMID: 39494541 PMCID: PMC11701690 DOI: 10.1093/nar/gkae1011] [Received: 09/15/2024] [Revised: 10/11/2024] [Accepted: 10/16/2024] [Indexed: 11/05/2024]
Abstract
The PRoteomics IDEntifications (PRIDE) database (https://www.ebi.ac.uk/pride/) is the world's leading mass spectrometry (MS)-based proteomics data repository and one of the founding members of the ProteomeXchange consortium. This manuscript summarizes the developments in PRIDE resources and related tools for the last three years. The number of submitted datasets to PRIDE Archive (the archival component of PRIDE) has reached on average around 534 datasets per month. This has been possible thanks to continuous improvements in infrastructure such as a new file transfer protocol for very large datasets (Globus), a new data resubmission pipeline and an automatic dataset validation process. Additionally, we will highlight novel activities such as the availability of the PRIDE chatbot (based on the use of open-source Large Language Models), and our work to improve support for MS crosslinking datasets. Furthermore, we will describe how we have increased our efforts to reuse, reanalyze and disseminate high-quality proteomics data into added-value resources such as UniProt, Ensembl and Expression Atlas.
Affiliation(s)
- Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Chakradhar Bandla
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Deepti J Kundu
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Selvakumar Kamatchinathan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Jingwen Bai
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Suresh Hewapathirana
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Nithu Sara John
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Ananth Prakash
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Mathias Walzer
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Shengbo Wang
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
2
Lermyte F. The need for open and FAIR data in top-down proteomics. Proteomics 2024; 24:e2300354. [PMID: 38088481 DOI: 10.1002/pmic.202300354] [Received: 10/18/2023] [Accepted: 10/24/2023] [Indexed: 02/15/2024]
Abstract
In recent years, there has been a tremendous evolution in the high-throughput, tandem mass spectrometry-based analysis of intact proteins, also known as top-down proteomics (TDP). Both hardware and software have developed to the point that the technique has largely entered the mainstream, and large-scale, ambitious, multi-laboratory initiatives have started to make their appearance in the literature. For this, however, more convenient and robust data sharing and reuse will be required. Walzer et al. have created TopDownApp, a customisable, open platform for visualisation and analysis of TDP data, which they hope will be a step in this direction. As they point out, other benefits of such data sharing and interoperability would include reanalysis of published datasets, as well as the prospect of using large amounts of data to train machine learning algorithms. In time, this work could prove to be a valuable resource in the move towards a future of greater TDP data findability, accessibility, interoperability and reusability.
Affiliation(s)
- Frederik Lermyte
- Department of Chemistry, Clemens-Schöpf Institute of Organic Chemistry and Biochemistry, Technical University of Darmstadt, Darmstadt, Germany