1. Jiang Y, Rex DA, Schuster D, Neely BA, Rosano GL, Volkmar N, Momenzadeh A, Peters-Clarke TM, Egbert SB, Kreimer S, Doud EH, Crook OM, Yadav AK, Vanuopadath M, Hegeman AD, Mayta M, Duboff AG, Riley NM, Moritz RL, Meyer JG. Comprehensive Overview of Bottom-Up Proteomics Using Mass Spectrometry. ACS MEASUREMENT SCIENCE AU 2024; 4:338-417. [PMID: 39193565 PMCID: PMC11348894 DOI: 10.1021/acsmeasuresciau.3c00068] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/15/2023] [Revised: 05/03/2024] [Accepted: 05/03/2024] [Indexed: 08/29/2024]
Abstract
Proteomics is the large scale study of protein structure and function from biological systems through protein identification and quantification. "Shotgun proteomics" or "bottom-up proteomics" is the prevailing strategy, in which proteins are hydrolyzed into peptides that are analyzed by mass spectrometry. Proteomics studies can be applied to diverse studies ranging from simple protein identification to studies of proteoforms, protein-protein interactions, protein structural alterations, absolute and relative protein quantification, post-translational modifications, and protein stability. To enable this range of different experiments, there are diverse strategies for proteome analysis. The nuances of how proteomic workflows differ may be challenging to understand for new practitioners. Here, we provide a comprehensive overview of different proteomics methods. We cover from biochemistry basics and protein extraction to biological interpretation and orthogonal validation. We expect this Review will serve as a handbook for researchers who are new to the field of bottom-up proteomics.
Affiliation(s)
- Yuming Jiang
  - Department of Computational Biomedicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Devasahayam Arokia Balaya Rex
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- Dina Schuster
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich 8093, Switzerland
  - Department of Biology, Institute of Molecular Biology and Biophysics, ETH Zurich, Zurich 8093, Switzerland
  - Laboratory of Biomolecular Research, Division of Biology and Chemistry, Paul Scherrer Institute, Villigen 5232, Switzerland
- Benjamin A. Neely
  - Chemical Sciences Division, National Institute of Standards and Technology, NIST, Charleston, South Carolina 29412, United States
- Germán L. Rosano
  - Mass Spectrometry Unit, Institute of Molecular and Cellular Biology of Rosario, Rosario, 2000 Argentina
- Norbert Volkmar
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich 8093, Switzerland
- Amanda Momenzadeh
  - Department of Computational Biomedicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Trenton M. Peters-Clarke
  - Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, 94158, United States
- Susan B. Egbert
  - Department of Chemistry, University of Manitoba, Winnipeg, Manitoba, R3T 2N2 Canada
- Simion Kreimer
  - Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Emma H. Doud
  - Center for Proteome Analysis, Indiana University School of Medicine, Indianapolis, Indiana, 46202-3082, United States
- Oliver M. Crook
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
- Amit Kumar Yadav
  - Translational Health Science and Technology Institute, NCR Biotech Science Cluster 3rd Milestone Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
- Adrian D. Hegeman
  - Departments of Horticultural Science and Plant and Microbial Biology, University of Minnesota, Twin Cities, Minnesota 55108, United States
- Martín L. Mayta
  - School of Medicine and Health Sciences, Center for Health Sciences Research, Universidad Adventista del Plata, Libertador San Martin 3103, Argentina
  - Molecular Biology Department, School of Pharmacy and Biochemistry, Universidad Nacional de Rosario, Rosario 2000, Argentina
- Anna G. Duboff
  - Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
- Nicholas M. Riley
  - Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
- Robert L. Moritz
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Jesse G. Meyer
  - Department of Computational Biomedicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
2. Erban T, Sopko B. Understanding bacterial pathogen diversity: A proteogenomic analysis and use of an array of genome assemblies to identify novel virulence factors of the honey bee bacterial pathogen Paenibacillus larvae. Proteomics 2024; 24:e2300280. [PMID: 38742951 DOI: 10.1002/pmic.202300280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 03/07/2024] [Accepted: 04/08/2024] [Indexed: 05/16/2024]
Abstract
Mass spectrometry proteomics data are typically evaluated against publicly available annotated sequences, but the proteogenomics approach is a useful alternative. A single genome is commonly utilized in custom proteomic and proteogenomic data analysis. We pose the question of whether utilizing numerous different genome assemblies in a search database would be beneficial. We reanalyzed raw data from the exoprotein fraction of four reference Enterobacterial Repetitive Intergenic Consensus (ERIC) I-IV genotypes of the honey bee bacterial pathogen Paenibacillus larvae and evaluated them against three reference databases (from NCBI-protein, RefSeq, and UniProt) together with an array of protein sequences generated by six-frame direct translation of 15 genome assemblies from GenBank. The wide search yielded 453 protein hits/groups, which UpSet analysis categorized into 50 groups based on the success of protein identification by the 18 database components. Nine hits that were not identified by a unique peptide were not considered for marker selection, which discarded the only protein that was not identified by the reference databases. We propose that the variability in successful identifications between genome assemblies is useful for marker mining. The results suggest that various strains of P. larvae can exhibit specific traits that set them apart from the established genotypes ERIC I-V.
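The six-frame translation step mentioned in this abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names are hypothetical, the standard codon assignments are assumed, and real workflows usually also split the translations at stop codons and filter by length before building a search database.

```python
from itertools import product

# Standard codon assignments, built in TCAG order (assumption: no alternative code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_translate(dna):
    """Translate a DNA sequence in all six reading frames.

    Returns a dict keyed by frame label ("+1".."+3", "-1".."-3").
    Codons containing unknown bases are translated as "X".
    """
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
            frames[f"{strand}{offset + 1}"] = "".join(
                CODON_TABLE.get(codon, "X") for codon in codons
            )
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translate("ATGGCCTAA").items():
        print(label, protein)
```

Each genome assembly translated this way contributes six protein-like sequences per contig, which is why such databases grow quickly compared with curated reference proteomes.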
Affiliation(s)
- Tomas Erban
  - Proteomics and Metabolomics Laboratory, Crop Research Institute, Prague, Czechia
- Bruno Sopko
  - Proteomics and Metabolomics Laboratory, Crop Research Institute, Prague, Czechia
3. Chiang Y, Welker F, Collins MJ. Spectra without stories: reporting 94% dark and unidentified ancient proteomes. OPEN RESEARCH EUROPE 2024; 4:71. [PMID: 38903702 PMCID: PMC11187534 DOI: 10.12688/openreseurope.17225.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/15/2024] [Indexed: 06/22/2024]
Abstract
Background: Data-dependent, bottom-up proteomics is widely used for identifying proteins and peptides. However, one key challenge is that 70% of fragment ion spectra consistently fail to be assigned by conventional database searching. This 'dark matter' of bottom-up proteomics seems to affect fields where non-model organisms, low-abundance proteins, non-tryptic peptides, and complex modifications may be present. While palaeoproteomics may appear as a niche field, understanding and reporting unidentified ancient spectra require collaborative innovation in bioinformatics strategies. This may advance the analysis of complex datasets.
Methods: 14.97 million high-impact ancient spectra published in Nature and Science portfolios were mined from public repositories. Identification rates, defined as the proportion of assigned fragment ion spectra, were collected as part of deposited database search outputs or parsed using open-source python packages.
Results and Conclusions: We report that typically 94% of the published ancient spectra remain unidentified. This phenomenon may be caused by multiple factors, notably the limitations of database searching and the selection of user-defined reference data with advanced modification patterns. These 'spectra without stories' highlight the need for widespread data sharing to facilitate methodological development and minimise the loss of often irreplaceable ancient materials. Testing and validating alternative search strategies, such as open searching and de novo sequencing, may also improve overall identification rates. Hence, lessons learnt in palaeoproteomics may benefit other fields grappling with challenging data.
Affiliation(s)
- Yun Chiang
  - Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - The Nice Institute of Chemistry, Universite Cote d'Azur, Nice, France
- Frido Welker
  - Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Matthew James Collins
  - Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - McDonald Institute for Archaeological Research, University of Cambridge, Cambridge, England, UK
4. Miravet-Verde S, Mazzolini R, Segura-Morales C, Broto A, Lluch-Senar M, Serrano L. ProTInSeq: transposon insertion tracking by ultra-deep DNA sequencing to identify translated large and small ORFs. Nat Commun 2024; 15:2091. [PMID: 38453908 PMCID: PMC10920889 DOI: 10.1038/s41467-024-46112-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2022] [Accepted: 02/14/2024] [Indexed: 03/09/2024] Open
Abstract
Identifying open reading frames (ORFs) being translated is not a trivial task. ProTInSeq is a technique designed to characterize proteomes by sequencing transposon insertions engineered to express a selection marker when they occur in-frame within a protein-coding gene. In the bacterium Mycoplasma pneumoniae, ProTInSeq identifies 83% of its annotated proteins, along with 5 proteins and 153 small ORF-encoded proteins (SEPs; ≤100 aa) that were not previously annotated. Moreover, ProTInSeq can be utilized for detecting translational noise, as well as for relative quantification and transmembrane topology estimation of fitness and non-essential proteins. By integrating various identification approaches, the number of initially annotated SEPs in this bacterium increases from 27 to 329, with a quarter of them predicted to possess antimicrobial potential. Herein, we describe a methodology complementary to Ribo-Seq and mass spectroscopy that can identify SEPs while providing other insights in a proteome with a flexible and cost-effective DNA ultra-deep sequencing approach.
Affiliation(s)
- Samuel Miravet-Verde
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003, Barcelona, Spain
  - Department of Biology, Institute of Microbiology and Swiss Institute of Bioinformatics, ETH Zurich, Zurich, Switzerland
- Carolina Segura-Morales
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003, Barcelona, Spain
- Alicia Broto
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003, Barcelona, Spain
- Maria Lluch-Senar
  - Pulmobiotics, Dr Aiguader 88, 08003, Barcelona, Spain
  - Institute of Biotechnology and Biomedicine "Vicent Villar Palasi" (IBB), Universitat Autònoma de Barcelona, Barcelona, Spain
- Luis Serrano
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr Aiguader 88, 08003, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - ICREA, Pg. Lluis Companys 23, 08010, Barcelona, Spain
5. Shi M, Evans CA, McQuillan JL, Noirel J, Pandhal J. LFQRatio: A Normalization Method to Decipher Quantitative Proteome Changes in Microbial Coculture Systems. J Proteome Res 2024; 23:999-1013. [PMID: 38354288 PMCID: PMC10913063 DOI: 10.1021/acs.jproteome.3c00714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 01/24/2024] [Accepted: 01/31/2024] [Indexed: 02/16/2024]
Abstract
The value of synthetic microbial communities in biotechnology is gaining traction due to their ability to undertake more complex metabolic tasks than monocultures. However, a thorough understanding of strain interactions, productivity, and stability is often required to optimize growth and scale up cultivation. Quantitative proteomics can provide valuable insights into how microbial strains adapt to changing conditions in biomanufacturing. However, current workflows and methodologies are not suitable for simple artificial coculture systems where strain ratios are dynamic. Here, we established a workflow for coculture proteomics using an exemplar system containing two members, Azotobacter vinelandii and Synechococcus elongatus. Factors affecting the quantitative accuracy of coculture proteomics were investigated, including peptide physicochemical characteristics such as molecular weight, isoelectric point, hydrophobicity, and dynamic range as well as factors relating to protein identification such as varying proteome size and shared peptides between species. Different quantification methods based on spectral counts and intensity were evaluated at the protein and cell level. We propose a new normalization method, named "LFQRatio", to reflect the relative contributions of two distinct cell types emerging from cell ratio changes during cocultivation. LFQRatio can be applied to real coculture proteomics experiments, providing accurate insights into quantitative proteome changes in each strain.
Affiliation(s)
- Mengxun Shi
  - Department of Chemical and Biological Engineering, The University of Sheffield, Mappin Street, Sheffield S1 3JD, U.K.
- Caroline A. Evans
  - Department of Chemical and Biological Engineering, The University of Sheffield, Mappin Street, Sheffield S1 3JD, U.K.
- Josie L. McQuillan
  - Department of Chemical and Biological Engineering, The University of Sheffield, Mappin Street, Sheffield S1 3JD, U.K.
- Josselin Noirel
  - GBCM Laboratory (EA7528), Conservatoire National des Arts et Métiers, HESAM Université, 2 rue Conté, Paris 75003, France
- Jagroop Pandhal
  - Department of Chemical and Biological Engineering, The University of Sheffield, Mappin Street, Sheffield S1 3JD, U.K.
6. Pandeswari PB, Isaac AE, Sabareesh V. Database Creator for Mass Analysis of Peptides and Proteins, DC-MAPP: A Standalone Tool for Simplifying Manual Analysis of Mass Spectral Data to Identify Peptide/Protein Sequences. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2023; 34:1962-1969. [PMID: 37526995 DOI: 10.1021/jasms.3c00030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/03/2023]
Abstract
Proteomic studies typically involve the use of different types of software for annotating experimental tandem mass spectrometric data (MS/MS) and thereby simplifying the process of peptide and protein identification. For such annotations, these softwares calculate the m/z values of the peptide/protein precursor and fragment ions, for which a database of protein sequences must be provided as an input file. The calculated m/z values are stored as another database, which the user usually cannot view. Database Creator for Mass Analysis of Peptides and Proteins (DC-MAPP) is a novel standalone software that can create custom databases for "viewing" the calculated m/z values of precursor and fragment ions, prior to the database search. It contains three modules. Peptide/Protein sequences as per user's choice can be entered as input to the first module for creating a custom database. In the second module, m/z values must be queried-in, which are searched within the custom database to identify protein/peptide sequences. The third module is suited for peptide mass fingerprinting, which can be used to analyze both ESI and MALDI mass spectral data. The feature of "viewing" the custom database can be helpful not only for better understanding the search engine processes, but also for designing multiple reaction monitoring (MRM) methods. Post-translational modifications and protein isoforms can also be analyzed. Since, DC-MAPP relies on the protein/peptide "sequences" for creating custom databases, it may not be applicable for the searches involving spectral libraries. Python language was used for implementation, and the graphical user interface was built with Page/Tcl, making this tool more user-friendly. It is freely available at https://vit.ac.in/DC-MAPP/.
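The precursor and fragment m/z calculation that this abstract refers to can be sketched in a few lines of Python. This is an illustrative sketch, not DC-MAPP's code: the function names are hypothetical, only unmodified peptides made of the 20 standard residues are handled, and monoisotopic masses are assumed.

```python
# Monoisotopic residue masses in Da for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added for the free N- and C-termini
PROTON = 1.007276   # mass of the charge carrier


def precursor_mz(peptide, charge=1):
    """m/z of the protonated peptide at the given positive charge state."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def fragment_ions(peptide):
    """Singly charged b- and y-ion m/z values, ordered b1..b(n-1) and y1..y(n-1)."""
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(RESIDUE_MASS[aa] for aa in peptide[:i]) + PROTON)
        y_ions.append(sum(RESIDUE_MASS[aa] for aa in peptide[-i:]) + WATER + PROTON)
    return b_ions, y_ions


if __name__ == "__main__":
    print(precursor_mz("PEPTIDE", charge=2))
    print(fragment_ions("PEPTIDE"))
```

A table of such values for every peptide in a sequence database is, in essence, the kind of lookup that a query m/z can be matched against within a mass tolerance.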
Affiliation(s)
- Pandi Boomathi Pandeswari
  - Centre for Bio-Separation Technology (CBST), Vellore Institute of Technology (VIT), Vellore, Tamil Nadu - 632014, India
- Arnold Emerson Isaac
  - Bioinformatics Programming Laboratory, School of Bio Sciences & Technology (SBST), VIT, Vellore, Tamil Nadu - 632014, India
- Varatharajan Sabareesh
  - Centre for Bio-Separation Technology (CBST), Vellore Institute of Technology (VIT), Vellore, Tamil Nadu - 632014, India
7. Spermatozoa and seminal plasma proteomics: too many molecules, too few markers. The case of bovine and porcine semen. Anim Reprod Sci 2022; 247:107075. [DOI: 10.1016/j.anireprosci.2022.107075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Revised: 08/06/2022] [Accepted: 09/20/2022] [Indexed: 11/22/2022]
|
8
|
Abstract
Paleoproteomics, the study of ancient proteins, is a rapidly growing field at the intersection of molecular biology, paleontology, archaeology, paleoecology, and history. Paleoproteomics research leverages the longevity and diversity of proteins to explore fundamental questions about the past. While its origins predate the characterization of DNA, it was only with the advent of soft ionization mass spectrometry that the study of ancient proteins became truly feasible. Technological gains over the past 20 years have allowed increasing opportunities to better understand preservation, degradation, and recovery of the rich bioarchive of ancient proteins found in the archaeological and paleontological records. Growing from a handful of studies in the 1990s on individual highly abundant ancient proteins, paleoproteomics today is an expanding field with diverse applications ranging from the taxonomic identification of highly fragmented bones and shells and the phylogenetic resolution of extinct species to the exploration of past cuisines from dental calculus and pottery food crusts and the characterization of past diseases. More broadly, these studies have opened new doors in understanding past human-animal interactions, the reconstruction of past environments and environmental changes, the expansion of the hominin fossil record through large scale screening of nondiagnostic bone fragments, and the phylogenetic resolution of the vertebrate fossil record. Even with these advances, much of the ancient proteomic record still remains unexplored. Here we provide an overview of the history of the field, a summary of the major methods and applications currently in use, and a critical evaluation of current challenges. 
We conclude by looking to the future, for which innovative solutions and emerging technology will play an important role in enabling us to access the still unexplored "dark" proteome, allowing for a fuller understanding of the role ancient proteins can play in the interpretation of the past.
Affiliation(s)
- Christina Warinner
  - Department of Anthropology, Harvard University, Cambridge, Massachusetts 02138, United States
  - Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
- Kristine Korzow Richter
  - Department of Anthropology, Harvard University, Cambridge, Massachusetts 02138, United States
- Matthew J. Collins
  - Department of Archaeology, Cambridge University, Cambridge CB2 3DZ, United Kingdom
  - Section for Evolutionary Genomics, Globe Institute, University of Copenhagen, Copenhagen 1350, Denmark
9. Aggarwal S, Raj A, Kumar D, Dash D, Yadav AK. False discovery rate: the Achilles' heel of proteogenomics. Brief Bioinform 2022; 23:6582880. [PMID: 35534181 DOI: 10.1093/bib/bbac163] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/30/2021] [Revised: 03/14/2022] [Accepted: 04/12/2022] [Indexed: 12/25/2022] Open
Abstract
Proteogenomics refers to the integrated analysis of the genome and proteome that leverages mass-spectrometry (MS)-based proteomics data to improve genome annotations, understand gene expression control through proteoforms and find sequence variants to develop novel insights for disease classification and therapeutic strategies. However, proteogenomic studies often suffer from reduced sensitivity and specificity due to inflated database size. To control the error rates, proteogenomics depends on the target-decoy search strategy, the de-facto method for false discovery rate (FDR) estimation in proteomics. The proteogenomic databases constructed from three- or six-frame nucleotide database translation not only increase the search space and compute-time but also violate the equivalence of target and decoy databases. These searches result in poorer separation between target and decoy scores, leading to stringent FDR thresholds. Understanding these factors and applying modified strategies such as two-pass database search or peptide-class-specific FDR can result in a better interpretation of MS data without introducing additional statistical biases. Based on these considerations, a user can interpret the proteogenomics results appropriately and control false positives and negatives in a more informed manner. In this review, first, we briefly discuss the proteogenomic workflows and limitations in database construction, followed by various considerations that can influence potential novel discoveries in a proteogenomic study. We conclude with suggestions to counter these challenges for better proteogenomic data interpretation.
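The target-decoy FDR estimation discussed in this abstract can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the review's method: each peptide-spectrum match is a hypothetical `(score, is_decoy)` pair, higher scores are better, and FDR at a threshold is estimated as decoys divided by targets above that threshold.

```python
def q_values(psms):
    """Estimate q-values for peptide-spectrum matches by target-decoy counting.

    psms: iterable of (score, is_decoy) pairs; higher score means better match.
    Returns (score, is_decoy, q_value) tuples sorted by descending score.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = decoys = 0
    fdrs = []
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among everything scoring at least this well.
        fdrs.append(decoys / max(targets, 1))
    # q-value: the lowest FDR reachable at this score or any lower threshold.
    q_vals = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q_vals.append(running_min)
    q_vals.reverse()
    return [(s, d, q) for (s, d), q in zip(ranked, q_vals)]


def accepted_targets(psms, threshold=0.01):
    """Target PSMs whose q-value is at or below the chosen FDR threshold."""
    return [(s, q) for s, is_decoy, q in q_values(psms) if not is_decoy and q <= threshold]


if __name__ == "__main__":
    example = [(10, False), (9, False), (8, True), (7, False), (6, False)]
    print(q_values(example))
    print(accepted_targets(example, threshold=0.01))
```

A peptide-class-specific variant, as the abstract suggests, could call `q_values` separately on known and novel peptide matches rather than on the pooled list, so that the larger class does not mask errors in the smaller one.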
Affiliation(s)
- Suruchi Aggarwal
  - Translational Health Science and Technology Institute, NCR Biotech Science Cluster, 3rd milestone, PO Box No. 04, Faridabad-Gurgaon Expressway, Faridabad-121001, Haryana, India
- Anurag Raj
  - GN Ramachandran Knowledge Centre for Genome Informatics, CSIR-Institute of Genomics & Integrative Biology, South Campus, Mathura Road, New Delhi 110025, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad-201002, India
- Dhirendra Kumar
  - GN Ramachandran Knowledge Centre for Genome Informatics, CSIR-Institute of Genomics & Integrative Biology, South Campus, Mathura Road, New Delhi 110025, India
- Debasis Dash
  - GN Ramachandran Knowledge Centre for Genome Informatics, CSIR-Institute of Genomics & Integrative Biology, South Campus, Mathura Road, New Delhi 110025, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad-201002, India
- Amit Kumar Yadav
  - Translational Health Science and Technology Institute, NCR Biotech Science Cluster, 3rd milestone, PO Box No. 04, Faridabad-Gurgaon Expressway, Faridabad-121001, Haryana, India
10. Hari PS, Balakrishnan L, Kotyada C, Everad John A, Tiwary S, Shah N, Sirdeshmukh R. Proteogenomic Analysis of Breast Cancer Transcriptomic and Proteomic Data, Using De Novo Transcript Assembly: Genome-Wide Identification of Novel Peptides and Clinical Implications. Mol Cell Proteomics 2022; 21:100220. [PMID: 35227895 PMCID: PMC9020135 DOI: 10.1016/j.mcpro.2022.100220] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/09/2021] [Revised: 01/16/2022] [Accepted: 02/24/2022] [Indexed: 11/30/2022] Open
Abstract
We have carried out proteogenomic analysis of the breast cancer transcriptomic and proteomic data, available at The Clinical Proteomic Tumor Analysis Consortium resource, to identify novel peptides arising from alternatively spliced events as well as other noncanonical expressions. We used a pipeline that consisted of de novo transcript assembly, six frame-translated custom database, and a combination of search engines to identify novel peptides. A portfolio of 4,387 novel peptide sequences initially identified was further screened through PepQuery validation tool (Clinical Proteomic Tumor Analysis Consortium), which yielded 1,558 novel peptides. We considered the dataset of 1,558 validated through PepQuery to understand their functional and clinical significance, leaving the rest to be further verified using other validation tools and approaches. The novel peptides mapped to the known gene sequences as well as to genomic regions yet undefined for translation, 580 novel peptides mapped to known protein-coding genes, 147 to non–protein-coding genes, and 831 belonged to novel translational sequences. The novel peptides belonging to protein-coding genes represented alternatively spliced events or 5′ or 3′ extensions, whereas others represented translation from pseudogenes, long noncoding RNAs, or novel peptides originating from uncharacterized protein-coding sequences—mostly from the intronic regions of known genes. Seventy-six of the 580 protein-coding genes were associated with cancer hallmark genes, which included key oncogenes, transcription factors, kinases, and cell surface receptors. Survival association analysis of the 76 novel peptide sequences revealed 10 of them to be significant, and we present a panel of six novel peptides, whose high expression was found to be strongly associated with poor survival of patients with human epidermal growth factor receptor 2–enriched subtype. 
Our analysis represents a landscape of novel peptides of different types that may be expressed in breast cancer tissues, whereas their presence in full-length functional proteins needs further investigations.
- Novel protein variants and peptides from noncoding sequences are rapidly emerging.
- Mining of mass spectrometry data using proteogenomic analysis reveals such entities.
- Novel peptides from coding and noncoding sequences identified in breast cancer.
- Novel peptides mapped to cancer hallmark genes in breast cancer.
- Panel of novel peptides with prognostic potential found for HER2-enriched subtype.
Affiliation(s)
- P S Hari
  - Mazumdar Shaw Center for Translational Research, Narayana Health, Bangalore, India
- Lavanya Balakrishnan
  - Mazumdar Shaw Center for Translational Research, Narayana Health, Bangalore, India
- Chaithanya Kotyada
  - Mazumdar Shaw Center for Translational Research, Narayana Health, Bangalore, India
- Shivani Tiwary
  - Simulation and Modeling Sciences, Pfizer Pharma GmBH, Berlin, Germany
- Nameeta Shah
  - Mazumdar Shaw Center for Translational Research, Narayana Health, Bangalore, India
- Ravi Sirdeshmukh
  - Mazumdar Shaw Center for Translational Research, Narayana Health, Bangalore, India
  - Institute of Bioinformatics, International Tech Park, Bangalore, India
  - Health Sciences, Manipal Academy of Higher Education, Manipal, India
11. Rajczewski AT, Jagtap PD, Griffin TJ. An overview of technologies for MS-based proteomics-centric multi-omics. Expert Rev Proteomics 2022; 19:165-181. [PMID: 35466851 PMCID: PMC9613604 DOI: 10.1080/14789450.2022.2070476] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 01/07/2023]
Abstract
INTRODUCTION: Mass spectrometry-based proteomics reveals dynamic molecular signatures underlying phenotypes reflecting normal and perturbed conditions in living systems. Although valuable on its own, the proteome is only one level of molecular information, with the genome, epigenome, transcriptome, and metabolome all providing complementary information. Multi-omic analysis integrating information from one or more of these other domains with proteomic information provides a more complete picture of molecular contributors to dynamic biological systems.
AREAS COVERED: Here, we discuss the improvements to mass spectrometry-based technologies, focused on peptide-based, bottom-up approaches that have enabled deep, quantitative characterization of complex proteomes. These advances are facilitating the integration of proteomics data with other 'omic information, providing a more complete picture of living systems. We also describe the current state of bioinformatics software and approaches for integrating proteomics and other 'omics data, critical for enabling new discoveries driven by multi-omics.
EXPERT COMMENTARY: Multi-omics, centered on the integration of proteomics information with other 'omic information, has tremendous promise for biological and biomedical studies. Continued advances in approaches for generating deep, reliable proteomic data and bioinformatics tools aimed at integrating data across 'omic domains will ensure the discoveries offered by these multi-omic studies continue to increase.
Affiliation(s)
- Andrew T. Rajczewski
  - Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
- Pratik D. Jagtap
  - Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
- Timothy J. Griffin
  - Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
Collapse
|
12
|
Wang Z, Pan N, Yan J, Wan J, Wan C. Systematic Identification of Microproteins during the Development of Drosophila melanogaster. J Proteome Res 2022; 21:1114-1123. [PMID: 35227063 DOI: 10.1021/acs.jproteome.2c00004] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/16/2022]
Abstract
Short open reading frame-encoded peptides (SEPs) are microproteins with fewer than 100 amino acids that play an essential role in the growth and development of organisms. There are plenty of short open reading frames in Drosophila melanogaster that potentially encode polypeptides. We chose 11 time points during the life cycle of Drosophila to investigate microproteins, particularly those related to development. In total, we identified 410 microproteins, of which 27 were noncoding RNA-encoded proteins. Of the 410 microproteins, 74 were expressed in all stages from embryo to adult, whereas 300 microproteins were found at only one or two time points. Approximately one-third of the microproteins were not reported previously, and 44 were obtained from de novo sequencing and validated with synthetic peptides. These microproteins are related to the main bioprocesses of growth and development, such as multicellular organism reproduction, postmating behavior, and oviposition. Over half of the microproteins have predicted functional domains and are conserved across species, suggesting that these microproteins have critical functions in fly development. This work enriches the D. melanogaster proteome and provides a significant data resource for growth and development research.
Affiliation(s)
- Zhiwei Wang
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei 430079, People's Republic of China
- Ni Pan
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei 430079, People's Republic of China
- Jiahao Yan
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei 430079, People's Republic of China
- Jian Wan
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei 430079, People's Republic of China
- Cuihong Wan
- School of Life Sciences and Hubei Key Laboratory of Genetic Regulation and Integrative Biology, Central China Normal University, Wuhan, Hubei 430079, People's Republic of China
|
13
|
Blakeley-Ruiz JA, Kleiner M. Considerations for Constructing a Protein Sequence Database for Metaproteomics. Comput Struct Biotechnol J 2022; 20:937-952. [PMID: 35242286 PMCID: PMC8861567 DOI: 10.1016/j.csbj.2022.01.018] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 08/26/2021] [Revised: 01/14/2022] [Accepted: 01/18/2022] [Indexed: 12/14/2022] Open
Abstract
Mass spectrometry-based metaproteomics has emerged as a prominent technique for interrogating the functions of specific organisms in microbial communities, in addition to total community function. Identifying proteins by mass spectrometry requires matching mass spectra of fragmented peptide ions to a database of protein sequences corresponding to the proteins in the sample. This sequence database determines which protein sequences can be identified from the measurement, and as such the taxonomic and functional information that can be inferred from a metaproteomics measurement. Thus, the construction of the protein sequence database directly impacts the outcome of any metaproteomics study. Several factors, such as source of sequence information and database curation, need to be considered during database construction to maximize accurate protein identifications traceable to the species of origin. In this review, we provide an overview of existing strategies for database construction and the relevant studies that have sought to test and validate these strategies. Based on this review of the literature and our experience we provide a decision tree and best practices for choosing and implementing database construction strategies.
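The dependence of identifications on the sequence database can be illustrated with a minimal Python sketch. It is not taken from the review: it replaces real spectrum matching with exact peptide-sequence lookup, uses a simplified trypsin rule (cleave after K or R unless followed by P, no missed cleavages), and all sequences, IDs, and function names are hypothetical.

```python
def tryptic_digest(sequence, min_length=6):
    """Simplified trypsin rule: cleave after K or R unless the next residue is P.

    No missed cleavages are modeled; peptides shorter than min_length are dropped.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide not ending in K/R
    return [p for p in peptides if len(p) >= min_length]


def build_peptide_index(database):
    """Map each in-silico peptide to the set of database entries that contain it."""
    index = {}
    for protein_id, sequence in database.items():
        for peptide in tryptic_digest(sequence):
            index.setdefault(peptide, set()).add(protein_id)
    return index


def assign_peptides(observed, index):
    """Return {peptide: sorted protein IDs}; an empty list means 'not identifiable'."""
    return {peptide: sorted(index.get(peptide, ())) for peptide in observed}


# Hypothetical sequences and IDs, invented for illustration only.
full_db = {
    "orgA_prot1": "MKTAYIAKQRQISFVKSHFSRQ",
    "orgB_prot2": "GASPEKTAYIAKLLNDR",
}
partial_db = {"orgA_prot1": full_db["orgA_prot1"]}  # organism B is missing

observed = ["TAYIAK", "QISFVK", "GASPEK"]
with_full = assign_peptides(observed, build_peptide_index(full_db))
with_partial = assign_peptides(observed, build_peptide_index(partial_db))
print(with_full)
print(with_partial)
```

In this toy case the peptide shared by both organisms cannot be traced to a single species, and the peptide unique to organism B becomes unidentifiable once that organism is left out of the database, which mirrors the abstract's point that database construction bounds both identification and taxonomic inference.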
Affiliation(s)
- J. Alfredo Blakeley-Ruiz
- Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA
- Center for Gastrointestinal Biology and Disease, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Corresponding authors at: Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA.
- Manuel Kleiner
- Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA
- Corresponding authors at: Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA.
|
14
|
Balhara A, Basit A, Argikar UA, Dumouchel JL, Singh S, Prasad B. Comparative Proteomics Analysis of the Postmitochondrial Supernatant Fraction of Human Lens-Free Whole Eye and Liver. Drug Metab Dispos 2021; 49:592-600. [PMID: 33952609 DOI: 10.1124/dmd.120.000297] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/25/2020] [Accepted: 04/08/2021] [Indexed: 11/22/2022] Open
Abstract
The increasing incidence of ocular diseases has accelerated research into therapeutic interventions needed for the eye. Ocular enzymes play important roles in the metabolism of drugs and endobiotics. Various ocular drugs are designed as prodrugs that are activated by ocular enzymes. Moreover, ocular enzymes have been implicated in the bioactivation of drugs to their toxic metabolites. The key purpose of this study was to compare global proteomes of the pooled samples of the eye (n = 11) and the liver (n = 50) with a detailed analysis of the abundance of enzymes involved in the metabolism of xenobiotics and endobiotics. We used the postmitochondrial supernatant fraction (S9 fraction) of the lens-free whole eye homogenate as a model to allow accurate comparison with the liver S9 fraction. A total of 269 proteins (including 23 metabolic enzymes) were detected exclusively in the pooled eye S9 against 648 proteins in the liver S9 (including 174 metabolic enzymes), whereas 424 proteins (including 94 metabolic enzymes) were detected in both organs. The major hepatic cytochrome P450 and UDP-glucuronosyltransferase enzymes were not detected, but aldehyde dehydrogenases and glutathione transferases were the predominant proteins in the eye. The comparative qualitative and quantitative proteomics data in the eye versus liver are expected to help explain differential metabolic and physiologic activities in the eye. SIGNIFICANCE STATEMENT: Information on the enzymes involved in xenobiotic and endobiotic metabolism in the human eye in relation to the liver is scarcely available. The study employed global proteomic analysis to compare the proteomes of the lens-free whole eye and the liver with a detailed analysis of the enzymes involved in xenobiotic and endobiotic metabolism. These data will help improve understanding of ocular metabolism and the activation of drugs and endobiotics.
Affiliation(s)
- Ankit Balhara
- Department of Pharmaceutical Analysis, National Institute of Pharmaceutical Education and Research (NIPER), S.A.S. Nagar, Punjab, India (An.B., S.S.); Department of Pharmaceutical Sciences, Washington State University, Spokane, Washington (Ab.B., B.P.); Biotransformation Group, Novartis Institutes for BioMedical Research, Cambridge, Massachusetts (U.A.A.); and Department of Molecular Pharmacology and Physiology, Brown University, Providence, Rhode Island (J.L.D.)
- Abdul Basit
- Department of Pharmaceutical Analysis, National Institute of Pharmaceutical Education and Research (NIPER), S.A.S. Nagar, Punjab, India (An.B., S.S.); Department of Pharmaceutical Sciences, Washington State University, Spokane, Washington (Ab.B., B.P.); Biotransformation Group, Novartis Institutes for BioMedical Research, Cambridge, Massachusetts (U.A.A.); and Department of Molecular Pharmacology and Physiology, Brown University, Providence, Rhode Island (J.L.D.)
- Upendra A Argikar
- Department of Pharmaceutical Analysis, National Institute of Pharmaceutical Education and Research (NIPER), S.A.S. Nagar, Punjab, India (An.B., S.S.); Department of Pharmaceutical Sciences, Washington State University, Spokane, Washington (Ab.B., B.P.); Biotransformation Group, Novartis Institutes for BioMedical Research, Cambridge, Massachusetts (U.A.A.); and Department of Molecular Pharmacology and Physiology, Brown University, Providence, Rhode Island (J.L.D.)
- Jennifer L Dumouchel
- Department of Pharmaceutical Analysis, National Institute of Pharmaceutical Education and Research (NIPER), S.A.S. Nagar, Punjab, India (An.B., S.S.); Department of Pharmaceutical Sciences, Washington State University, Spokane, Washington (Ab.B., B.P.); Biotransformation Group, Novartis Institutes for BioMedical Research, Cambridge, Massachusetts (U.A.A.); and Department of Molecular Pharmacology and Physiology, Brown University, Providence, Rhode Island (J.L.D.)
- Saranjit Singh
- Department of Pharmaceutical Analysis, National Institute of Pharmaceutical Education and Research (NIPER), S.A.S. Nagar, Punjab, India (An.B., S.S.); Department of Pharmaceutical Sciences, Washington State University, Spokane, Washington (Ab.B., B.P.); Biotransformation Group, Novartis Institutes for BioMedical Research, Cambridge, Massachusetts (U.A.A.); and Department of Molecular Pharmacology and Physiology, Brown University, Providence, Rhode Island (J.L.D.)
- Bhagwat Prasad
- Department of Pharmaceutical Analysis, National Institute of Pharmaceutical Education and Research (NIPER), S.A.S. Nagar, Punjab, India (An.B., S.S.); Department of Pharmaceutical Sciences, Washington State University, Spokane, Washington (Ab.B., B.P.); Biotransformation Group, Novartis Institutes for BioMedical Research, Cambridge, Massachusetts (U.A.A.); and Department of Molecular Pharmacology and Physiology, Brown University, Providence, Rhode Island (J.L.D.)
|
15
|
Tolani P, Gupta S, Yadav K, Aggarwal S, Yadav AK. Big data, integrative omics and network biology. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2021; 127:127-160. [PMID: 34340766 DOI: 10.1016/bs.apcsb.2021.03.006] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Indexed: 12/12/2022]
Abstract
A cell integrates various signals through a network of biomolecules that crosstalk to synergistically regulate the replication, transcription, translation and other metabolic activities of a cell. These networks regulate signal perception and processing that drives biological functions. This biological complexity cannot be fully captured by a single -omics discipline. The holistic study of an organism, in health, perturbation, exposure to environment and disease, falls under systems biology. The bottom-up molecular approaches (genes, mRNA, protein, metabolite, etc.) have laid the foundation of current biological knowledge, covering the horizon from viruses, bacteria and fungi to plants and animals. Yet, these techniques provide a rather myopic view of biology at the molecular level. Understanding how the interconnected molecular components are formed and rewired in disease or on exposure to environmental stimuli is the holy grail of modern biology. The omics era was heralded by the genomics revolution but advanced sequencing techniques are now also ubiquitous in transcriptomics, proteomics, metabolomics and lipidomics. Multi-omics data analysis and integration techniques are driving the quest for deeper insights into how the different layers of biomolecules talk to each other in diverse contexts.
Affiliation(s)
- Priya Tolani
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad, Haryana, India
- Srishti Gupta
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad, Haryana, India; School of Biosciences and Technology, Vellore Institute of Technology, Vellore, India
- Kirti Yadav
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad, Haryana, India; Department of Pharmaceutical Biotechnology, Delhi Pharmaceutical Sciences and Research University, New Delhi, India
- Suruchi Aggarwal
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad, Haryana, India; Department of Molecular Biology and Biotechnology, Cotton University, Guwahati, Assam, India
- Amit Kumar Yadav
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad, Haryana, India.
|
16
|
Aggarwal S, Tolani P, Gupta S, Yadav AK. Posttranslational modifications in systems biology. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2021; 127:93-126. [PMID: 34340775 DOI: 10.1016/bs.apcsb.2021.03.005] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/28/2022]
Abstract
Biological complexity cannot be captured by genes or proteins alone. Protein posttranslational modifications (PTMs) impart functional diversity to the proteome and regulate protein structure, activity, localization and interactions. Their dynamics drive cellular signaling, growth and development, while their dysregulation causes many diseases. Mass spectrometry-based quantitative profiling of PTMs and bioinformatics analysis tools allow systems-level insights into their network architecture. High-resolution profiling of PTM networks will advance disease understanding and precision medicine. It can accelerate the discovery of biomarkers and drug targets. This requires better tools for unbiased, high-throughput and accurate PTM identification, site localization and automated annotation on a systems level.
Affiliation(s)
- Suruchi Aggarwal
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad, Haryana, India; Department of Molecular Biology and Biotechnology, Cotton University, Guwahati, Assam, India
- Priya Tolani
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad, Haryana, India
- Srishti Gupta
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad, Haryana, India; School of Biosciences and Technology, Vellore Institute of Technology, Vellore, India
- Amit Kumar Yadav
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad, Haryana, India.
|
17
|
Huang W, Kane MA. MAPLE: A Microbiome Analysis Pipeline Enabling Optimal Peptide Search and Comparative Taxonomic and Functional Analysis. J Proteome Res 2021; 20:2882-2894. [PMID: 33848166 DOI: 10.1021/acs.jproteome.1c00114] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
Abstract
Metaproteomics by mass spectrometry (MS) is a powerful approach to profile a large number of proteins expressed by all organisms in a highly complex biological or ecological sample, and it can provide a direct and quantitative assessment of the functional makeup of a microbiota. The human gastrointestinal microbiota has been found to play important roles in human physiology and health, and metaproteomics has been shown to shed light on multiple novel associations between microbiota and diseases. MS-powered proteomics generally relies on genome data to define the search space. However, metaproteomics, which simultaneously analyzes all proteins from hundreds to thousands of species, faces significant challenges regarding database search and interpretation of results. To overcome these obstacles, we have developed a user-friendly microbiome analysis pipeline (MAPLE, freely downloadable at http://maple.rx.umaryland.edu/), which is able to define an optimal search space by inferring proteomes specific to samples following the principle of parsimony. MAPLE facilitates highly comparable or better peptide identification compared to a sample-specific metagenome-guided search. In addition, we implemented an automated peptide-centric enrichment analysis function in MAPLE to address issues of traditional protein-centric comparison, enabling straightforward and comprehensive comparison of taxonomic and functional makeup between microbiota.
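The principle of parsimony mentioned above can be sketched as a greedy set cover: pick the smallest set of taxa whose proteomes explain every matched peptide. The Python below is a generic illustration under that assumption, not MAPLE's actual algorithm or interface; the function name, peptide IDs, and taxon labels are hypothetical.

```python
def parsimonious_taxa(peptide_to_taxa):
    """Greedy set cover: choose taxa until every matched peptide is explained.

    peptide_to_taxa maps a peptide ID to the set of taxa whose proteomes contain it.
    Ties are broken alphabetically so the result is deterministic.
    """
    # Invert the mapping: which peptides does each taxon explain?
    taxon_to_peptides = {}
    for peptide, taxa in peptide_to_taxa.items():
        for taxon in taxa:
            taxon_to_peptides.setdefault(taxon, set()).add(peptide)

    # Peptides with no candidate taxon cannot be explained and are ignored.
    uncovered = {p for p, taxa in peptide_to_taxa.items() if taxa}
    selected = []
    while uncovered:
        # max() returns the first maximal item, so sorting gives alphabetical tie-breaks.
        best = max(
            sorted(taxon_to_peptides),
            key=lambda taxon: len(taxon_to_peptides[taxon] & uncovered),
        )
        selected.append(best)
        uncovered -= taxon_to_peptides[best]
    return selected


# Hypothetical peptide-to-taxon matches, invented for illustration only.
peptide_hits = {
    "pep1": {"taxon_A", "taxon_B"},
    "pep2": {"taxon_A"},
    "pep3": {"taxon_A", "taxon_C"},
    "pep4": {"taxon_C"},
}
search_space = parsimonious_taxa(peptide_hits)
print(search_space)
```

Here taxon_B is left out of the search space because its only peptide is already explained by taxon_A. Greedy selection is a common approximation for set cover and does not guarantee the true minimum in every case.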
Affiliation(s)
- Weiliang Huang
- Department of Pharmaceutical Sciences, University of Maryland, School of Pharmacy, Baltimore, Maryland 21201, United States
- Maureen A Kane
- Department of Pharmaceutical Sciences, University of Maryland, School of Pharmacy, Baltimore, Maryland 21201, United States
|
18
|
The challenge of detecting modifications on proteins. Essays Biochem 2020; 64:135-153. [PMID: 31957791 DOI: 10.1042/ebc20190055] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 10/15/2019] [Revised: 12/17/2019] [Accepted: 12/19/2019] [Indexed: 12/16/2022]
Abstract
Post-translational modifications (PTMs) are integral to the regulation of protein function; characterising their role in this process is vital to understanding how cells work in both healthy and diseased states. Mass spectrometry (MS) facilitates the mass determination and sequencing of peptides, and thereby also the detection of site-specific PTMs. However, numerous challenges in this field persist. The diverse chemical properties, low abundance, labile nature and instability of many PTMs, in combination with the more practical issues of compatibility with MS and bioinformatics challenges, contribute to the arduous nature of their analysis. In this review, we present an overview of the established MS-based approaches for analysing PTMs and the common complications associated with their investigation, including examples of specific challenges focusing on phosphorylation, lysine acetylation and redox modifications.
|
19
|
Neumann EK, Djambazova KV, Caprioli RM, Spraggins JM. Multimodal Imaging Mass Spectrometry: Next Generation Molecular Mapping in Biology and Medicine. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2020; 31:2401-2415. [PMID: 32886506 PMCID: PMC9278956 DOI: 10.1021/jasms.0c00232] [Citation(s) in RCA: 73] [Impact Index Per Article: 14.6] [Indexed: 05/05/2023]
Abstract
Imaging mass spectrometry has become a mature molecular mapping technology that is used for molecular discovery in many medical and biological systems. While powerful by itself, imaging mass spectrometry can be complemented by the addition of other orthogonal, chemically informative imaging technologies to maximize the information gained from a single experiment and enable deeper understanding of biological processes. Within this review, we describe MALDI, SIMS, and DESI imaging mass spectrometric technologies and how these have been integrated with other analytical modalities such as microscopy, transcriptomics, spectroscopy, and electrochemistry in a field termed multimodal imaging. We explore the future of this field and discuss forthcoming developments that will bring new insights to help unravel the molecular complexities of biological systems, from single cells to functional tissue structures and organs.
Affiliation(s)
- Elizabeth K Neumann
- Department of Biochemistry, Vanderbilt University, 607 Light Hall, Nashville, Tennessee 37205, United States
- Mass Spectrometry Research Center, Vanderbilt University, 465 21st Avenue S #9160, Nashville, Tennessee 37235, United States
- Katerina V Djambazova
- Mass Spectrometry Research Center, Vanderbilt University, 465 21st Avenue S #9160, Nashville, Tennessee 37235, United States
- Department of Chemistry, Vanderbilt University, 7330 Stevenson Center, Station B 351822, Nashville, Tennessee 37235, United States
- Richard M Caprioli
- Department of Biochemistry, Vanderbilt University, 607 Light Hall, Nashville, Tennessee 37205, United States
- Mass Spectrometry Research Center, Vanderbilt University, 465 21st Avenue S #9160, Nashville, Tennessee 37235, United States
- Department of Chemistry, Vanderbilt University, 7330 Stevenson Center, Station B 351822, Nashville, Tennessee 37235, United States
- Department of Pharmacology, Vanderbilt University, 2220 Pierce Avenue, Nashville, Tennessee 37232, United States
- Department of Medicine, Vanderbilt University, 465 21st Avenue S #9160, Nashville, Tennessee 37235, United States
- Jeffrey M Spraggins
- Department of Biochemistry, Vanderbilt University, 607 Light Hall, Nashville, Tennessee 37205, United States
- Mass Spectrometry Research Center, Vanderbilt University, 465 21st Avenue S #9160, Nashville, Tennessee 37235, United States
- Department of Chemistry, Vanderbilt University, 7330 Stevenson Center, Station B 351822, Nashville, Tennessee 37235, United States
|
20
|
Taunk K, Kalita B, Kale V, Chanukuppa V, Naiya T, Zingde SM, Rapole S. The development and clinical applications of proteomics: an Indian perspective. Expert Rev Proteomics 2020; 17:433-451. [PMID: 32576061 DOI: 10.1080/14789450.2020.1787157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
Abstract
INTRODUCTION: Proteomic research has been extensively used to identify potential biomarkers or targets for various diseases. Advances in mass spectrometry along with data analytics have led proteomics to become a powerful tool for exploring the critical molecular players associated with diseases, thereby playing a significant role in the development of proteomic applications for the clinic. AREAS COVERED: This review presents recent advances in the development and clinical applications of proteomics in India toward understanding various diseases including cancer, metabolic diseases, and reproductive diseases. Keywords combined with 'clinical proteomics in India', 'proteomic research in India', and 'mass spectrometry' were used to search PubMed. EXPERT OPINION: The past decade has seen a significant increase in research in clinical proteomics in India. This approach has resulted in the development of proteomics-based marker technologies for disease management in the country. The majority of these investigations are still in the discovery phase, and efforts have to be made to address the intended clinical use so that the identified potential biomarkers reach the clinic. To meet this need, key infrastructure and meaningful collaborations between clinicians and scientists must be established, which will enable more effective solutions to health issues specific to India.
Affiliation(s)
- Khushman Taunk
- Proteomics Lab, National Centre for Cell Science, Pune, Maharashtra, India; Department of Biotechnology, Maulana Abul Kalam Azad University of Technology, West Bengal, Haringhata, West Bengal, India
- Bhargab Kalita
- Proteomics Lab, National Centre for Cell Science, Pune, Maharashtra, India
- Vaikhari Kale
- Proteomics Lab, National Centre for Cell Science, Pune, Maharashtra, India
- Tufan Naiya
- Department of Biotechnology, Maulana Abul Kalam Azad University of Technology, West Bengal, Haringhata, West Bengal, India
- Surekha M Zingde
- CH3-53, Kendriya Vihar, Sector 11, Kharghar, Navi Mumbai, Maharashtra, India
- Srikanth Rapole
- Proteomics Lab, National Centre for Cell Science, Pune, Maharashtra, India
|
21
|
Kumar P, Johnson JE, Easterly C, Mehta S, Sajulga R, Nunn B, Jagtap PD, Griffin TJ. A Sectioning and Database Enrichment Approach for Improved Peptide Spectrum Matching in Large, Genome-Guided Protein Sequence Databases. J Proteome Res 2020; 19:2772-2785. [DOI: 10.1021/acs.jproteome.0c00260] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 12/13/2022]
Affiliation(s)
- Praveen Kumar
- Bioinformatics and Computational Biology, University of Minnesota−Rochester, Rochester, Minnesota 55904, United States
- Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- James E. Johnson
- Minnesota Supercomputing Institute, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- Caleb Easterly
- Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- Subina Mehta
- Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- Ray Sajulga
- Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- Brook Nunn
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Pratik D. Jagtap
- Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- Timothy J. Griffin
- Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
|
22
|
Ser Z, Cifani P, Kentsis A. Optimized Cross-Linking Mass Spectrometry for in Situ Interaction Proteomics. J Proteome Res 2019; 18:2545-2558. [PMID: 31083951 DOI: 10.1021/acs.jproteome.9b00085] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Indexed: 12/30/2022]
Abstract
Recent development of mass spectrometer cleavable protein cross-linkers and algorithms for their spectral identification now permits large-scale cross-linking mass spectrometry (XL-MS). Here, we optimized the use of cleavable disuccinimidyl sulfoxide (DSSO) cross-linker for labeling native protein complexes in live human cells. We applied a generalized linear mixture model to calibrate cross-link peptide-spectra matching (CSM) scores to control the sensitivity and specificity of large-scale XL-MS. Using specific CSM score thresholds to control the false discovery rate, we found that higher-energy collisional dissociation (HCD) and electron transfer dissociation (ETD) can both be effective for large-scale XL-MS protein interaction mapping. We found that the coverage of protein-protein interaction maps is significantly improved through the use of multiple proteases. In addition, focused sample-specific search databases can improve the specificity of cross-linked peptide spectral matching. Application of this approach to human chromatin labeled in live cells recapitulated known and revealed new protein interactions of nucleosomes and other chromatin-associated complexes in situ. This optimized approach for mapping native protein interactions should be useful for a wide range of biological problems.
Affiliation(s)
- Alex Kentsis
- Department of Pediatrics, Pharmacology, and Physiology & Biophysics, Weill Cornell Medical College, Cornell University, New York, New York 10065, United States
|
23
|
Optimisation of protein extraction for in-depth profiling of the cereal grain proteome. J Proteomics 2019; 197:23-33. [DOI: 10.1016/j.jprot.2019.02.009] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 11/09/2018] [Revised: 02/01/2019] [Accepted: 02/11/2019] [Indexed: 12/20/2022]
|