1
Analysis of the Tropism of SARS-CoV-2 Based on the Host Interactome of the Spike Protein. J Proteome Res 2023; 22:3742-3753. [PMID: 37939376 DOI: 10.1021/acs.jproteome.3c00387] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2023]
Abstract
The β-coronavirus SARS-CoV-2 causes severe acute respiratory syndrome (COVID-19) in humans. It enters and infects epithelial airway cells upon binding of the receptor binding domain (RBD) of the virus entry protein spike to the host receptor protein Angiotensin Converting Enzyme 2 (ACE2). Here, we used coimmunoprecipitation coupled with bottom-up mass spectrometry to identify host proteins that engaged with the spike protein in human bronchial epithelial cells (16HBEo-). We found that the spike protein bound to extracellular laminin and thrombospondin and endoplasmic reticulum (ER)-resident DJB11 and FBX2 proteins. The ER-resident proteins UGGT1, CALX, HSP7A, and GRP78/BiP bound preferentially to the original Wuhan D614 over the mutated G614 spike protein in the more rapidly spreading Alpha SARS-CoV-2 strain. The increase in protein binding to the D614 spike might be explained by higher accessibility of cryptic sites in "RBD open" and "S2 only" D614 spike protein conformations and may enable SARS-CoV-2 to infect additional, ACE2-negative cell types. Moreover, a novel proteome-based cell type set enrichment analysis (pCtSEA) found that host factors like laminin might render additional cell types such as macrophages and epithelial cells in the nephron permissive to SARS-CoV-2 infection.
2
Abstract
Traditional mass spectrometry-based glycoproteomic approaches have been widely used for site-specific N-glycoform analysis, but a large amount of starting material is needed to obtain sampling that is representative of the vast diversity of N-glycans on glycoproteins. These methods also often include a complicated workflow and very challenging data analysis. These limitations have prevented glycoproteomics from being adapted to high-throughput platforms, and the sensitivity of the analysis is currently inadequate for elucidating N-glycan heterogeneity in clinical samples. Heavily glycosylated spike proteins of enveloped viruses, recombinantly expressed as potential vaccines, are prime targets for glycoproteomic analysis. Since the immunogenicity of spike proteins may be impacted by their glycosylation patterns, site-specific analysis of N-glycoforms provides critical information for vaccine design. Using recombinantly expressed soluble HIV Env trimer, we describe DeGlyPHER, a modification of our previously reported sequential deglycosylation strategy to yield a "single-pot" process. DeGlyPHER is an ultrasensitive, simple, rapid, robust, and efficient approach for site-specific analysis of protein N-glycoforms that we developed for analysis of limited quantities of glycoproteins.
3
Interactome analysis illustrates diverse gene regulatory processes associated with LIN28A in human iPS cell-derived neural progenitor cells. iScience 2021; 24:103321. [PMID: 34816099 PMCID: PMC8593586 DOI: 10.1016/j.isci.2021.103321] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/04/2020] [Revised: 09/07/2021] [Accepted: 10/19/2021] [Indexed: 12/02/2022]
Abstract
A single protein can be multifaceted depending on the cellular contexts and interacting molecules. LIN28A is an RNA-binding protein that governs developmental timing, cellular proliferation, differentiation, stem cell pluripotency, and metabolism. In addition to its best-known roles in microRNA biogenesis, diverse molecular roles have been recognized. In the nervous system, LIN28A is known to play critical roles in proliferation and differentiation of neural progenitor cells (NPCs). We profiled the endogenous LIN28A-interacting proteins in NPCs differentiated from human induced pluripotent stem (iPS) cells using immunoprecipitation and liquid chromatography-tandem mass spectrometry. We identified over 500 LIN28A-interacting proteins, including 156 RNA-independent interactors. Functions of these proteins span a wide range of gene regulatory processes. Prompted by the interactome data, we revealed that LIN28A may impact the subcellular distribution of its interactors and stress granule formation upon oxidative stress. Overall, our analysis opens multiple avenues for elaborating molecular roles and characteristics of LIN28A.
4
Abstract
Viruses can evade the host immune system by displaying numerous glycans on their surface "spike-proteins" that cover immune epitopes. We have developed an ultrasensitive "single-pot" method to assess glycan occupancy and the extent of glycan processing from high-mannose to complex forms at each N-glycosylation site. Though aimed at characterizing glycosylation of viral spike-proteins as potential vaccines, this method is applicable for the analysis of site-specific glycosylation of any glycoprotein.
5
Protein Footprinting via Covalent Protein Painting Reveals Structural Changes of the Proteome in Alzheimer's Disease. J Proteome Res 2021; 20:2762-2771. [PMID: 33872013 PMCID: PMC8477671 DOI: 10.1021/acs.jproteome.0c00912] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Indexed: 01/03/2023]
Abstract
Misfolding and aggregation of amyloid-β peptide and hyperphosphorylated tau are molecular markers of Alzheimer's disease (AD), and although the 3D structures of these aberrantly folded proteins have been visualized in exquisite detail, no method has been able to survey protein folding across the proteome in AD. Here, we present covalent protein painting (CPP), a mass spectrometry-based protein footprinting approach to quantify the accessibility of lysine ε-amines for covalent modification at the surface of natively folded proteins. We used CPP to survey the reactivity of 2645 lysine residues and therewith the structural proteome of HEK293T cells and found that reactivity increased upon mild heat shock. CPP revealed that the accessibility of lysine residues for covalent modification in tubulin-β (TUBB), in succinate dehydrogenase (SDHB), and in amyloid-β peptide (Aβ) is altered in human postmortem brain samples of patients with neurodegenerative diseases. The structural alterations of TUBB and SDHB in patients with AD, dementia with Lewy bodies (DLB), or both point to broader perturbations of the 3D proteome beyond Aβ and hyperphosphorylated tau.
6
The Host Interactome of Spike Expands the Tropism of SARS-CoV-2. bioRxiv 2021:2021.02.16.431318. [PMID: 33619478 PMCID: PMC7899442 DOI: 10.1101/2021.02.16.431318] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/14/2022]
Abstract
The SARS-CoV-2 virus causes severe acute respiratory syndrome (COVID-19) and has rapidly created a global pandemic. Patients who survive may face a slow recovery with long-lasting side effects that can afflict different organs. SARS-CoV-2 primarily infects epithelial airway cells that express the host entry receptor Angiotensin Converting Enzyme 2 (ACE2), which binds to spike protein trimers on the surface of SARS-CoV-2 virions. However, SARS-CoV-2 can spread to other tissues even though they are negative for ACE2. To gain insight into the molecular constituents that might influence SARS-CoV-2 tropism, we determined which additional host factors engage with the viral spike protein in disease-relevant human bronchial epithelial cells (16HBEo-). We found that spike recruited the extracellular proteins laminin and thrombospondin and was retained in the endoplasmic reticulum (ER) by the proteins DJB11 and FBX2, which support re-folding or degradation of nascent proteins in the ER. Because emerging mutations of the spike protein potentially impact the virus tropism, we compared the interactome of D614 spike with that of the rapidly spreading G614 mutated spike. More D614 than G614 spike associated with the proteins UGGT1, calnexin, HSP7A and GRP78/BiP, which ensure glycosylation and folding of proteins in the ER. In contrast to G614 spike, D614 spike was endoproteolytically cleaved, and the N-terminal S1 domain was degraded in the ER even though C-terminal 'S2 only' proteoforms remained present. D614 spike also bound more laminin than G614 spike, which suggested that extracellular laminins may function as a co-factor for an alternative, 'S2 only' dependent virus entry. Because the host interactome determines whether an infection is productive, we developed a novel proteome-based cell type set enrichment analysis (pCtSEA). With pCtSEA we determined that the host interactome of the spike protein may extend the tropism of SARS-CoV-2 beyond mucous epithelia to several different cell types, including macrophages and epithelial cells in the nephron. An 'S2 only' dependent, alternative infection of additional cell types with SARS-CoV-2 may impact vaccination strategies and may provide a molecular explanation for a severe or prolonged progression of disease in select COVID-19 patients.
7
Temporal Quantitative Profiling of Newly Synthesized Proteins during Aβ Accumulation. J Proteome Res 2020; 20:763-775. [PMID: 33147027 DOI: 10.1021/acs.jproteome.0c00645] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/12/2022]
Abstract
Accumulation of aggregated amyloid beta (Aβ) in the brain is believed to impair multiple cellular pathways and play a central role in Alzheimer's disease pathology. However, how this process is regulated remains unclear. In theory, measuring protein synthesis is the most direct way to evaluate a cell's response to stimuli, but to date, there have been few reliable methods to do this. To identify the protein regulatory network during the development of Aβ deposition in AD, we applied a new proteomic technique to quantitate newly synthesized protein (NSP) changes in the cerebral cortex and hippocampus of 2-, 5-, and 9-month-old APP/PS1 AD transgenic mice. This bio-orthogonal noncanonical amino acid tagging analysis combined PALM (pulse azidohomoalanine labeling in mammals) and HILAQ (heavy isotope labeled AHA quantitation) to reveal a comprehensive dataset of NSPs prior to and post Aβ deposition, including the identification of proteins not previously associated with AD, and demonstrated that the pattern of differentially expressed NSPs is age-dependent. We also found dysregulated vesicle transportation networks including endosomal subunits, coat protein complex I (COPI), and mitochondrial respiratory chain throughout all time points and two brain regions. These results point to a pathological dysregulation of vesicle transportation which occurs prior to Aβ accumulation and the onset of AD symptoms, which may progressively impact the entire protein network and thereby drive neurodegeneration. This study illustrates key pathway regulation responses to the development of AD pathogenesis by directly measuring the changes in protein synthesis and provides unique insights into the mechanisms that underlie AD.
8
Abstract
Protein degradation is an essential mechanism for maintaining proteostasis in response to internal and external perturbations. Disruption of this process is implicated in many human diseases. We present a new technique, QUAD (Quantification of Azidohomoalanine Degradation), to analyze the global degradation rates in tissues using a non-canonical amino acid and mass spectrometry. QUAD analysis reveals that protein stability varied within tissues, but discernible trends in the data suggest that cellular environment is a major factor dictating stability. Within a tissue, different organelles and protein functions were enriched with different stability patterns. QUAD analysis demonstrated that protein stability is enhanced with age in the brain but not in the liver. Overall, QUAD allows the first global quantitation of protein stability rates in tissues, which will allow new insights and hypotheses in basic and translational research.
9
Abstract
Data-independent acquisition (DIA) is a promising technique for the proteomic analysis of complex protein samples. A number of studies have claimed that DIA experiments are more reproducible than data-dependent acquisition (DDA), but these claims are unsubstantiated since different data analysis methods are used for the two acquisition modes. Data analysis in most DIA workflows depends on spectral library searches, whereas DDA typically employs sequence database searches. In this study, we examined the reproducibility of the DIA and DDA results using both sequence database and spectral library searches. The comparison was first performed using a cell lysate and then extended to an interactome study. Protein overlap among the technical replicates in both DDA and DIA experiments was 30% higher with library-based identifications than with sequence database identifications. The reproducibility of quantification was also improved with library search compared to database search, with the mean of the coefficient of variation decreasing more than 30% and a reduction in the number of missing values of more than 35%. Our results show that regardless of the acquisition method, higher identification and quantification reproducibility is observed when library search is used.
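The three reproducibility measures named in this abstract (protein overlap among replicates, coefficient of variation, and missing values) can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions the authors used, so the ones below are assumptions: overlap is taken as the share of all identified proteins that appear in every replicate, and the CV uses the sample standard deviation. The function names and toy data are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * stdev(values) / m


def replicate_overlap(replicates):
    """Return the fraction of all identified proteins found in every replicate.

    Assumed definition: |intersection| / |union| across replicate ID sets.
    """
    sets = [set(r) for r in replicates]
    union = set().union(*sets)
    shared = set.intersection(*sets)
    return len(shared) / len(union) if union else 0.0


def missing_values(table):
    """Count None entries in a mapping of protein -> per-replicate intensities."""
    return sum(v is None for row in table.values() for v in row)


# Toy data (hypothetical): three technical replicates.
ids = [{"A", "B", "C"}, {"A", "B"}, {"A", "B", "D"}]
intensities = {"P1": [100.0, 110.0, 90.0], "P2": [50.0, None, 55.0]}

print(replicate_overlap(ids))
print(coefficient_of_variation(intensities["P1"]))
print(missing_values(intensities))
```

Comparing these numbers between a database search and a library search of the same raw files would mirror the kind of comparison the abstract reports, under the stated assumptions.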
10
Abstract
Mass spectrometry-based proteomics is an invaluable tool for addressing important biological questions. Data-dependent acquisition methods effectuate stochastic acquisition of data in complex mixtures, which results in missing identifications across replicates. We developed a search approach that improves the reproducibility of data acquired from any mass spectrometer. In our approach, a spectral library is built from the identification results from a database search, and then the library is used to re-search the same data files to obtain the final result. We showed that higher identification and quantification reproducibility is achieved with the dual-search approach than with a typical database search. Four datasets with different complexity were compared: (1) data from a cell lysate study performed in our lab, (2) data from an interactome study performed in our lab, (3) a publicly available extracellular vesicles dataset, and (4) a publicly available phosphoproteomics dataset. Our results show that the dual-search approach can be widely and easily used to improve data quality in proteomics data.
11
Comparison of CRISPR Genomic Tagging for Affinity Purification and Endogenous Immunoprecipitation Coupled with Quantitative Mass Spectrometry To Identify the Dynamic AMPKα2 Interactome. J Proteome Res 2019; 18:3703-3714. [PMID: 31398040 DOI: 10.1021/acs.jproteome.9b00378] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/25/2022]
Abstract
Recent advances in genome editing technologies have enabled the insertion of epitope tags at endogenous loci with relative efficiency. We describe an approach for investigation of protein interaction dynamics of the AMP-activated kinase complex AMPK using a catalytic subunit AMPKα2 (PRKAA2 gene) as the bait, based on CRISPR/Cas9-mediated genome editing coupled to stable isotope labeling in cell culture, multidimensional protein identification technology, and computational and statistical analyses. Furthermore, we directly compare this genetic epitope tagging approach to endogenous immunoprecipitations of the same gene under homologous conditions to assess differences in observed interactors. Additionally, we directly compared each enrichment strategy in the genetically modified cell-line with two separate endogenous antibodies. For each approach, we analyzed the interaction profiles of this protein complex under basal and activated states, and after implementing the same analytical, computational, and statistical analyses, we found that high-confidence protein interactors vary greatly with each method and between commercially available endogenous antibodies.
12
Proteomics INTegrator (PINT): An Online Tool To Store, Query, and Visualize Large Proteomics Experiment Results. J Proteome Res 2019; 18:2999-3008. [PMID: 31260318 PMCID: PMC8278777 DOI: 10.1021/acs.jproteome.8b00711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
The characterization of complex biological systems based on high-throughput protein quantification through mass spectrometry commonly involves differential expression analysis between replicate samples originating from different experimental conditions. Here we present Proteomics INTegrator (PINT), a new user-friendly Web-based platform-independent system to store, visualize, and query proteomics experiment results. PINT provides an extremely flexible query interface that allows advanced Boolean algebra-based data filtering of many different proteomics features such as confidence values, abundance levels or ratios, data set overlaps, sample characteristics, as well as UniProtKB annotations, which are transparently incorporated into the system. In addition, PINT allows developers to incorporate data visualization and analysis tools, such as PSEA-Quant and Reactome pathway analysis, for data set enrichment analysis. PINT serves as a centralized hub for large-scale proteomics data and as a platform for data analysis, facilitating the interpretation of proteomics results and expediting biologically relevant conclusions.
13
Abstract
Kinases are a major clinical target for human diseases. Identifying the proteins that interact with kinases in vivo will provide information on unreported substrates and will potentially lead to more specific methods for therapeutic kinase regulation. Here, endogenous immunoprecipitations of evolutionarily distinct kinases (i.e., Akt, ERK2, and CAMK2) from rodent hippocampi were analyzed by mass spectrometry to generate three highly confident kinase protein-protein interaction networks. Proteins of similar function were identified in the networks, suggesting a universal model for kinase signaling complexes. Protein interactions were observed between kinases with reported symbiotic relationships. The kinase networks were significantly enriched in genes associated with specific neurodevelopmental disorders, providing novel structural connections between these disease-associated genes. To demonstrate a functional relationship between the kinases and the network, pharmacological manipulation of Akt in hippocampal slices was shown to regulate the activity of potassium/sodium hyperpolarization-activated cyclic nucleotide-gated channel (HCN1), which was identified in the Akt network. Overall, the kinase protein-protein interaction networks provide molecular insight into the spatial complexity of in vivo kinase signal transduction, which is required to achieve the therapeutic potential of kinase manipulation in the brain.
14
Understanding molecular mechanisms of disease through spatial proteomics. Curr Opin Chem Biol 2018; 48:19-25. [PMID: 30308467 DOI: 10.1016/j.cbpa.2018.09.016] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 08/11/2018] [Revised: 09/17/2018] [Accepted: 09/19/2018] [Indexed: 02/07/2023]
Abstract
Mammalian cells are organized into different compartments that separate and facilitate physiological processes by providing specialized local environments and allowing different, otherwise incompatible biological processes to be carried out simultaneously. Proteins are targeted to these subcellular locations where they fulfill specialized, compartment-specific functions. Spatial proteomics aims to localize and quantify proteins within subcellular structures.
15
PACOM: A Versatile Tool for Integrating, Filtering, Visualizing, and Comparing Multiple Large Mass Spectrometry Proteomics Data Sets. J Proteome Res 2018; 17:1547-1558. [DOI: 10.1021/acs.jproteome.7b00858] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 02/03/2023]
16
Increased proteomic complexity in Drosophila hybrids during development. Sci Adv 2018; 4:eaao3424. [PMID: 29441361 PMCID: PMC5810618 DOI: 10.1126/sciadv.aao3424] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 07/10/2017] [Accepted: 01/11/2018] [Indexed: 06/08/2023]
Abstract
Cellular proteomes are thought to be optimized for function, leaving no room for proteome plasticity and, thus, evolution. However, hybrid animals that result from a viable cross of two different species harbor hybrid proteomes of unknown complexity. We charted the hybrid proteome of a viable cross between Drosophila melanogaster females and Drosophila simulans males with bottom-up proteomics. Developing hybrids harbored 20% novel proteins in addition to proteins that were also present in either parental species. In contrast, adult hybrids and developmentally failing embryos of the reciprocal cross showed fewer additional proteins (5 and 6%, respectively). High levels of heat shock proteins, proteasome-associated proteins, and proteasomal subunits indicated that proteostasis sustains the expanded complexity of the proteome in developing hybrids. We conclude that increased proteostasis gives way to proteomic plasticity and thus opens up additional space for rapid phenotypic variation during embryonic development.
17
The mzIdentML Data Standard Version 1.2, Supporting Advances in Proteome Informatics. Mol Cell Proteomics 2017; 16:1275-1285. [PMID: 28515314 PMCID: PMC5500760 DOI: 10.1074/mcp.m117.068429] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Received: 03/17/2017] [Revised: 05/15/2017] [Indexed: 12/31/2022]
Abstract
The first stable version of the Proteomics Standards Initiative mzIdentML open data standard (version 1.1) was published in 2012, capturing the outputs of peptide and protein identification software. In the intervening years, the standard has become well-supported in both commercial and open software, as well as a submission and download format for public repositories. Here we report a new release of mzIdentML (version 1.2) that is required to keep pace with emerging practice in proteome informatics. New features have been added to support: (1) scores associated with localization of modifications on peptides; (2) statistics performed at the level of peptides; (3) identification of cross-linked peptides; and (4) support for proteogenomics approaches. In addition, there is now improved support for the encoding of de novo sequencing of peptides, spectral library searches, and protein inference. As a key point, the underlying XML schema has only undergone very minor modifications to simplify as much as possible the transition from version 1.1 to version 1.2 for implementers, but there have been several notable updates to the format specification, implementation guidelines, controlled vocabularies and validation software. mzIdentML 1.2 can be described as backwards compatible, in that reading software designed for mzIdentML 1.1 should function in most cases without adaptation. We anticipate that these developments will provide a continued stable base for software teams working to implement the standard. All the related documentation is accessible at http://www.psidev.info/mzidentml.
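The backward-compatibility claim can be illustrated with a small Python sketch of a reader that matches elements by local name, so the same code handles files declared under either version. This is only an illustration under stated assumptions: the embedded XML is a toy fragment and not a schema-valid mzIdentML document, the two namespace URIs are assumed values (the code does not depend on them), and `passing_items` is a hypothetical helper, not part of any official library.

```python
import xml.etree.ElementTree as ET

# Toy fragment for illustration only; NOT a schema-valid mzIdentML file.
TOY_DOC = """<MzIdentML xmlns="{ns}" version="{ver}">
  <SpectrumIdentificationResult id="SIR_1" spectrumID="scan=1">
    <SpectrumIdentificationItem id="SII_1" rank="1" chargeState="2" passThreshold="true"/>
    <SpectrumIdentificationItem id="SII_2" rank="2" chargeState="2" passThreshold="false"/>
  </SpectrumIdentificationResult>
</MzIdentML>"""

# Assumed namespace URIs for the two versions; the reader below ignores them.
NS_11 = "http://psidev.info/psi/pi/mzIdentML/1.1"
NS_12 = "http://psidev.info/psi/pi/mzIdentML/1.2"


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def passing_items(xml_text):
    """Return ids of SpectrumIdentificationItem elements with passThreshold='true'."""
    root = ET.fromstring(xml_text)
    return [
        el.get("id")
        for el in root.iter()
        if local_name(el.tag) == "SpectrumIdentificationItem"
        and el.get("passThreshold") == "true"
    ]


for ns, ver in ((NS_11, "1.1.0"), (NS_12, "1.2.0")):
    print(ver, passing_items(TOY_DOC.format(ns=ns, ver=ver)))
```

Matching on local names is one simple design choice for tolerating a namespace change; a production reader would more likely validate against the published schema and controlled vocabularies.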
18
Global quantitative analysis of phosphorylation underlying phencyclidine signaling and sensorimotor gating in the prefrontal cortex. Mol Psychiatry 2016; 21:205-15. [PMID: 25869802 PMCID: PMC4605830 DOI: 10.1038/mp.2015.41] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 11/10/2014] [Revised: 01/27/2015] [Accepted: 03/02/2015] [Indexed: 01/09/2023]
Abstract
Prepulse inhibition (PPI) is an example of sensorimotor gating, and deficits in PPI have been demonstrated in schizophrenia patients. Phencyclidine (PCP) suppression of PPI in animals has been studied to elucidate the pathological elements of schizophrenia. However, the molecular mechanisms underlying PCP treatment or PPI in the brain are still poorly understood. In this study, quantitative phosphoproteomic analysis was performed on the prefrontal cortex from rats that were subjected to PPI after being systemically injected with PCP or saline. PCP-downregulated phosphorylation events were significantly enriched in proteins associated with long-term potentiation (LTP). Importantly, this data set identifies functionally novel phosphorylation sites on known LTP-associated signaling molecules. In addition, mutagenesis of a significantly altered phosphorylation site on xCT (SLC7A11), the light chain of system xc-, the cystine/glutamate antiporter, suggests that PCP also regulates the activity of this protein. Finally, new insights were also derived on PPI signaling independent of PCP treatment. This is the first quantitative phosphorylation proteomic analysis providing new molecular insights into sensorimotor gating.
19
ΔF508 CFTR interactome remodelling promotes rescue of cystic fibrosis. Nature 2015; 528:510-6. [PMID: 26618866 PMCID: PMC4826614 DOI: 10.1038/nature15729] [Citation(s) in RCA: 183] [Impact Index Per Article: 20.3] [Received: 09/10/2013] [Accepted: 09/14/2015] [Indexed: 12/16/2022]
Abstract
Deletion of phenylalanine 508 of the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) is the major cause of Cystic Fibrosis (CF), one of the most common inherited childhood diseases. The mutated CFTR anion channel is not fully glycosylated and shows minimal activity in bronchial epithelial cells of CF patients. Low temperature or inhibition of histone deacetylases (HDACi) can partially rescue ΔF508 CFTR cellular processing defects and function. A favorable change of ΔF508 CFTR protein-protein interactions was proposed as a mechanism of rescue; however, CFTR interactome dynamics during temperature shift and HDACi rescue are unknown. Here, we report the first comprehensive analysis of the wt and ΔF508 CFTR interactome and its dynamics during temperature shift and HDACi. By using a novel deep proteomic analysis method (CoPIT), we identified 638 individual high-confidence CFTR interactors and discovered a mutation-specific interactome, which is extensively remodeled upon rescue. Detailed analysis of the interactome remodeling identified key novel interactors, whose loss promoted enhanced CFTR channel function in primary CF epithelia or which were critical for normal CFTR biogenesis. Our results demonstrate that global remodeling of ΔF508 CFTR interactions is crucial for rescue, and provide comprehensive insight into the molecular disease mechanisms of CF caused by deletion of F508.
20
From raw data to biological discoveries: a computational analysis pipeline for mass spectrometry-based proteomics. J Am Soc Mass Spectrom 2015; 26:1820-1826. [PMID: 26002791 PMCID: PMC4607643 DOI: 10.1007/s13361-015-1161-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 01/30/2015] [Revised: 04/03/2015] [Accepted: 04/05/2015] [Indexed: 06/04/2023]
Abstract
In the last two decades, computational tools for mass spectrometry-based proteomics data analysis have evolved from a few stand-alone software solutions serving specific goals, such as the identification of amino acid sequences based on mass spectrometry spectra, to large-scale complex pipelines integrating multiple computer programs to solve a collection of problems. This software evolution has been mostly driven by the appearance of novel technologies that allowed the community to tackle complex biological problems, such as the identification of proteins that are differentially expressed in two samples under different conditions. The achievement of such objectives requires a large suite of programs to analyze the intricate mass spectrometry data. Our laboratory addresses complex proteomics questions by producing and using algorithms and software packages. Our current computational pipeline includes, among other things, tools for mass spectrometry raw data processing, peptide and protein identification and quantification, post-translational modification analysis, and protein functional enrichment analysis. In this paper, we describe a suite of software packages we have developed to process mass spectrometry-based proteomics data and we highlight some of the new features of previously published programs as well as tools currently under development.
21
Abstract
Quantification of proteomes by mass spectrometry has proven to be useful to study human pathology recapitulated in cellular or animal models of disease. Enriching and quantifying newly synthesized proteins (NSPs) at set time points by mass spectrometry has the potential to identify important early regulatory or expression changes associated with disease states or perturbations. NSP can be enriched from proteomes by employing pulsed introduction of the noncanonical amino acid, azidohomoalanine (AHA). We demonstrate that pulsed introduction of AHA in the feed of mice can label and identify NSP from multiple tissues. Furthermore, we quantitate differences in new protein expression resulting from CRE-LOX initiated knockout of LKB1 in mouse livers. Overall, the PALM strategy allows for the first time in vivo labeling of mouse tissues to differentiate protein synthesis rates at discrete time points.
22
Multicenter experiment for quality control of peptide-centric LC-MS/MS analysis - A longitudinal performance assessment with nLC coupled to orbitrap MS analyzers. J Proteomics 2015; 127:264-74. [PMID: 25982386 DOI: 10.1016/j.jprot.2015.05.012] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 02/03/2015] [Revised: 05/07/2015] [Accepted: 05/11/2015] [Indexed: 11/19/2022]
Abstract
Proteomic technologies based on mass spectrometry (MS) have greatly evolved in the past years, and nowadays it is possible to routinely identify thousands of peptides from complex biological samples in a single LC-MS/MS experiment. Despite the advancements in proteomic technologies, the scientific community still faces important challenges in terms of depth and reproducibility of proteomics analyses. Here, we present a multicenter study designed to evaluate long-term performance of LC-MS/MS platforms within the Spanish Proteomics Facilities Network (ProteoRed-ISCIII). The study was performed under well-established standard operating procedures, and demonstrated that it is possible to attain qualitative and quantitative reproducibility over time. Our study highlights the importance of deploying quality assessment metrics routinely in individual laboratories and in multi-laboratory studies. The mass spectrometry data have been deposited to the ProteomeXchange Consortium with the data set identifier PXD000205. This article is part of a Special Issue entitled: HUPO 2014.
Collapse
|
23
|
Quantitative Proteomics of Human Fibroblasts with I1061T Mutation in Niemann-Pick C1 (NPC1) Protein Provides Insights into the Disease Pathogenesis. Mol Cell Proteomics 2015; 14:1734-49. [PMID: 25873482 DOI: 10.1074/mcp.m114.045609] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 10/14/2014] [Indexed: 11/06/2022] Open
Abstract
Niemann-Pick type C (NPC) disease is a fatal neurodegenerative disorder characterized by the accumulation of unesterified cholesterol in the late endosomal/lysosomal compartments. Mutations in the NPC1 protein are implicated in 95% of patients with NPC disease. The most prevalent mutation is the missense mutation I1061T, which occurs in ∼15-20% of the disease alleles. In our study, an isobaric labeling-based quantitative analysis of the proteome of NPC1(I1061T) primary fibroblasts, compared with wild-type cells, identified 281 differentially expressed proteins based on stringent data analysis criteria. Gene ontology enrichment analysis revealed that these proteins play important roles in diverse cellular processes such as protein maturation, energy metabolism, metabolism of reactive oxygen species, antioxidant activity, steroid metabolism, lipid localization, and apoptosis. The relative expression level of a subset of differentially expressed proteins (TOR4A, DHCR24, CLGN, SOD2, CHORDC1, HSPB7, and GAA) was independently and successfully substantiated by Western blotting. We observed that treating NPC1(I1061T) cells with four classes of seven different compounds that are potential NPC drugs increased the expression level of SOD2 and DHCR24. We have also shown an abnormal accumulation of glycogen in NPC1(I1061T) fibroblasts, possibly triggered by defective processing of lysosomal alpha-glucosidase. Our study provides a starting point for future, more focused investigations to better understand the mechanisms by which the reported dysregulated proteins trigger the pathological cascade in NPC and, furthermore, their effect upon therapeutic interventions.
|
24
|
A standardized framing for reporting protein identifications in mzIdentML 1.2. Proteomics 2014; 14:2389-99. [PMID: 25092112 DOI: 10.1002/pmic.201400080] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 03/04/2014] [Revised: 07/02/2014] [Accepted: 07/31/2014] [Indexed: 11/09/2022]
Abstract
Inferring which protein species have been detected in bottom-up proteomics experiments has been a challenging problem for which solutions have been maturing over the past decade. While many inference approaches now function well in isolation, comparing and reconciling the results generated across different tools remains difficult. It presently stands as one of the greatest barriers in collaborative efforts such as the Human Proteome Project and public repositories such as the PRoteomics IDEntifications (PRIDE) database. Here we present a framework for reporting protein identifications that seeks to improve capabilities for comparing results generated by different inference tools. This framework standardizes the terminology for describing protein identification results, associated with the HUPO-Proteomics Standards Initiative (PSI) mzIdentML standard, while still allowing for differing methodologies to reach that final state. It is proposed that developers of software for reporting identification results will adopt this terminology in their outputs. While the new terminology does not require any changes to the core mzIdentML model, it represents a significant change in practice, and, as such, the rules will be released via a new version of the mzIdentML specification (version 1.2) so that consumers of files are able to determine whether the new guidelines have been adopted by export software.
|
25
|
The Minimal Information about a Proteomics Experiment (MIAPE) from the Proteomics Standards Initiative. Methods Mol Biol 2014; 1072:765-80. [PMID: 24136562 DOI: 10.1007/978-1-62703-631-3_53] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 12/13/2022]
Abstract
During the last 10 years, the Proteomics Standards Initiative from the Human Proteome Organization (HUPO-PSI) has worked on defining standards for proteomics data representation as well as guidelines that state the minimum information that should be included when reporting a proteomics experiment (MIAPE). Such minimum information must describe the complete experiment, including both experimental protocols and data processing methods, allowing a critical evaluation of the whole process and the potential recreation of the work. In this chapter we describe the standardization work performed by the HUPO-PSI, and then we concentrate on the MIAPE guidelines, highlighting their importance when publishing proteomics experiments, particularly in specialized proteomics journals. Finally, we describe existing bioinformatics resources that generate MIAPE-compliant reports or that check proteomics data files for MIAPE compliance.
|
26
|
Surfing transcriptomic landscapes. A step beyond the annotation of chromosome 16 proteome. J Proteome Res 2013; 13:158-72. [PMID: 24138474 DOI: 10.1021/pr400721r] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Indexed: 01/16/2023]
Abstract
The Spanish team of the Human Proteome Project (SpHPP) marked the annotation of Chr16 and data analysis as one of its priorities. Precise annotation of Chromosome 16 proteins according to C-HPP criteria is presented. Moreover, Human Body Map 2.0 RNA-Seq and Encyclopedia of DNA Elements (ENCODE) data sets were used to obtain further information relative to cell/tissue specific chromosome 16 coding gene expression patterns and to infer the presence of missing proteins. Twenty-four shotgun 2D-LC-MS/MS and gel/LC-MS/MS MIAPE compliant experiments, representing 41% coverage of chromosome 16 proteins, were performed. Furthermore, mapping of large-scale multicenter mass spectrometry data sets from CCD18, MCF7, Jurkat, and Ramos cell lines into RNA-Seq data allowed further insights relative to correlation of chromosome 16 transcripts and proteins. Detection and quantification of chromosome 16 proteins in biological matrices by SRM procedures are also primary goals of the SpHPP. Two strategies were undertaken: one focused on known proteins, taking advantage of MS data already available, and the second, aimed at the detection of the missing proteins, is based on the expression of recombinant proteins to gather MS information and optimize SRM methods that will be used in real biological samples. SRM methods for 49 known proteins and for recombinant forms of 24 missing proteins are reported in this study.
|
27
|
Tools (Viewer, Library and Validator) that facilitate use of the peptide and protein identification standard format, termed mzIdentML. Mol Cell Proteomics 2013; 12:3026-35. [PMID: 23813117 PMCID: PMC3820921 DOI: 10.1074/mcp.o113.029777] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Indexed: 11/30/2022] Open
Abstract
The Proteomics Standards Initiative has recently released the mzIdentML data standard for representing peptide and protein identification results, for example, created by a search engine. When a new standard format is produced, it is important that software tools are available that make it straightforward for laboratory scientists to use it routinely and for bioinformaticians to embed support in their own tools. Here we report the release of several open-source Java-based software packages based on mzIdentML: ProteoIDViewer, mzidLibrary, and mzidValidator. The ProteoIDViewer is a desktop application allowing users to visualize mzIdentML-formatted results originating from any appropriate identification software; it supports visualization of all the features of the mzIdentML format. The mzidLibrary is a software library containing routines for importing data from external search engines, post-processing identification data (such as false discovery rate calculations), combining results from multiple search engines, performing protein inference, setting identification thresholds, and exporting results from mzIdentML to plain text files. The mzidValidator is able to process files and report warnings or errors if files are not correctly formatted or contain some semantic error. We anticipate that these developments will simplify adoption of the new standard in proteomics laboratories and the integration of mzIdentML into other software tools. All three tools are freely available in the public domain.
|
28
|
Guidelines for reporting quantitative mass spectrometry based experiments in proteomics. J Proteomics 2013; 95:84-8. [PMID: 23500130 DOI: 10.1016/j.jprot.2013.02.026] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Received: 02/17/2013] [Revised: 02/25/2013] [Accepted: 02/27/2013] [Indexed: 10/27/2022]
Abstract
Mass spectrometry is already a well-established protein identification tool, and recent methodological and technological developments have also made possible the extraction of quantitative data of protein abundance in large-scale studies. Several strategies for absolute and relative quantitative proteomics and the statistical assessment of quantifications are possible, each having specific measurements and, therefore, different data analysis workflows. The guidelines for Mass Spectrometry Quantification allow the description of a wide range of quantitative approaches, including labeled and label-free techniques and also targeted approaches such as Selected Reaction Monitoring (SRM).
BIOLOGICAL SIGNIFICANCE: The HUPO Proteomics Standards Initiative (HUPO-PSI) has invested considerable efforts to improve the standardization of proteomics data handling, representation and sharing through the development of data standards, reporting guidelines, controlled vocabularies and tooling. In this manuscript, we describe a key output from the HUPO-PSI, namely the MIAPE Quant guidelines, which have been developed in parallel with the corresponding data exchange format mzQuantML [1]. The MIAPE Quant guidelines describe the HUPO-PSI proposal concerning the minimum information to be reported when a quantitative data set, derived from mass spectrometry (MS), is submitted to a database or as supplementary information to a journal. The guidelines have been developed with input from a broad spectrum of stakeholders in the proteomics field to represent a true consensus view of the most important data types and metadata required for a quantitative experiment to be analyzed critically or a data analysis pipeline to be reproduced. It is anticipated that they will influence or be directly adopted as part of journal guidelines for publication and by public proteomics databases and thus may have an impact on proteomics laboratories across the world.
This article is part of a Special Issue entitled: Standardization and Quality Control.
|
29
|
Abstract
The Chromosome 16 Consortium forms part of the Human Proteome Project that aims to develop an entire map of the proteins encoded by the human genome following a chromosome-centric strategy (C-HPP) to make progress in the understanding of human biology in health and disease (B/D-HPP). A Spanish consortium of 16 laboratories was organized into five working groups: Protein/Antibody microarrays, protein expression and Peptide Standard, S/MRM, Protein Sequencing, Bioinformatics and Clinical healthcare, and Biobanking. The project is conceived on a multicenter configuration, assuming the standards and integration procedures already available in ProteoRed-ISCIII, which is encompassed within HUPO initiatives. The products of the 870 protein coding genes in chromosome 16 were analyzed in Jurkat T lymphocyte cells, MCF-7 epithelial cells, and the CCD18 fibroblast cell line, as it is theoretically expected that most chromosome 16 protein coding genes are expressed in at least one of these. The transcriptome and proteome of these cell lines were studied using gene expression microarray and shotgun proteomics approaches, indicating an ample coverage of chromosome 16. With regard to the B/D section, the main research areas have been adopted and a biobanking initiative has been designed to optimize methods for sample collection, management, and storage under normalized conditions and to define QC standards. The general strategy of the Chr-16 HPP and the current state of the different initiatives are discussed.
|
30
|
The ProteoRed MIAPE web toolkit: a user-friendly framework to connect and share proteomics standards. Mol Cell Proteomics 2012; 10:M111.008334. [PMID: 21983993 DOI: 10.1074/mcp.m111.008334] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 11/06/2022] Open
Abstract
The development of the HUPO-PSI's (Proteomics Standards Initiative) standard data formats and MIAPE (Minimum Information About a Proteomics Experiment) guidelines should improve proteomics data sharing within the scientific community. Proteomics journals have encouraged the use of these standards and guidelines to improve the quality of experimental reporting and ease the evaluation and publication of manuscripts. However, there is an evident lack of bioinformatics tools specifically designed to create and edit standard file formats and reports, or embed them within proteomics workflows. In this article, we describe a new web-based software suite (The ProteoRed MIAPE web toolkit) that performs several complementary roles related to proteomic data standards. First, it can verify that the reports fulfill the minimum information requirements of the corresponding MIAPE modules, highlighting inconsistencies or missing information. Second, the toolkit can convert several XML-based data standards directly into human readable MIAPE reports stored within the ProteoRed MIAPE repository. Finally, it can also perform the reverse operation, allowing users to export from MIAPE reports into XML files for computational processing, data sharing, or public database submission. The toolkit is thus the first application capable of automatically linking the PSI's MIAPE modules with the corresponding XML data exchange standards, enabling bidirectional conversions. This toolkit is freely available at http://www.proteored.org/MIAPE/.
|
31
|
Abstract
Quantitative proteomics using stable isotopic 16O/18O labeling has emerged as a very powerful tool, since it has a number of advantages over other methods, including the simplicity of chemistry, the constant mass tag at the C termini and its general applicability. However, due to the small mass difference between labeled and unlabeled peptide species, this approach has usually been restricted to high-resolution mass spectrometers. In this study we explored whether the high-resolution scanning mode, together with the extremely high scanning speed of the linear ion trap (IT), allows the 16O/18O-labeling method to be used for accurate, large-scale quantitative analysis of proteomes. A protocol, including digestion, desalting, labeling, MS and quantitative analysis, was developed and tested using protein standards and whole proteome extracts. Using this method we were able to identify and quantify 140 proteins from only 10 μg of a proteome extract from mesenchymal stem cells. Relative expression changes larger than twofold can be identified with this method at the 95% confidence level. Our results demonstrate that accurate quantitative analysis using 16O/18O labeling can be performed in practice using linear IT MS, without compromising large-scale peptide identification efficiency.
|
32
|
Properties of average score distributions of SEQUEST: the probability ratio method. Mol Cell Proteomics 2008; 7:1135-45. [PMID: 18303013 DOI: 10.1074/mcp.m700239-mcp200] [Citation(s) in RCA: 117] [Impact Index Per Article: 7.3] [Indexed: 12/15/2022] Open
Abstract
High throughput identification of peptides in databases from tandem mass spectrometry data is a key technique in modern proteomics. Common approaches to interpret large scale peptide identification results are based on the statistical analysis of average score distributions, which are constructed from the set of best scores produced by large collections of MS/MS spectra by using search engines such as SEQUEST. Other approaches calculate individual peptide identification probabilities on the basis of theoretical models or from single-spectrum score distributions constructed by the set of scores produced by each MS/MS spectrum. In this work, we study the mathematical properties of average SEQUEST score distributions by introducing the concept of spectrum quality and expressing these average distributions as compositions of single-spectrum distributions. We predict and demonstrate in practice that average score distributions are dominated by the quality distribution in the spectra collection, except in the low probability region, where it is possible to predict the dependence of average probability on database size. Our analysis leads to a novel indicator, the probability ratio, which takes optimally into account the statistical information provided by the first and second best scores. The probability ratio is a non-parametric and robust indicator that makes spectra classification according to parameters such as charge state unnecessary and allows a peptide identification performance, on the basis of false discovery rates, that is better than that obtained by other empirical statistical approaches. The probability ratio also compares favorably with statistical probability indicators obtained by the construction of single-spectrum SEQUEST score distributions.
These results make the robustness, conceptual simplicity, and ease of automation of the probability ratio algorithm a very attractive alternative to determine peptide identification confidences and error rates in high throughput experiments.
|
33
|
Statistical Model for Large-Scale Peptide Identification in Databases from Tandem Mass Spectra Using SEQUEST. Anal Chem 2004; 76:6853-60. [PMID: 15571333 DOI: 10.1021/ac049305c] [Citation(s) in RCA: 89] [Impact Index Per Article: 4.5] [Indexed: 11/29/2022]
Abstract
Recent technological advances have made multidimensional peptide separation techniques coupled with tandem mass spectrometry the method of choice for high-throughput identification of proteins. Due to these advances, the development of software tools for large-scale, fully automated, unambiguous peptide identification is highly necessary. In this work, we have used as a model the nuclear proteome from Jurkat cells and present a processing algorithm that allows accurate predictions of random matching distributions, based on the two SEQUEST scores Xcorr and DeltaCn. Our method permits a very simple and precise calculation of the probabilities associated with individual peptide assignments, as well as of the false discovery rate among the peptides identified in any experiment. A further mathematical analysis demonstrates that the score distributions are highly dependent on database size and precursor mass window and suggests that the probability associated with SEQUEST scores depends on the number of candidate peptide sequences available for the search. Our results highlight the importance of adjusting the filtering criteria to discriminate between correct and incorrect peptide sequences according to the circumstances of each particular experiment.
|