1
Pang CNI, Ballouz S, Weissberger D, Thibaut LM, Hamey JJ, Gillis J, Wilkins MR, Hart-Smith G. Analytical Guidelines for co-fractionation Mass Spectrometry Obtained through Global Profiling of Gold Standard Saccharomyces cerevisiae Protein Complexes. Mol Cell Proteomics 2020; 19:1876-1895. [PMID: 32817346 PMCID: PMC7664123 DOI: 10.1074/mcp.ra120.002154] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 06/02/2020] [Revised: 07/14/2020] [Indexed: 11/06/2022]
Abstract
Co-fractionation MS (CF-MS) is a technique with potential to characterize endogenous and unmanipulated protein complexes on an unprecedented scale. However, this potential has been offset by a lack of guidelines for best-practice CF-MS data collection and analysis. To obtain such guidelines, this study thoroughly evaluates novel and published Saccharomyces cerevisiae CF-MS data sets using very high proteome coverage libraries of yeast gold standard complexes. A new method for identifying gold standard complexes in CF-MS data, Reference Complex Profiling, and the Extending 'Guilt-by-Association' by Degree (EGAD) R package are used for these evaluations, which are verified with concurrent analyses of published human data. By evaluating data collection designs, which involve fractionation of cell lysates, it is found that near-maximum recall of complexes can be achieved with fewer samples than were used in published studies. Distributing sample collection across orthogonal fractionation methods, rather than concentrating it in a single high-resolution data set, leads to particularly efficient recall. By evaluating 17 different similarity scoring metrics, which are central to CF-MS data analysis, it is found that two metrics rarely used in past CF-MS studies (Spearman and Kendall correlations) and the recently introduced Co-apex metric frequently maximize recall, whereas a popular metric, Euclidean distance, delivers poor recall. The common practice of integrating external genomic data into CF-MS data analysis is also evaluated, revealing that this practice may improve the precision and recall of known complexes but is generally unsuitable for predicting novel complexes in model organisms. If studying nonmodel organisms using orthologous genomic data, it is found that particular subsets of fractionation profiles (e.g. the lowest abundance quartile) should be excluded to minimize false discovery. These assessments are summarized in a series of universally applicable guidelines for precise, sensitive and efficient CF-MS studies of known complexes, and effective predictions of novel complexes for orthogonal experimental validation.
Affiliation(s)
- Chi Nam Ignatius Pang
  - Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Sara Ballouz
  - Garvan Institute of Medical Research, Darlinghurst, Sydney, New South Wales, Australia
- Daniel Weissberger
  - School of Chemistry, University of New South Wales, Sydney, New South Wales, Australia
- Loïc M Thibaut
  - School of Mathematics and Statistics, University of New South Wales, Sydney, New South Wales, Australia
- Joshua J Hamey
  - Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Jesse Gillis
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Woodbury, New York, USA
- Marc R Wilkins
  - Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Gene Hart-Smith
  - Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia; Department of Molecular Sciences, Macquarie University, Sydney, New South Wales, Australia
2
Bartolec TK, Smith DL, Pang CNI, Xu YD, Hamey JJ, Wilkins MR. Cross-linking Mass Spectrometry Analysis of the Yeast Nucleus Reveals Extensive Protein-Protein Interactions Not Detected by Systematic Two-Hybrid or Affinity Purification-Mass Spectrometry. Anal Chem 2020; 92:1874-1882. [PMID: 31851481 DOI: 10.1021/acs.analchem.9b03975] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 12/16/2022]
Abstract
Saccharomyces cerevisiae has the most comprehensively characterized protein-protein interaction network, or interactome, of any eukaryote. This has predominantly been generated through multiple, systematic studies of protein-protein interactions by two-hybrid techniques and of affinity-purified protein complexes. A pressing question is how large-scale cross-linking mass spectrometry (XL-MS) can confirm and extend this interactome. Here, intact yeast nuclei were subjected to cross-linking with disuccinimidyl sulfoxide (DSSO) and analyzed using hybrid MS2-MS3 methods. XlinkX identified a total of 2,052 unique residue pair cross-links at 1% FDR. Intraprotein cross-links were found to provide extensive structural constraint data, with almost all intralinks that mapped to known structures and slightly fewer of those mapping to homology models being within 30 Å. Intralinks provided structural information for a further 366 proteins. A method for optimizing interprotein cross-link score cut-offs was developed, through use of extensive known yeast interactions. Its application led to a high-confidence yeast nuclear interactome. Strikingly, almost half of the interactions were not previously detected by two-hybrid or AP-MS techniques. Multiple lines of evidence existed for many such interactions, whether through literature or ortholog interaction data, through multiple unique interlinks between proteins, and/or through replicates. We conclude that XL-MS is a powerful means to measure interactions that complements two-hybrid and affinity-purification techniques.
Affiliation(s)
- Tara K Bartolec
  - Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales 2052, Australia
- Daniela-Lee Smith
  - Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales 2052, Australia
- Chi Nam Ignatius Pang
  - Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales 2052, Australia
- You Dan Xu
  - Centre for Advanced Macromolecular Design, School of Chemistry, University of New South Wales, Sydney, New South Wales 2052, Australia
- Joshua J Hamey
  - Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales 2052, Australia
- Marc R Wilkins
  - Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales 2052, Australia
3
Abstract
The genome-scale cellular network has become a necessary tool in the systematic analysis of microbes. A cell contains several layers (i.e., types) of molecular networks, for example, the genome-scale metabolic network (GMN), the transcriptional regulatory network (TRN), and the signal transduction network (STN). Predictions based on a single-layer network alone have been recognized as limited and inaccurate. Therefore, integrated networks constructed from the three network types are attracting growing interest. A biological process in living cells is usually carried out through the interaction of biological components. It is therefore necessary to integrate and analyze all related components at the systems level to understand physiological function in living organisms comprehensively and correctly. In this review, we discuss three representative genome-scale cellular networks, GMN, TRN, and STN, which represent different levels of a cell’s activities (i.e., metabolism, gene regulation, and cellular signaling). Furthermore, we discuss the integration of the three network types. As understanding of the complexity of microbial cells grows, the development of integrated networks has become an inevitable trend in analyzing genome-scale cellular networks of microorganisms.
Affiliation(s)
- Tong Hao
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
- Dan Wu
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
- Lingxuan Zhao
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
- Qian Wang
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
- Edwin Wang
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China; Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Jinsheng Sun
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China; Tianjin Bohai Fisheries Research Institute, Tianjin, China