1
|
Skawinski CLS, Shah PS. I'm Walking into Spiderwebs: Making Sense of Protein-Protein Interaction Data. J Proteome Res 2024. [PMID: 38556766 DOI: 10.1021/acs.jproteome.3c00892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2024]
Abstract
Protein-protein interactions (PPIs) are at the heart of the molecular landscape permeating life. Proteomics studies can explore this protein interaction landscape using mass spectrometry (MS). Thanks to their high sensitivity, mass spectrometers can easily identify thousands of proteins within a single sample, but that same sensitivity generates tangled spiderwebs of data that hide biologically relevant findings. So, what does a researcher do when she finds herself walking into spiderwebs? In a field focused on discovery, MS data require rigor in their analysis, experimental validation, or a combination of both. In this Review, we provide a brief primer on MS-based experimental methods to identify PPIs. We discuss approaches to analyze the resulting data and remove the proteomic background. We consider the advantages between comprehensive and targeted studies. We also discuss how scoring might be improved through AI-based protein structure information. Women have been essential to the development of proteomics, so we will specifically highlight work by women that has made this field thrive in recent years.
Affiliation(s)
- Chase L S Skawinski
  - Department of Chemical Engineering, University of California, Davis 95616, California, United States
- Priya S Shah
  - Department of Chemical Engineering, University of California, Davis 95616, California, United States
  - Department of Microbiology and Molecular Genetics, University of California, Davis 95616, California, United States
|
2
|
Li B, Altelaar M, van Breukelen B. Identification of Protein Complexes by Integrating Protein Abundance and Interaction Features Using a Deep Learning Strategy. Int J Mol Sci 2023; 24:ijms24097884. [PMID: 37175590 PMCID: PMC10178578 DOI: 10.3390/ijms24097884] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/23/2023] [Revised: 04/23/2023] [Accepted: 04/24/2023] [Indexed: 05/15/2023] Open
Abstract
Many essential cellular functions are carried out by multi-protein complexes that can be characterized by their protein-protein interactions. The interactions between protein subunits are critically dependent on the strengths of their interactions and their cellular abundances, both of which span orders of magnitude. Despite many efforts devoted to the global discovery of protein complexes by integrating large-scale protein abundance and interaction features, there is still room for improvement. Here, we integrated >7000 quantitative proteomic samples with three published affinity purification/co-fractionation mass spectrometry datasets into a deep learning framework to predict protein-protein interactions (PPIs), followed by the identification of protein complexes using a two-stage clustering strategy. Our deep-learning-technique-based classifier significantly outperformed recently published machine learning prediction models and in the process captured 5010 complexes containing over 9000 unique proteins. The vast majority of proteins in our predicted complexes exhibited low or no tissue specificity, which is an indication that the observed complexes tend to be ubiquitously expressed throughout all cell types and tissues. Interestingly, our combined approach increased the model sensitivity for low abundant proteins, which amongst other things allowed us to detect the interaction of MCM10, which connects to the replicative helicase complex via the MCM6 protein. The integration of protein abundances and their interaction features using a deep learning approach provided a comprehensive map of protein-protein interactions and a unique perspective on possible novel protein complexes.
Affiliation(s)
- Bohui Li
  - Biomolecular Mass Spectrometry and Proteomics, Padualaan 8, 3584 CH Utrecht, The Netherlands
  - Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Universiteitsweg 99, 3584 CG Utrecht, The Netherlands
- Maarten Altelaar
  - Biomolecular Mass Spectrometry and Proteomics, Padualaan 8, 3584 CH Utrecht, The Netherlands
  - Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Universiteitsweg 99, 3584 CG Utrecht, The Netherlands
  - Mass Spectrometry and Proteomics Facility, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
- Bas van Breukelen
  - Biomolecular Mass Spectrometry and Proteomics, Padualaan 8, 3584 CH Utrecht, The Netherlands
  - Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Universiteitsweg 99, 3584 CG Utrecht, The Netherlands
|
3
|
Li H, Zhang H, Jiang H. Combining power of different methods to detect associations in large data sets. Brief Bioinform 2021; 23:6447432. [PMID: 34864853 DOI: 10.1093/bib/bbab488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2021] [Revised: 10/08/2021] [Accepted: 10/25/2021] [Indexed: 11/13/2022] Open
Abstract
Exploring the relationship between factors of interest is a fundamental step for further analysis of various scientific problems, such as understanding the genetic mechanisms underlying a specific disease or analyzing brain functional connectivity. Many methods have been proposed for association analysis, and each has its own advantages, but none is suitable for all situations. This makes it difficult for practitioners to decide which one to use when facing a real problem. In this paper, we propose combining the power of different methods to detect associations in large data sets, in effect combining weaker methods into a stronger one. Numerical results from a simulation study and real data applications show that our new framework is powerful. Importantly, the framework can also be applied to other problems. Availability: The R script is available at https://jiangdata.github.io/resources/DM.zip.
Affiliation(s)
- He Li
  - Polytechnic Institute of Zhejiang University, Zhejiang University, Hangzhou, China
- Hangxiao Zhang
  - Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
- Hangjin Jiang
  - Center for Data Science, Zhejiang University, Hangzhou, China
|
4
|
Treimer E, Niedermayer K, Schumann S, Zenker M, Schmeisser MJ, Kühl SJ. Galloway-Mowat syndrome: New insights from bioinformatics and expression during Xenopus embryogenesis. Gene Expr Patterns 2021; 42:119215. [PMID: 34619372 DOI: 10.1016/j.gep.2021.119215] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/11/2021] [Revised: 09/15/2021] [Accepted: 10/01/2021] [Indexed: 11/16/2022]
Abstract
Galloway-Mowat syndrome (GAMOS) is a rare developmental disease. Patients suffer from congenital brain anomalies combined with renal abnormalities often resulting in an early-onset steroid-resistant nephrotic syndrome. The etiology of GAMOS has a heterogeneous genetic contribution. Mutations in more than 10 different genes have been reported in GAMOS patients. Among these are mutations in four genes encoding members of the human KEOPS (kinase, endopeptidase and other proteins of small size) complex, including OSGEP, TP53RK, TPRKB and LAGE3. Until now, these components have been functionally mainly investigated in bacteria, eukarya and archaea and in humans in the context of the discovery of its role in GAMOS, but the KEOPS complex members' expression and function during embryogenesis in vertebrates is still unknown. In this study, in silico analysis showed that both gene localization and the protein sequences of the three core KEOPS complex members Osgep, Tp53rk and Tprkb are highly conserved across different species including Xenopus laevis. In addition, we examined the spatio-temporal expression pattern of osgep, tp53rk and tprkb using RT-PCR and whole mount in situ hybridization approaches during early Xenopus development. We observed that all three genes were expressed during early embryogenesis and enriched in tissues and organs affected in GAMOS. More precisely, KEOPS complex genes are expressed in the pronephros, but also in neural tissue such as the developing brain, eye and cranial cartilage. These findings suggest that the KEOPS complex plays an important role during vertebrate embryonic development.
Affiliation(s)
- Ernestine Treimer
  - Institute for Microscopic Anatomy and Neurobiology, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany; Institute for Biochemistry and Molecular Biology, University Ulm, Ulm, Germany
- Kathrin Niedermayer
  - Institute for Biochemistry and Molecular Biology, University Ulm, Ulm, Germany
- Sven Schumann
  - Institute for Microscopic Anatomy and Neurobiology, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany
- Martin Zenker
  - Institute of Human Genetics, University Hospital Magdeburg, Magdeburg, Germany
- Michael J Schmeisser
  - Institute for Microscopic Anatomy and Neurobiology, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany; Focus Program Translational Neurosciences, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany
- Susanne J Kühl
  - Institute for Biochemistry and Molecular Biology, University Ulm, Ulm, Germany
|
5
|
Lee Y, Okita TW, Szymanski DB. A co-fractionation mass spectrometry-based prediction of protein complex assemblies in the developing rice aleurone-subaleurone. Plant Cell 2021; 33:2965-2980. [PMID: 34270775 PMCID: PMC8462808 DOI: 10.1093/plcell/koab182] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/23/2021] [Accepted: 07/01/2021] [Indexed: 06/13/2023]
Abstract
Multiprotein complexes execute and coordinate diverse cellular processes such as organelle biogenesis, vesicle trafficking, cell signaling, and metabolism. Knowledge about their composition and localization provides useful clues about the mechanisms of cellular homeostasis and system-level control. This is of great biological importance and practical significance in heterotrophic rice (Oryza sativa) endosperm and aleurone-subaleurone tissues, which are a primary source of seed vitamins and stored energy. Dozens of protein complexes have been implicated in the synthesis, transport, and storage of seed proteins, lipids, vitamins, and minerals. Mutations in protein complexes that control RNA transport result in aberrant endosperm with shrunken and floury phenotypes, significantly reducing seed yield and quality. The purpose of this study was to broadly predict protein complex composition in the aleurone-subaleurone layers of developing rice seeds using co-fractionation mass spectrometry. Following orthogonal chromatographic separations of biological replicates, thousands of protein elution profiles were subjected to distance-based clustering to enable large-scale multimerization state measurements and protein complex predictions. The predicted complexes had predicted functions across diverse functional categories, including novel heteromeric RNA binding protein complexes that may influence seed quality. This effective and open-ended proteomics pipeline provides useful clues about system-level posttranslational control during the early stages of rice seed development.
Affiliation(s)
- Youngwoo Lee
  - Department of Botany and Plant Pathology, Center for Plant Biology, Purdue University, West Lafayette, Indiana 47907, USA
- Thomas W. Okita
  - Institute of Biological Chemistry, Washington State University, Pullman, Washington 99164, USA
- Daniel B. Szymanski
  - Department of Botany and Plant Pathology, Center for Plant Biology, Purdue University, West Lafayette, Indiana 47907, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana 47907, USA
|
6
|
Swamy KBS, Schuyler SC, Leu JY. Protein Complexes Form a Basis for Complex Hybrid Incompatibility. Front Genet 2021; 12:609766. [PMID: 33633780 PMCID: PMC7900514 DOI: 10.3389/fgene.2021.609766] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/28/2020] [Accepted: 01/20/2021] [Indexed: 12/20/2022] Open
Abstract
Proteins are the workhorses of the cell and execute many of their functions by interacting with other proteins to form protein complexes. Multi-protein complexes are an admixture of subunits, change their interaction partners, and modulate their functions and cellular physiology in response to environmental changes. When two species mate, the hybrid offspring are usually inviable or sterile because large-scale differences in the genetic makeup of the two parents cause incompatible genetic interactions. Such reciprocal-sign epistasis between inter-specific alleles is not limited to incompatible interactions between a single gene pair and usually involves multiple genes. Many of these multi-locus incompatibilities show visible defects only in the presence of all the interactions, making them hard to characterize. Understanding the dynamics of the protein-protein interactions (PPIs) that lead to multi-protein complexes is better suited to characterizing multi-locus incompatibilities than the traditional approaches of genetics and molecular biology. Advances in omics technologies, which include genomics, transcriptomics, and proteomics, can help achieve this end. This is especially relevant when studying non-model organisms. Here, we discuss recent progress in the understanding of hybrid genetic incompatibility and omics technologies, and how together they have helped in characterizing protein complexes and, in turn, multi-locus incompatibilities. We also review advances in bioinformatic techniques suitable for this purpose and propose directions for leveraging the knowledge gained from model organisms to identify genetic incompatibilities in non-model organisms.
Affiliation(s)
- Krishna B. S. Swamy
  - Division of Biological and Life Sciences, School of Arts and Sciences, Ahmedabad University, Ahmedabad, India
- Scott C. Schuyler
  - Department of Biomedical Sciences, College of Medicine, Chang Gung University, Taoyuan, Taiwan
  - Division of Head and Neck Surgery, Department of Otolaryngology, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- Jun-Yi Leu
  - Institute of Molecular Biology, Academia Sinica, Taipei, Taiwan
|
7
|
Abstract
In recent biomedical studies, multidimensional profiling, which collects proteomics as well as other types of omics data on the same subjects, is getting increasingly popular. Proteomics, transcriptomics, genomics, epigenomics, and other types of data contain overlapping as well as independent information, which suggests the possibility of integrating multiple types of data to generate more reliable findings/models with better classification/prediction performance. In this chapter, a selective review is conducted on recent data integration techniques for both unsupervised and supervised analysis. The main objective is to provide the "big picture" of data integration that involves proteomics data and discuss the "intuition" beneath the recently developed approaches without invoking too many mathematical details. Potential pitfalls and possible directions for future developments are also discussed.
Affiliation(s)
- Mengyun Wu
  - School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
- Yu Jiang
  - School of Public Health, University of Memphis, Memphis, TN, USA
- Shuangge Ma
  - Department of Biostatistics, Yale School of Public Health, Yale University, New Haven, CT, USA
|
8
|
Coupling ecological network analysis with high-throughput sequencing-based surveys: Lessons from the next-generation biomonitoring project. Adv Ecol Res 2021. [DOI: 10.1016/bs.aecr.2021.10.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/03/2023]
|
9
|
Pang CNI, Ballouz S, Weissberger D, Thibaut LM, Hamey JJ, Gillis J, Wilkins MR, Hart-Smith G. Analytical Guidelines for co-fractionation Mass Spectrometry Obtained through Global Profiling of Gold Standard Saccharomyces cerevisiae Protein Complexes. Mol Cell Proteomics 2020; 19:1876-1895. [PMID: 32817346 PMCID: PMC7664123 DOI: 10.1074/mcp.ra120.002154] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/02/2020] [Revised: 07/14/2020] [Indexed: 11/06/2022] Open
Abstract
Co-fractionation MS (CF-MS) is a technique with potential to characterize endogenous and unmanipulated protein complexes on an unprecedented scale. However, this potential has been offset by a lack of guidelines for best-practice CF-MS data collection and analysis. To obtain such guidelines, this study thoroughly evaluates novel and published Saccharomyces cerevisiae CF-MS data sets using very high proteome coverage libraries of yeast gold standard complexes. A new method for identifying gold standard complexes in CF-MS data, Reference Complex Profiling, and the Extending 'Guilt-by-Association' by Degree (EGAD) R package are used for these evaluations, which are verified with concurrent analyses of published human data. By evaluating data collection designs, which involve fractionation of cell lysates, it is found that near-maximum recall of complexes can be achieved with fewer samples than published studies. Distributing sample collection across orthogonal fractionation methods, rather than a single high resolution data set, leads to particularly efficient recall. By evaluating 17 different similarity scoring metrics, which are central to CF-MS data analysis, it is found that two metrics rarely used in past CF-MS studies (Spearman and Kendall correlations) and the recently introduced Co-apex metric frequently maximize recall, whereas a popular metric, Euclidean distance, delivers poor recall. The common practice of integrating external genomic data into CF-MS data analysis is also evaluated, revealing that this practice may improve the precision and recall of known complexes but is generally unsuitable for predicting novel complexes in model organisms. If studying nonmodel organisms using orthologous genomic data, it is found that particular subsets of fractionation profiles (e.g. the lowest abundance quartile) should be excluded to minimize false discovery. These assessments are summarized in a series of universally applicable guidelines for precise, sensitive and efficient CF-MS studies of known complexes, and effective predictions of novel complexes for orthogonal experimental validation.
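As an illustration of the similarity metrics named in this abstract, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes each protein is represented by a list of abundances across fractions. The profiles are made-up values, the Kendall variant shown is tau-a (no tie correction), and `co_apex` is a deliberately simplified stand-in that only checks whether two profiles peak in the same fraction.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / total pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


def euclidean(x, y):
    """Euclidean distance between two profiles (lower means more similar)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def co_apex(x, y):
    """Simplified co-apex score: 1 if both profiles peak in the same fraction."""
    return int(x.index(max(x)) == y.index(max(y)))


# Hypothetical elution profiles across eight fractions.
profile_a = [0, 1, 5, 9, 4, 1, 0, 0]
profile_b = [0, 2, 6, 8, 3, 1, 0, 0]   # co-elutes with profile_a
profile_c = [7, 3, 1, 0, 0, 0, 2, 6]   # elutes in different fractions

for name, other in (("b", profile_b), ("c", profile_c)):
    print(name, spearman(profile_a, other), kendall_tau_a(profile_a, other),
          euclidean(profile_a, other), co_apex(profile_a, other))
```

Rank-based scores such as Spearman and Kendall depend only on the ordering of abundances across fractions, whereas Euclidean distance is sensitive to absolute abundance, which is one plausible reading of why the two families can rank protein pairs differently.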
Affiliation(s)
- Chi Nam Ignatius Pang
  - Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Sara Ballouz
  - Garvan Institute of Medical Research, Darlinghurst, Sydney, New South Wales, Australia
- Daniel Weissberger
  - School of Chemistry, University of New South Wales, Sydney, New South Wales, Australia
- Loïc M Thibaut
  - School of Mathematics and Statistics, University of New South Wales, Sydney, New South Wales, Australia
- Joshua J Hamey
  - Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Jesse Gillis
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Woodbury, New York, USA
- Marc R Wilkins
  - Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Gene Hart-Smith
  - Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia; Department of Molecular Sciences, Macquarie University, Sydney, New South Wales, Australia
|
10
|
Kerr CH, Skinnider MA, Andrews DDT, Madero AM, Chan QWT, Stacey RG, Stoynov N, Jan E, Foster LJ. Dynamic rewiring of the human interactome by interferon signaling. Genome Biol 2020; 21:140. [PMID: 32539747 PMCID: PMC7294662 DOI: 10.1186/s13059-020-02050-y] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 08/30/2019] [Accepted: 05/20/2020] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND: The type I interferon (IFN) response is an ancient pathway that protects cells against viral pathogens by inducing the transcription of hundreds of IFN-stimulated genes. Comprehensive catalogs of IFN-stimulated genes have been established across species and cell types by transcriptomic and biochemical approaches, but their antiviral mechanisms remain incompletely characterized. Here, we apply a combination of quantitative proteomic approaches to describe the effects of IFN signaling on the human proteome, and apply protein correlation profiling to map IFN-induced rearrangements in the human protein-protein interaction network. RESULTS: We identify >26,000 protein interactions in IFN-stimulated and unstimulated cells, many of which involve proteins associated with human disease and are observed exclusively within the IFN-stimulated network. Differential network analysis reveals interaction rewiring across a surprisingly broad spectrum of cellular pathways in the antiviral response. We identify IFN-dependent protein-protein interactions mediating novel regulatory mechanisms at the transcriptional and translational levels, with one such interaction modulating the transcriptional activity of STAT1. Moreover, we reveal IFN-dependent changes in ribosomal composition that act to buffer IFN-stimulated gene protein synthesis. CONCLUSIONS: Our map of the IFN interactome provides a global view of the complex cellular networks activated during the antiviral response, placing IFN-stimulated genes in a functional context, and serves as a framework to understand how these networks are dysregulated in autoimmune or inflammatory disease.
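The idea of differential network analysis mentioned in this abstract can be sketched as a comparison of edge sets between two conditions. The snippet below is a minimal illustration under that assumption, not the scoring procedure used in the study. The function names and the single-letter protein identifiers are hypothetical placeholders.

```python
def normalize(edges):
    """Treat interactions as undirected: (A, B) and (B, A) are the same edge."""
    return {frozenset(pair) for pair in edges}


def differential_network(unstimulated, stimulated):
    """Split interactions into those gained, lost, or shared after stimulation."""
    base = normalize(unstimulated)
    stim = normalize(stimulated)
    return {
        "gained": stim - base,   # seen only in the stimulated network
        "lost": base - stim,     # seen only in the unstimulated network
        "shared": stim & base,   # seen in both conditions
    }


# Hypothetical interaction lists; the letters stand in for protein identifiers.
unstimulated_edges = [("A", "B"), ("B", "C"), ("C", "D")]
stimulated_edges = [("B", "A"), ("C", "D"), ("D", "E")]

diff = differential_network(unstimulated_edges, stimulated_edges)
for label, edges in diff.items():
    print(label, sorted(tuple(sorted(edge)) for edge in edges))
```

A real analysis would typically compare scored or replicated interactions rather than plain presence or absence, so a hard set difference like this one is only a starting point.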
Affiliation(s)
- Craig H Kerr
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
  - Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada
  - Current Address: Department of Genetics, Stanford University, Stanford, CA, 94305, USA
- Michael A Skinnider
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Daniel D T Andrews
  - Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada
- Angel M Madero
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Queenie W T Chan
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- R Greg Stacey
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Nikolay Stoynov
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Eric Jan
  - Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada
- Leonard J Foster
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
  - Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada
|
11
|
Zilocchi M, Moutaoufik MT, Jessulat M, Phanse S, Aly KA, Babu M. Misconnecting the dots: altered mitochondrial protein-protein interactions and their role in neurodegenerative disorders. Expert Rev Proteomics 2020; 17:119-136. [PMID: 31986926 DOI: 10.1080/14789450.2020.1723419] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/21/2023]
Abstract
Introduction: Mitochondria (mt) are protein-protein interaction (PPI) hubs in the cell where mt-localized and associated proteins interact in a fashion critical for cell fitness. Altered mtPPIs are linked to neurodegenerative disorders (NDs) and drivers of pathological associations to mediate ND progression. Mapping altered mtPPIs will reveal how mt dysfunction is linked to NDs. Areas covered: This review discusses how database sources reflect on the number of mt protein or interaction predictions, and serves as an update on mtPPIs in mt dynamics and homeostasis. Emphasis is given to mRNA expression profiles for mt proteins in human tissues, cellular models relevant to NDs, and altered mtPPIs in NDs such as Parkinson's disease (PD), Amyotrophic lateral sclerosis (ALS) and Alzheimer's disease (AD). Expert opinion: We highlight the scarcity of biomarkers to improve diagnostic accuracy and tracking of ND progression, obstacles in recapitulating NDs using human cellular models to underpin the pathophysiological mechanisms of disease, and the shortage of mt protein interactome reference database(s) of neuronal cells. These bottlenecks are addressed by improvements in induced pluripotent stem cell creation and culturing, patient-derived 3D brain organoids to recapitulate structural arrangements of the brain, and cell sorting to elucidate mt proteome disparities between cell types.
Affiliation(s)
- Mara Zilocchi
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- Matthew Jessulat
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- Sadhna Phanse
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- Khaled A Aly
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- Mohan Babu
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
|
12
|
de Souza N, Picotti P. Mass spectrometry analysis of the structural proteome. Curr Opin Struct Biol 2020; 60:57-65. [DOI: 10.1016/j.sbi.2019.10.006] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 09/13/2019] [Accepted: 10/16/2019] [Indexed: 01/01/2023]
|
13
|
Classification of Single Particles from Human Cell Extract Reveals Distinct Structures. Cell Rep 2018; 24:259-268.e3. [PMID: 29972786 PMCID: PMC6109231 DOI: 10.1016/j.celrep.2018.06.022] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 01/12/2018] [Revised: 05/09/2018] [Accepted: 06/05/2018] [Indexed: 01/27/2023] Open
Abstract
Multi-protein complexes are necessary for nearly all cellular processes, and understanding their structure is required for elucidating their function. Current high-resolution strategies in structural biology are effective but lag behind other fields (e.g., genomics and proteomics) due to their reliance on purified samples rather than heterogeneous mixtures. Here, we present a method combining single-particle analysis by electron microscopy with protein identification by mass spectrometry to structurally characterize macromolecular complexes from human cell extract. We identify HSP60 through two-dimensional classification and obtain three-dimensional structures of native proteasomes directly from ab initio classification of a heterogeneous mixture of protein complexes. In addition, we reveal an ∼1-MDa-size structure of unknown composition and reference our proteomics data to suggest possible identities. Our study shows the power of using a shotgun approach to electron microscopy (shotgun EM) when coupled with mass spectrometry as a tool to uncover the structures of macromolecular machines.
|
14
|
Mallam AL, Sae-Lee W, Schaub JM, Tu F, Battenhouse A, Jang YJ, Kim J, Wallingford JB, Finkelstein IJ, Marcotte EM, Drew K. Systematic Discovery of Endogenous Human Ribonucleoprotein Complexes. Cell Rep 2019; 29:1351-1368.e5. [PMID: 31665645 PMCID: PMC6873818 DOI: 10.1016/j.celrep.2019.09.060] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 02/21/2019] [Revised: 08/30/2019] [Accepted: 09/18/2019] [Indexed: 12/16/2022] Open
Abstract
RNA-binding proteins (RBPs) play essential roles in biology and are frequently associated with human disease. Although recent studies have systematically identified individual RNA-binding proteins, their higher-order assembly into ribonucleoprotein (RNP) complexes has not been systematically investigated. Here, we describe a proteomics method for systematic identification of RNP complexes in human cells. We identify 1,428 protein complexes that associate with RNA, indicating that more than 20% of known human protein complexes contain RNA. To explore the role of RNA in the assembly of each complex, we identify complexes that dissociate, change composition, or form stable protein-only complexes in the absence of RNA. We use our method to systematically identify cell-type-specific RNA-associated proteins in mouse embryonic stem cells and finally, distribute our resource, rna.MAP, in an easy-to-use online interface (rna.proteincomplexes.org). Our system thus provides a methodology for explorations across human tissues, disease states, and throughout all domains of life.
Affiliation(s)
- Anna L Mallam
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA.
| | - Wisath Sae-Lee
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
| | - Jeffrey M Schaub
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
| | - Fan Tu
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
| | - Anna Battenhouse
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
| | - Yu Jin Jang
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA
| | - Jonghwan Kim
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA
| | - John B Wallingford
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
| | - Ilya J Finkelstein
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
| | - Edward M Marcotte
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA.
| | - Kevin Drew
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA.
|
15
|
A Computational Framework for Predicting Direct Contacts and Substructures within Protein Complexes. Biomolecules 2019; 9:biom9110656. [PMID: 31717703 PMCID: PMC6921016 DOI: 10.3390/biom9110656] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2019] [Revised: 10/20/2019] [Accepted: 10/23/2019] [Indexed: 11/17/2022] Open
Abstract
Understanding the physical arrangement of subunits within protein complexes potentially provides valuable clues about how the subunits work together and how the complexes function. The majority of recent research focuses on identifying protein complexes as a whole and seldom studies the inner structures within complexes. In this study, we propose a computational framework to predict direct contacts and substructures within protein complexes. In this framework, we first train a supervised learning model, an L2-regularized logistic regression, to learn the patterns of direct and indirect interactions within complexes, from which physical subunit interaction networks are predicted. Then, to infer substructures within complexes, we apply a graph clustering method (maximum modularity clustering, MMC) to partially connected networks and a gene ontology (GO) semantic similarity-based functional clustering to fully connected networks. Computational results show that the proposed framework achieves fairly good performance in cross-validation and in an independent test at detecting direct contacts between subunits. Functional analyses further support partitioning the subunits into substructures via the MMC algorithm and functional clustering.
|
16
|
Yoon G, Gaynanova I, Müller CL. Microbial Networks in SPRING - Semi-parametric Rank-Based Correlation and Partial Correlation Estimation for Quantitative Microbiome Data. Front Genet 2019; 10:516. [PMID: 31244881 PMCID: PMC6563871 DOI: 10.3389/fgene.2019.00516] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 01/18/2019] [Accepted: 05/13/2019] [Indexed: 12/15/2022] Open
Abstract
High-throughput microbial sequencing techniques, such as targeted amplicon-based and metagenomic profiling, provide low-cost genomic survey data of microbial communities in their natural environment, ranging from marine ecosystems to host-associated habitats. While standard microbiome profiling data can provide sparse relative abundances of operational taxonomic units or genes, recent advances in experimental protocols give a more quantitative picture of microbial communities by pairing sequencing-based techniques with orthogonal measurements of microbial cell counts from the same sample. These tandem measurements provide absolute microbial count data, albeit with a large excess of zeros due to limited sequencing depth. In this contribution, we consider the fundamental statistical problem of estimating correlations and partial correlations from such quantitative microbiome data. To this end, we propose a semi-parametric rank-based approach to correlation estimation that can naturally deal with the excess zeros in the data. Combining this estimator with sparse graphical modeling techniques leads to the Semi-Parametric Rank-based approach for INference in Graphical model (SPRING). SPRING enables inference of statistical microbial association networks from quantitative microbiome data, which can serve as a high-level statistical summary of the underlying microbial ecosystem and can provide testable hypotheses for functional species-species interactions. Due to the absence of verified microbial associations, we also introduce a novel quantitative microbiome data generation mechanism that mimics empirical marginal distributions of measured count data while simultaneously allowing user-specified dependencies among the variables. SPRING shows superior network recovery performance on a wide range of realistic benchmark problems with varying network topologies and is robust to misspecifications of the total cell count estimate.
To highlight SPRING's broad applicability, we infer taxon-taxon associations from the American Gut Project data and genus-genus associations from a recent quantitative gut microbiome dataset. We believe that, as quantitative microbiome profiling data become increasingly available, the semi-parametric estimators for correlation and partial correlation estimation introduced here provide an important tool for reliable statistical analysis of quantitative microbiome data.
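As a rough illustration of what "rank-based correlation estimation" can mean, the sketch below computes Kendall's tau and maps it to a latent correlation with the sine transform that holds for continuous data under a Gaussian copula. This is an assumption-laden simplification and not the SPRING estimator: the abstract does not give the formula SPRING uses to handle excess zeros, and the function names here are hypothetical.

```python
import math
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    score = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        if prod > 0:
            score += 1      # concordant pair
        elif prod < 0:
            score -= 1      # discordant pair (ties contribute 0)
    return score / (n * (n - 1) / 2)


def latent_correlation(x, y):
    """Continuous-case rank-based estimate: sin(pi/2 * tau).

    Assumption: no zero inflation, so this is NOT the zero-aware
    estimator described in the abstract.
    """
    return math.sin(math.pi / 2 * kendall_tau_a(x, y))


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0]
    b = [1.0, 3.0, 2.0, 4.0]
    print(kendall_tau_a(a, b), latent_correlation(a, b))
```

Because only ranks enter the computation, the estimate is unchanged by any strictly increasing transformation of either variable, which is the property that makes rank-based estimators attractive for skewed count data.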
Affiliation(s)
- Grace Yoon
- Department of Statistics, Texas A&M University, College Station, TX, United States
| | - Irina Gaynanova
- Department of Statistics, Texas A&M University, College Station, TX, United States
| | - Christian L. Müller
- Center for Computational Mathematics, Flatiron Institute, New York, NY, United States
|
17
|
|
18
|
Affiliation(s)
- Trey Ideker
- Department of Medicine, University of California San Diego, La Jolla, California, United States of America
| | - Ruth Nussinov
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, Maryland, United States of America
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
|