1
Ahi EP. Fish Evo-Devo: Moving Toward Species-Specific and Knowledge-Based Interactome. Journal of Experimental Zoology Part B: Molecular and Developmental Evolution 2025; 344:158-168. [PMID: 40170296 DOI: 10.1002/jez.b.23287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2024] [Revised: 12/13/2024] [Accepted: 01/12/2025] [Indexed: 04/03/2025]
Abstract
A knowledge-based interactome maps interactions among proteins and molecules within a cell using experimental data, computational predictions, and literature mining. These interactomes are vital for understanding cellular functions, pathways, and the evolutionary conservation of protein interactions. They reveal how interactions regulate growth, differentiation, and development. Transitioning to functionally validated interactomes is crucial in evolutionary developmental biology (Evo-Devo), especially for non-model species, to uncover unique regulatory networks, evolutionary novelties, and reliable gene interaction models. This enhances our understanding of complex trait evolution across species. The European Evo-Devo 2024 conference in Helsinki hosted the first fish-specific Evo-Devo symposium, highlighting the growing interest in fish models. Advances in genome annotation, genome editing, imaging, and molecular screening are expanding fish Evo-Devo research. High-throughput molecular data have enabled the deduction of gene regulatory networks. The next steps involve creating species-specific interactomes, validating them functionally, and integrating additional molecular data to deepen the understanding of complex regulatory interactions in fish Evo-Devo. This short review aims to address the logical steps for this transition, as well as the necessities and limitations of this journey.
Affiliation(s)
- Ehsan Pashay Ahi
  - Organismal and Evolutionary Biology Research Programme, University of Helsinki, Helsinki, Finland
2
Nayar G, Altman RB. Heterogeneous network approaches to protein pathway prediction. Comput Struct Biotechnol J 2024; 23:2727-2739. [PMID: 39035835 PMCID: PMC11260399 DOI: 10.1016/j.csbj.2024.06.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2024] [Revised: 06/17/2024] [Accepted: 06/18/2024] [Indexed: 07/23/2024] Open
Abstract
Understanding protein-protein interactions (PPIs) and the pathways they comprise is essential for comprehending cellular functions and their links to specific phenotypes. Despite the prevalence of molecular data generated by high-throughput sequencing technologies, a significant gap remains in translating these data into functional information regarding the series of interactions that underlie phenotypic differences. In this review, we present an in-depth analysis of heterogeneous network methodologies for modeling protein pathways, highlighting the critical role of integrating multifaceted biological data. We outline the process of constructing these networks, from data representation to machine learning-driven predictions and evaluations. The review underscores the potential of heterogeneous networks in capturing the complexity of proteomic interactions, thereby offering enhanced accuracy in pathway prediction. This approach not only deepens our understanding of cellular processes but also opens up new possibilities in disease treatment and drug discovery by leveraging the predictive power of comprehensive proteomic data analysis.
Affiliation(s)
- Gowri Nayar
  - Department of Biomedical Data Science, Stanford University, United States
- Russ B. Altman
  - Department of Biomedical Data Science, Stanford University, United States
  - Department of Genetics, Stanford University, United States
  - Department of Medicine, Stanford University, United States
  - Department of Bioengineering, Stanford University, United States
3
Shrestha HK, Lee D, Wu Z, Wang Z, Fu Y, Wang X, Serrano GE, Beach TG, Peng J. Profiling Protein-Protein Interactions in the Human Brain by Refined Cofractionation Mass Spectrometry. J Proteome Res 2024; 23:1221-1231. [PMID: 38507900 PMCID: PMC11065482 DOI: 10.1021/acs.jproteome.3c00685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/22/2024]
Abstract
Proteins usually execute their biological functions through interactions with other proteins and by forming macromolecular complexes, but global profiling of protein complexes directly from human tissue samples has been limited. In this study, we utilized cofractionation mass spectrometry (CF-MS) to map protein complexes within the postmortem human brain with experimental replicates. First, we used concatenated anion and cation ion exchange chromatography (IEX) to separate native protein complexes into 192 fractions and then used data-independent acquisition (DIA) mass spectrometry to analyze the proteins in each fraction, quantifying a total of 4,804 proteins, 3,260 of which were detected in both replicates. We improved the quantitative accuracy of DIA by adding a constant amount of bovine serum albumin (BSA) to each fraction as an internal standard. Next, advanced computational pipelines, which integrate both a database-based complex analysis and an unbiased protein-protein interaction (PPI) search, were applied to identify protein complexes and construct protein-protein interaction networks in the human brain. Our study led to the identification of 486 protein complexes and 10,054 binary protein-protein interactions, which represents the first global profiling of human brain PPIs using CF-MS. Overall, this study offers a resource and tool for a wide range of human brain research, including the identification of disease-specific protein complexes in the future.
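The internal-standard idea mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the BSA signal is used numerically, so the scaling rule below (rescale each fraction so the standard matches its mean across fractions) is an assumption, and the function name `normalize_to_standard` and the example values are hypothetical.

```python
def normalize_to_standard(intensities, standard):
    """Rescale per-fraction intensities so a spiked-in standard becomes constant.

    intensities: dict mapping protein name -> list of intensities, one per fraction
    standard:    list of measured intensities of the internal standard, one per fraction

    Assumption (not from the paper): each fraction is scaled by
    mean(standard) / standard[fraction].
    """
    if not standard:
        raise ValueError("standard profile must not be empty")
    if any(s <= 0 for s in standard):
        raise ValueError("standard must be detected (> 0) in every fraction")

    reference = sum(standard) / len(standard)
    factors = [reference / s for s in standard]

    normalized = {}
    for protein, profile in intensities.items():
        if len(profile) != len(standard):
            raise ValueError(f"profile length mismatch for {protein}")
        normalized[protein] = [value * f for value, f in zip(profile, factors)]
    return normalized


# Hypothetical example: the standard reads low in fractions 1 and 3 and high in
# fraction 4, so those fractions are scaled up or down accordingly.
example = normalize_to_standard(
    {"P1": [10.0, 20.0, 30.0, 40.0]},
    [100.0, 200.0, 100.0, 400.0],
)
```

Because the same known amount of standard is added to every fraction, any fraction-to-fraction variation in its measured signal can be attributed to technical rather than biological causes, which is what makes this kind of rescaling reasonable.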
Affiliation(s)
- Him K. Shrestha
  - Departments of Structural Biology and Developmental Neurobiology
- DongGeun Lee
  - Departments of Structural Biology and Developmental Neurobiology
- Zhiping Wu
  - Departments of Structural Biology and Developmental Neurobiology
- Zhen Wang
  - Departments of Structural Biology and Developmental Neurobiology
- Yingxue Fu
  - Departments of Structural Biology and Developmental Neurobiology
  - Center for Proteomics and Metabolomics, St. Jude Children’s Research Hospital, Memphis, Tennessee, 38105, USA
- Xusheng Wang
  - Center for Proteomics and Metabolomics, St. Jude Children’s Research Hospital, Memphis, Tennessee, 38105, USA
- Thomas G. Beach
  - Banner Sun Health Research Institute, Sun City, AZ 85351, USA
- Junmin Peng
  - Departments of Structural Biology and Developmental Neurobiology
4
Skinnider MA, Akinlaja MO, Foster LJ. Mapping protein states and interactions across the tree of life with co-fractionation mass spectrometry. Nat Commun 2023; 14:8365. [PMID: 38102123 PMCID: PMC10724252 DOI: 10.1038/s41467-023-44139-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2023] [Accepted: 12/01/2023] [Indexed: 12/17/2023] Open
Abstract
We present CFdb, a harmonized resource of interaction proteomics data from 411 co-fractionation mass spectrometry (CF-MS) datasets spanning 21,703 fractions. Meta-analysis of this resource charts protein abundance, phosphorylation, and interactions throughout the tree of life, including a reference map of the human interactome. We show how large-scale CF-MS data can enhance analyses of individual CF-MS datasets, and exemplify this strategy by mapping the honey bee interactome.
Affiliation(s)
- Michael A Skinnider
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
  - Ludwig Institute for Cancer Research, Princeton University, Princeton, NJ, USA
- Mopelola O Akinlaja
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
  - Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Leonard J Foster
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
  - Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada
5
Zilocchi M, Rahmatbakhsh M, Moutaoufik MT, Broderick K, Gagarinova A, Jessulat M, Phanse S, Aoki H, Aly KA, Babu M. Co-fractionation-mass spectrometry to characterize native mitochondrial protein assemblies in mammalian neurons and brain. Nat Protoc 2023; 18:3918-3973. [PMID: 37985878 DOI: 10.1038/s41596-023-00901-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Accepted: 08/09/2023] [Indexed: 11/22/2023]
Abstract
Human mitochondrial (mt) protein assemblies are vital for neuronal and brain function, and their alteration contributes to many human disorders, e.g., neurodegenerative diseases resulting from abnormal protein-protein interactions (PPIs). Knowledge of the composition of mt protein complexes is, however, still limited. Affinity purification mass spectrometry (MS) and proximity-dependent biotinylation MS have defined protein partners of some mt proteins, but are too technically challenging and laborious to be practical for analyzing large numbers of samples at the proteome level, e.g., for the study of neuronal or brain-specific mt assemblies, as well as altered mtPPIs on a proteome-wide scale for a disease of interest in brain regions, disease tissues or neurons derived from patients. To address this challenge, we adapted a co-fractionation-MS platform to survey native mt assemblies in adult mouse brain and in human NTERA-2 embryonal carcinoma stem cells or differentiated neuronal-like cells. The workflow consists of orthogonal separations of mt extracts isolated from chemically cross-linked samples to stabilize PPIs, data-dependent acquisition MS to identify co-eluted mt protein profiles from collected fractions and a computational scoring pipeline to predict mtPPIs, followed by network partitioning to define complexes linked to mt functions as well as those essential for neuronal and brain physiological homeostasis. We developed an R/CRAN software package, Macromolecular Assemblies from Co-elution Profiles for automated scoring of co-fractionation-MS data to define complexes from mtPPI networks. Presently, the co-fractionation-MS procedure takes 1.5-3.5 d of proteomic sample preparation, 31 d of MS data acquisition and 8.5 d of data analyses to produce meaningful biological insights.
Affiliation(s)
- Mara Zilocchi
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- Kirsten Broderick
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- Alla Gagarinova
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
  - Department of Biology, University of New Brunswick, Fredericton, New Brunswick, Canada
- Matthew Jessulat
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- Sadhna Phanse
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- Hiroyuki Aoki
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- Khaled A Aly
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- Mohan Babu
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
6
Hay BN, Akinlaja MO, Baker TC, Houfani AA, Stacey RG, Foster LJ. Integration of data-independent acquisition (DIA) with co-fractionation mass spectrometry (CF-MS) to enhance interactome mapping capabilities. Proteomics 2023; 23:e2200278. [PMID: 37144656 DOI: 10.1002/pmic.202200278] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/28/2022] [Revised: 04/03/2023] [Accepted: 04/14/2023] [Indexed: 05/06/2023]
Abstract
Proteomics technologies are continually advancing, providing opportunities to develop stronger and more robust protein interaction networks (PINs). In part, this is due to the ever-growing number of high-throughput proteomics methods that are available. This review discusses how data-independent acquisition (DIA) and co-fractionation mass spectrometry (CF-MS) can be integrated to enhance interactome mapping abilities. Integrating these two techniques can improve data quality and network generation through extended protein coverage, less missing data, and reduced noise. CF-DIA-MS shows promise in expanding our knowledge of interactomes, notably for non-model organisms (NMOs). CF-MS is a valuable technique on its own, but upon the integration of DIA, the potential to develop robust PINs increases, offering a unique approach for researchers to gain an in-depth understanding of the dynamics of numerous biological processes.
Affiliation(s)
- Brenna N Hay
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Mopelola O Akinlaja
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Teesha C Baker
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Aicha Asma Houfani
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- R Greg Stacey
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Leonard J Foster
  - Michael Smith Laboratories and Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC, Canada
7
Chen YH, Chao KH, Wong JY, Liu CF, Leu JY, Tsai HK. A feature extraction free approach for protein interactome inference from co-elution data. Brief Bioinform 2023; 24:bbad229. [PMID: 37328692 DOI: 10.1093/bib/bbad229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Revised: 05/01/2023] [Accepted: 05/29/2023] [Indexed: 06/18/2023] Open
Abstract
Protein complexes are key functional units in cellular processes. High-throughput techniques, such as co-fractionation coupled with mass spectrometry (CF-MS), have advanced protein complex studies by enabling global interactome inference. However, defining true interactions from complex fractionation characteristics is not a simple task, since CF-MS is prone to false positives caused by the chance co-elution of non-interacting proteins. Several computational methods have been designed to analyze CF-MS data and construct probabilistic protein-protein interaction (PPI) networks. Current methods usually first infer PPIs based on handcrafted CF-MS features, and then use clustering algorithms to form potential protein complexes. While powerful, these methods have two weaknesses: handcrafted features based on domain knowledge might introduce bias, and the severely imbalanced PPI data make the models prone to overfitting. To address these issues, we present a balanced end-to-end learning architecture, Software for Prediction of Interactome with Feature-extraction Free Elution Data (SPIFFED), to integrate feature representation from raw CF-MS data and interactome prediction by convolutional neural network. SPIFFED outperforms the state-of-the-art methods in predicting PPIs under the conventional imbalanced training. When trained with balanced data, SPIFFED had greatly improved sensitivity for true PPIs. Moreover, the ensemble SPIFFED model provides different voting schemes to integrate predicted PPIs from multiple CF-MS data. Using the clustering software (i.e. ClusterONE), SPIFFED allows users to infer high-confidence protein complexes depending on the CF-MS experimental designs. The source code of SPIFFED is freely available at: https://github.com/bio-it-station/SPIFFED.
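To make the terms in this abstract concrete, here is a minimal sketch of one commonly used handcrafted co-elution feature (the Pearson correlation between two elution profiles) and of a simple majority vote across datasets. This is an illustration only and is not SPIFFED's implementation: the abstract does not specify which features prior methods use or which voting schemes the ensemble offers, so both functions and the example profiles are assumptions.

```python
from math import sqrt


def pearson(profile_a, profile_b):
    """Pearson correlation between two elution profiles of equal length."""
    n = len(profile_a)
    if n != len(profile_b) or n < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_a = sum(profile_a) / n
    mean_b = sum(profile_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(profile_a, profile_b))
    var_a = sum((a - mean_a) ** 2 for a in profile_a)
    var_b = sum((b - mean_b) ** 2 for b in profile_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat profile carries no co-elution signal
    return cov / sqrt(var_a * var_b)


def majority_vote(calls):
    """True if more than half of the per-dataset calls predict an interaction."""
    return 2 * sum(calls) > len(calls)


# Hypothetical profiles: two proteins peaking in the same fraction correlate
# strongly, whereas a protein peaking elsewhere does not.
co_eluting = pearson([0, 1, 5, 1, 0], [0, 2, 10, 2, 0])
not_co_eluting = pearson([0, 1, 5, 1, 0], [5, 1, 0, 0, 0])
```

A high correlation alone does not prove a physical interaction, since unrelated proteins can peak in the same fraction by chance; that is the false-positive problem the abstract describes.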
Affiliation(s)
- Yu-Hsin Chen
  - Bioinformatics Program, Taiwan International Graduate Program, National Taiwan University, Taipei 106, Taiwan
  - Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei 11529, Taiwan
  - Institute of Information Science, Academia Sinica, Taipei, 11529, Taiwan
- Kuan-Hao Chao
  - Institute of Information Science, Academia Sinica, Taipei, 11529, Taiwan
- Jin Yung Wong
  - Institute of Information Science, Academia Sinica, Taipei, 11529, Taiwan
- Chien-Fu Liu
  - Institute of Molecular Biology, Academia Sinica, Taipei, 11529, Taiwan
- Jun-Yi Leu
  - Institute of Molecular Biology, Academia Sinica, Taipei, 11529, Taiwan
- Huai-Kuang Tsai
  - Bioinformatics Program, Taiwan International Graduate Program, National Taiwan University, Taipei 106, Taiwan
  - Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei 11529, Taiwan
  - Institute of Information Science, Academia Sinica, Taipei, 11529, Taiwan
8
Pandey AK, Loscalzo J. Network medicine: an approach to complex kidney disease phenotypes. Nat Rev Nephrol 2023:10.1038/s41581-023-00705-0. [PMID: 37041415 DOI: 10.1038/s41581-023-00705-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 03/13/2023] [Indexed: 04/13/2023]
Abstract
Scientific reductionism has been the basis of disease classification and understanding for more than a century. However, the reductionist approach of characterizing diseases from a limited set of clinical observations and laboratory evaluations has proven insufficient in the face of an exponential growth in data generated from transcriptomics, proteomics, metabolomics and deep phenotyping. A new systematic method is necessary to organize these datasets and build new definitions of what constitutes a disease that incorporates both biological and environmental factors to more precisely describe the ever-growing complexity of phenotypes and their underlying molecular determinants. Network medicine provides such a conceptual framework to bridge these vast quantities of data while providing an individualized understanding of disease. The modern application of network medicine principles is yielding new insights into the pathobiology of chronic kidney diseases and renovascular disorders by expanding the understanding of pathogenic mediators, novel biomarkers and new options for renal therapeutics. These efforts affirm network medicine as a robust paradigm for elucidating new advances in the diagnosis and treatment of kidney disorders.
Affiliation(s)
- Arvind K Pandey
  - Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital, and Harvard Medical School, Boston, MA, USA
- Joseph Loscalzo
  - Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital, and Harvard Medical School, Boston, MA, USA
9
Yang M, Harrison BR, Promislow DEL. In search of a Drosophila core cellular network with single-cell transcriptome data. G3 Genes|Genomes|Genetics 2022; 12:6670625. [PMID: 35976114 PMCID: PMC9526075 DOI: 10.1093/g3journal/jkac212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2022] [Accepted: 08/03/2022] [Indexed: 11/29/2022]
Abstract
Along with specialized functions, cells of multicellular organisms also perform essential functions common to most if not all cells. Whether diverse cells do this by using the same set of genes, interacting in a fixed coordinated fashion to execute essential functions, or a subset of genes specific to certain cells, remains a central question in biology. Here, we focus on gene coexpression to search for a core cellular network across a whole organism. Single-cell RNA-sequencing measures gene expression of individual cells, enabling researchers to discover gene expression patterns that contribute to the diversity of cell functions. Current efforts to study cellular functions focus primarily on identifying differentially expressed genes across cells. However, patterns of coexpression between genes are probably more indicative of biological processes than is the expression of individual genes. We constructed cell-type-specific gene coexpression networks using single-cell transcriptome datasets covering diverse cell types from the fruit fly, Drosophila melanogaster. We detected a set of highly coordinated genes preserved across cell types and present this as the best estimate of a core cellular network. This core is very small compared with cell-type-specific gene coexpression networks and shows dense connectivity. Gene members of this core tend to be ancient genes and are enriched for those encoding ribosomal proteins. Overall, we find evidence for a core cellular network in diverse cell types of the fruit fly. The topological, structural, functional, and evolutionary properties of this core indicate that it accounts for only a minority of essential functions.
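The notion of a core preserved across cell-type-specific networks can be sketched as a set operation over edge lists. The abstract does not describe how the authors define preservation, so the rule below (keep an edge if it appears in at least a given fraction of the cell-type networks) is an assumption, and the function name `core_edges`, the cell-type labels, and the gene labels are hypothetical.

```python
from collections import Counter


def core_edges(networks, min_fraction=1.0):
    """Return the edges shared by at least `min_fraction` of the networks.

    networks: dict mapping cell type -> iterable of (gene, gene) pairs.
    Edges are treated as undirected, so ("g1", "g2") equals ("g2", "g1").
    """
    if not networks:
        return set()
    counts = Counter()
    for edges in networks.values():
        # Deduplicate within a network so each cell type votes once per edge.
        counts.update({frozenset(edge) for edge in edges})
    needed = min_fraction * len(networks)
    return {edge for edge, seen in counts.items() if seen >= needed}


# Hypothetical coexpression networks for three cell types.
example_networks = {
    "type_a": [("g1", "g2"), ("g3", "g4")],
    "type_b": [("g2", "g1"), ("g5", "g6")],
    "type_c": [("g1", "g2"), ("g3", "g4")],
}
strict_core = core_edges(example_networks)       # edges present in every network
relaxed_core = core_edges(example_networks, 0.6)  # edges present in most networks
```

Under the strict rule only the `g1`-`g2` edge survives, which mirrors the abstract's observation that the core is very small relative to any single cell-type network.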
Affiliation(s)
- Ming Yang
  - Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA 98195, USA
- Benjamin R Harrison
  - Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA 98195, USA
- Daniel E L Promislow
  - Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA 98195, USA
  - Department of Biology, University of Washington, Seattle, WA 98195, USA
10
Guo MG, Sosa DN, Altman RB. Challenges and opportunities in network-based solutions for biological questions. Brief Bioinform 2021; 23:6438103. [PMID: 34849568 PMCID: PMC8769687 DOI: 10.1093/bib/bbab437] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/23/2021] [Revised: 09/09/2021] [Accepted: 09/22/2021] [Indexed: 11/28/2022] Open
Abstract
Network biology is useful for modeling complex biological phenomena; it has attracted attention with the advent of novel graph-based machine learning methods. However, biological applications of network methods often suffer from inadequate follow-up. In this perspective, we discuss obstacles for contemporary network approaches—particularly focusing on challenges representing biological concepts, applying machine learning methods, and interpreting and validating computational findings about biology—in an effort to catalyze actionable biological discovery.
Affiliation(s)
- Margaret G Guo
  - Stanford Program in Biomedical Informatics, Stanford University, Stanford, CA, USA
  - Program in Epithelial Biology, Stanford University, Stanford, CA, USA
- Daniel N Sosa
  - Stanford Program in Biomedical Informatics, Stanford University, Stanford, CA, USA
- Russ B Altman
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
  - Department of Genetics, Stanford University, Stanford, CA, USA
11
Demirel HC, Arici MK, Tuncbag N. Computational approaches leveraging integrated connections of multi-omic data toward clinical applications. Mol Omics 2021; 18:7-18. [PMID: 34734935 DOI: 10.1039/d1mo00158b] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/18/2023]
Abstract
In line with the advances in high-throughput technologies, multiple omic datasets have accumulated to study biological systems and diseases coherently. No single omics data type is capable of fully representing cellular activity. The complexity of the biological processes arises from the interactions between omic entities such as genes, proteins, and metabolites. Therefore, multi-omic data integration is crucial but challenging. The impact of the molecular alterations in multi-omic data is not local in the neighborhood of the altered gene or protein; rather, the impact diffuses in the network and changes the functionality of multiple signaling pathways and regulation of the gene expression. Additionally, multi-omic data is high-dimensional and has background noise. Several integrative approaches have been developed to accurately interpret the multi-omic datasets, including machine learning, network-based methods, and their combination. In this review, we overview the most recent integrative approaches and tools with a focus on network-based methods. We then discuss these approaches according to their specific applications, from disease-network and biomarker identification to patient stratification, drug discovery, and repurposing.
Affiliation(s)
- Habibe Cansu Demirel
  - Graduate School of Informatics, Middle East Technical University, Ankara, 06800, Turkey
- Muslum Kaan Arici
  - Graduate School of Informatics, Middle East Technical University, Ankara, 06800, Turkey
  - Foot and Mouth Diseases Institute, Ministry of Agriculture and Forestry, Ankara, 06044, Turkey
- Nurcan Tuncbag
  - Chemical and Biological Engineering, College of Engineering, Koc University, Istanbul, 34450, Turkey
  - School of Medicine, Koc University, Istanbul, 34450, Turkey
  - Koc University Research Center for Translational Medicine (KUTTAM), Istanbul, Turkey
12
Arici MK, Tuncbag N. Performance Assessment of the Network Reconstruction Approaches on Various Interactomes. Front Mol Biosci 2021; 8:666705. [PMID: 34676243 PMCID: PMC8523993 DOI: 10.3389/fmolb.2021.666705] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/10/2021] [Accepted: 07/14/2021] [Indexed: 01/04/2023] Open
Abstract
Beyond the list of molecules, there is a necessity to collectively consider multiple sets of omic data and to reconstruct the connections between the molecules. Pathway reconstruction in particular is crucial to understanding disease biology because abnormal cellular signaling may be pathological. The main challenge is how to integrate the data accurately. In this study, we aim to comparatively analyze the performance of a set of network reconstruction algorithms on multiple reference interactomes. We first explored several human protein interactomes, including PathwayCommons, OmniPath, HIPPIE, iRefWeb, STRING, and ConsensusPathDB. The comparison is based on the coverage of each interactome in terms of cancer driver proteins, structural information of protein interactions, and the bias toward well-studied proteins. We next used these interactomes to evaluate the performance of network reconstruction algorithms including all-pair shortest path, heat diffusion with flux, personalized PageRank with flux, and prize-collecting Steiner forest (PCSF) approaches. Each approach has its own merits and weaknesses. Among them, PCSF had the most balanced performance in terms of precision and recall scores when 28 pathways from NetPath were reconstructed using the listed algorithms. Additionally, the reference interactome affects the performance of the network reconstruction approaches. The coverage and disease- or tissue-specificity of each interactome may vary, which may result in differences in the reconstructed networks.
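The precision and recall scores mentioned in this abstract can be illustrated with a minimal sketch that compares a reconstructed pathway against a reference pathway at the edge level. Whether the authors scored nodes, edges, or both is not stated in the abstract, so edge-level scoring on undirected edges is an assumption here, and the function name `precision_recall` and the example edges are hypothetical.

```python
def precision_recall(predicted_edges, reference_edges):
    """Edge-level precision and recall of a reconstructed pathway.

    Both arguments are iterables of (node, node) pairs; edges are treated as
    undirected, so ("A", "B") and ("B", "A") count as the same edge.
    """
    predicted = {frozenset(edge) for edge in predicted_edges}
    reference = {frozenset(edge) for edge in reference_edges}
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical example: 2 of 3 predicted edges are in the reference pathway,
# and 2 of the 4 reference edges were recovered.
precision, recall = precision_recall(
    [("A", "B"), ("B", "C"), ("C", "D")],
    [("B", "A"), ("C", "D"), ("D", "E"), ("E", "F")],
)
```

An algorithm that returns very large networks tends to gain recall at the cost of precision, and a very conservative one does the opposite, which is why a balanced score across both is a meaningful way to compare reconstruction methods.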
Affiliation(s)
- M Kaan Arici
  - Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
  - Foot and Mouth Diseases Institute, Ministry of Agriculture and Forestry, Ankara, Turkey
- Nurcan Tuncbag
  - Chemical and Biological Engineering, College of Engineering, Koc University, Istanbul, Turkey
  - School of Medicine, Koc University, Istanbul, Turkey
13
Skinnider MA, Scott NE, Prudova A, Kerr CH, Stoynov N, Stacey RG, Chan QWT, Rattray D, Gsponer J, Foster LJ. An atlas of protein-protein interactions across mouse tissues. Cell 2021; 184:4073-4089.e17. [PMID: 34214469 DOI: 10.1016/j.cell.2021.06.003] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Received: 01/04/2021] [Revised: 04/05/2021] [Accepted: 06/01/2021] [Indexed: 12/20/2022]
Abstract
Cellular processes arise from the dynamic organization of proteins in networks of physical interactions. Mapping the interactome has therefore been a central objective of high-throughput biology. However, the dynamics of protein interactions across physiological contexts remain poorly understood. Here, we develop a quantitative proteomic approach combining protein correlation profiling with stable isotope labeling of mammals (PCP-SILAM) to map the interactomes of seven mouse tissues. The resulting maps provide a proteome-scale survey of interactome rewiring across mammalian tissues, revealing more than 125,000 unique interactions at a quality comparable to the highest-quality human screens. We identify systematic suppression of cross-talk between the evolutionarily ancient housekeeping interactome and younger, tissue-specific modules. Rewired proteins are tightly regulated by multiple cellular mechanisms and are implicated in disease. Our study opens up new avenues to uncover regulatory mechanisms that shape in vivo interactome responses to physiological and pathophysiological stimuli in mammalian systems.
Collapse
Affiliation(s)
- Michael A Skinnider
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Nichollas E Scott
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada; Peter Doherty Institute, Department of Microbiology and Immunology, The University of Melbourne, Melbourne, VIC 3000, Australia
| | - Anna Prudova
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Craig H Kerr
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada; Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
| | - Nikolay Stoynov
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - R Greg Stacey
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Queenie W T Chan
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - David Rattray
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada; Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
- Jörg Gsponer
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada; Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC V6T 1Z3, Canada.
- Leonard J Foster
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada; Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC V6T 1Z3, Canada.
|
14
|
Dandage R, Berger CM, Gagnon-Arsenault I, Moon KM, Stacey RG, Foster LJ, Landry CR. Frequent Assembly of Chimeric Complexes in the Protein Interaction Network of an Interspecies Yeast Hybrid. Mol Biol Evol 2021; 38:1384-1401. [PMID: 33252673 PMCID: PMC8042767 DOI: 10.1093/molbev/msaa298] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/17/2022]
Abstract
Hybrids between species often show extreme phenotypes, including some that take place at the molecular level. In this study, we investigated the phenotypes of an interspecies diploid hybrid in terms of protein–protein interactions inferred from protein correlation profiling. We used two yeast species, Saccharomyces cerevisiae and Saccharomyces uvarum, which are interfertile yet have proteins that have diverged enough to be differentiated using mass spectrometry. Most of the protein–protein interactions are similar between hybrid and parents, and are consistent with the assembly of chimeric complexes, which we validated using an orthogonal approach for the prefoldin complex. We also identified instances of altered protein–protein interactions in the hybrid, for instance, in complexes related to proteostasis and in mitochondrial protein complexes. Overall, this study uncovers the likely frequent occurrence of chimeric protein complexes with few exceptions, which may result from incompatibilities or imbalances between the parental proteomes.
Affiliation(s)
- Rohan Dandage
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté des Sciences et de Génie, Université Laval, Québec, QC, Canada; PROTEO, Le Réseau Québécois de Recherche sur la Fonction, la Structure et L'ingénierie des Protéines, Université Laval, Québec, QC, Canada; Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada; Département de Biologie, Faculté des Sciences et de Génie, Université Laval, Québec, QC, Canada
- Caroline M Berger
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté des Sciences et de Génie, Université Laval, Québec, QC, Canada; PROTEO, Le Réseau Québécois de Recherche sur la Fonction, la Structure et L'ingénierie des Protéines, Université Laval, Québec, QC, Canada; Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada; Département de Biologie, Faculté des Sciences et de Génie, Université Laval, Québec, QC, Canada
- Isabelle Gagnon-Arsenault
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté des Sciences et de Génie, Université Laval, Québec, QC, Canada; PROTEO, Le Réseau Québécois de Recherche sur la Fonction, la Structure et L'ingénierie des Protéines, Université Laval, Québec, QC, Canada; Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada; Département de Biologie, Faculté des Sciences et de Génie, Université Laval, Québec, QC, Canada
- Kyung-Mee Moon
- Department of Biochemistry & Molecular Biology, and Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- Richard Greg Stacey
- Department of Biochemistry & Molecular Biology, and Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- Leonard J Foster
- Department of Biochemistry & Molecular Biology, and Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- Christian R Landry
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté des Sciences et de Génie, Université Laval, Québec, QC, Canada; PROTEO, Le Réseau Québécois de Recherche sur la Fonction, la Structure et L'ingénierie des Protéines, Université Laval, Québec, QC, Canada; Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada; Département de Biologie, Faculté des Sciences et de Génie, Université Laval, Québec, QC, Canada
|
15
|
Skinnider MA, Foster LJ. Meta-analysis defines principles for the design and analysis of co-fractionation mass spectrometry experiments. Nat Methods 2021; 18:806-815. [PMID: 34211188 DOI: 10.1038/s41592-021-01194-4] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 11/20/2020] [Accepted: 05/20/2021] [Indexed: 02/06/2023]
Abstract
Co-fractionation mass spectrometry (CF-MS) has emerged as a powerful technique for interactome mapping. However, there is little consensus on optimal strategies for the design of CF-MS experiments or their computational analysis. Here, we reanalyzed a total of 206 CF-MS experiments to generate a uniformly processed resource containing over 11 million measurements of protein abundance. We used this resource to benchmark experimental designs for CF-MS studies and systematically optimize computational approaches to network inference. We then applied this optimized methodology to reconstruct a draft-quality human interactome by CF-MS and predict over 700,000 protein-protein interactions across 27 eukaryotic species or clades. Our work defines new resources to illuminate proteome organization over evolutionary timescales and establishes best practices for the design and analysis of CF-MS studies.
Affiliation(s)
- Michael A Skinnider
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Leonard J Foster
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada; Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, British Columbia, Canada
|
16
|
Liu Z, Ma A, Mathé E, Merling M, Ma Q, Liu B. Network analyses in microbiome based on high-throughput multi-omics data. Brief Bioinform 2021; 22:1639-1655. [PMID: 32047891 PMCID: PMC7986608 DOI: 10.1093/bib/bbaa005] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Received: 10/18/2019] [Revised: 01/07/2020] [Accepted: 01/08/2020] [Indexed: 02/06/2023]
Abstract
Together with various hosts and environments, ubiquitous microbes interact closely with each other, forming an intertwined system or community. Of interest, shifts in the relationships between microbes and their hosts or environments are associated with critical diseases and ecological changes. While advances in high-throughput omics technologies offer a great opportunity for understanding the structures and functions of the microbiome, it is still challenging to analyse and interpret the omics data. Specifically, the heterogeneity and diversity of microbial communities, compounded with the large size of the datasets, impose a tremendous challenge to mechanistically elucidate the complex communities. Fortunately, network analyses provide an efficient way to tackle this problem, and several network approaches have recently been proposed to improve this understanding. Here, we systematically illustrate the network theories that have been used in biological and biomedical research. Then, we review existing network modelling methods of microbial studies at multiple layers from metagenomics to metabolomics and further to multi-omics. Lastly, we discuss the limitations of present studies and provide a perspective for further directions in support of the understanding of microbial communities.
Affiliation(s)
- Zhaoqian Liu
- Department of Biomedical Informatics, College of Medicine, the Ohio State University, Columbus, OH 43210, USA
- Anjun Ma
- Department of Biomedical Informatics, College of Medicine, the Ohio State University, Columbus, OH 43210, USA
- Ewy Mathé
- Department of Biomedical Informatics, College of Medicine, the Ohio State University, Columbus, OH 43210, USA
- Marlena Merling
- Department of Biomedical Informatics, College of Medicine, the Ohio State University, Columbus, OH 43210, USA
- Qin Ma
- Department of Biomedical Informatics, College of Medicine, the Ohio State University, Columbus, OH 43210, USA
- Bingqiang Liu
- Department of Biomedical Informatics, College of Medicine, the Ohio State University, Columbus, OH 43210, USA
|
17
|
Skinnider MA, Cai C, Stacey RG, Foster LJ. PrInCE: an R/bioconductor package for protein-protein interaction network inference from co-fractionation mass spectrometry data. Bioinformatics 2021; 37:2775-2777. [PMID: 33471077 DOI: 10.1093/bioinformatics/btab022] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/02/2020] [Revised: 01/03/2021] [Accepted: 01/08/2021] [Indexed: 11/12/2022]
Abstract
SUMMARY: We present PrInCE, an R/Bioconductor package that employs a machine-learning approach to infer protein-protein interaction networks from co-fractionation mass spectrometry (CF-MS) data. Previously distributed as a collection of Matlab scripts, our ground-up rewrite of this software package in an open-source language dramatically improves runtime and memory requirements. We describe several new features in the R implementation, including a test for the detection of co-eluting protein complexes and a method for differential network analysis. PrInCE is extensively documented and fully compatible with Bioconductor classes, ensuring it can fit seamlessly into existing proteomics workflows. AVAILABILITY AND IMPLEMENTATION: PrInCE is available from Bioconductor (https://www.bioconductor.org/packages/devel/bioc/html/PrInCE.html). Source code is freely available from GitHub under the MIT license (https://github.com/fosterlab/PrInCE). Support is provided via the GitHub issues tracker (https://github.com/fosterlab/PrInCE/issues). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Michael A Skinnider
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Charley Cai
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- R Greg Stacey
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Leonard J Foster
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada; Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, British Columbia, Canada
|
18
|
Slater O, Miller B, Kontoyianni M. Decoding Protein-protein Interactions: An Overview. Curr Top Med Chem 2021; 20:855-882. [PMID: 32101126 DOI: 10.2174/1568026620666200226105312] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 04/13/2019] [Revised: 11/27/2019] [Accepted: 11/27/2019] [Indexed: 12/24/2022]
Abstract
Drug discovery has focused on the paradigm "one drug, one target" for a long time. However, small molecules can act at multiple macromolecular targets, which serves as the basis for drug repurposing. In an effort to expand the target space, and given advances in X-ray crystallography, protein-protein interactions have become an emerging focus area of drug discovery enterprises. Proteins interact with other biomolecules and it is this intricate network of interactions that determines the behavior of the system and its biological processes. In this review, we briefly discuss networks in disease, followed by computational methods for protein-protein complex prediction. Computational methodologies and techniques employed towards objectives such as protein-protein docking, protein-protein interactions, and interface predictions are described extensively. Docking aims at producing a complex between proteins, while interface predictions identify a subset of residues on one protein that could interact with a partner, and protein-protein interaction sites address whether two proteins interact. In addition, approaches to predict hot spots and binding sites are presented along with a representative example of our internal project on the chemokine CXC receptor 3 B-isoform and predictive modeling with IP10 and PF4.
Affiliation(s)
- Olivia Slater
- Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States
- Bethany Miller
- Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States
- Maria Kontoyianni
- Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States
|
19
|
Federico A, Monti S. Contextualized Protein-Protein Interactions. Patterns 2021; 2:100153. [PMID: 33511361 PMCID: PMC7815950 DOI: 10.1016/j.patter.2020.100153] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 08/06/2020] [Revised: 10/20/2020] [Accepted: 10/28/2020] [Indexed: 12/29/2022]
Abstract
Protein-protein interaction (PPI) databases are an important bioinformatics resource, yet existing literature-curated databases usually represent cell-type-agnostic interactions, which is at variance with our understanding that protein dynamics are context specific and highly dependent on their environment. Here, we provide a resource derived through data mining to infer disease- and tissue-relevant interactions by annotating existing PPI databases with cell-contextual information extracted from reporting studies. This resource is applicable to the reconstruction and analysis of disease-centric molecular interaction networks. We have made the data and method publicly available and plan to release scheduled updates in the future. We expect these resources to be of interest to a wide audience of researchers in the life sciences.
We present:
- PPI Context: contextualization of existing literature-curated PPIs
- A resource for filtering PPIs by cell-line information mined from reporting studies
- A fast and flexible pipeline implementing the presented data mining method
Existing literature-curated protein-protein interaction (PPI) databases usually aggregate cell-type-agnostic interactions, yet PPIs are dependent on environmental conditions. Thus, new methods and resources for inferring the context in which a PPI is reported will extend their application and use in disease-centric modeling. We expect the resource presented in this article to be of high interest to those querying known interactions of proteins of interest, reconstruction and analyses of molecular interaction networks, and multi-omics data integration approaches.
Affiliation(s)
- Anthony Federico
- Section of Computational Biomedicine, Boston University School of Medicine, Boston, MA 02118, USA; Bioinformatics Program, Boston University, Boston, MA 02215, USA
- Stefano Monti
- Section of Computational Biomedicine, Boston University School of Medicine, Boston, MA 02118, USA; Bioinformatics Program, Boston University, Boston, MA 02215, USA
|
20
|
Stacey RG, Skinnider MA, Foster LJ. On the Robustness of Graph-Based Clustering to Random Network Alterations. Mol Cell Proteomics 2020; 20:100002. [PMID: 33592499 PMCID: PMC7896145 DOI: 10.1074/mcp.ra120.002275] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/04/2020] [Revised: 10/30/2020] [Accepted: 11/04/2020] [Indexed: 11/23/2022]
Abstract
Biological functions emerge from complex and dynamic networks of protein-protein interactions. Because these protein-protein interaction networks, or interactomes, represent pairwise connections within a hierarchically organized system, it is often useful to identify higher-order associations embedded within them, such as multimember protein complexes. Graph-based clustering techniques are widely used to accomplish this goal, and dozens of field-specific and general clustering algorithms exist. However, interactomes can be prone to errors, especially when inferred from high-throughput biochemical assays. Therefore, robustness to network-level noise is an important criterion. Here, we tested the robustness of a range of graph-based clustering algorithms in the presence of noise, including algorithms common across domains and those specific to protein networks. Strikingly, we found that all of the clustering algorithms tested here markedly amplified network-level noise. Randomly rewiring only 1% of network edges yielded more than a 50% change in clustering results. Moreover, we found the impact of network noise on individual clusters was not uniform: some clusters were consistently robust to injected noise, whereas others were not. Therefore we developed the clust.perturb R package and Shiny web application to measure the reproducibility of clusters by randomly perturbing the network. We show that clust.perturb results are predictive of real-world cluster stability: poorly reproducible clusters as identified by clust.perturb are significantly less likely to be reclustered across experiments. We conclude that graph-based clustering amplifies noise in protein interaction networks, but quantifying the robustness of a cluster to network noise can separate stable protein complexes from spurious associations.
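The perturbation idea described in this abstract (randomly rewire a small fraction of edges, recluster, and compare) can be sketched in a few lines of Python. This is a toy illustration only, not the clust.perturb implementation: connected components stand in for a real clustering algorithm, and the names `rewire`, `components`, and `best_jaccard` as well as the example network are assumptions made for this sketch.

```python
import random

def rewire(edges, nodes, fraction, seed=0):
    """Replace a random `fraction` of edges (at least one) with new random node pairs."""
    rng = random.Random(seed)
    edges = {frozenset(e) for e in edges}
    n_swap = max(1, round(fraction * len(edges)))
    removed = rng.sample(sorted(edges, key=sorted), n_swap)
    kept = edges - set(removed)
    while len(kept) < len(edges):
        candidate = frozenset(rng.sample(nodes, 2))
        if candidate not in edges:  # only add pairs absent from the original network
            kept.add(candidate)
    return kept

def components(edges):
    """Connected components, used here as a stand-in for a clustering algorithm."""
    adj = {}
    for e in edges:
        a, b = tuple(e)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, clusters = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        clusters.append(frozenset(comp))
    return clusters

def best_jaccard(cluster, others):
    """Highest Jaccard overlap between one cluster and any cluster in `others`."""
    return max(len(cluster & o) / len(cluster | o) for o in others)

# Hypothetical toy network: two triangles and one isolated pair.
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F"), ("G", "H")]
nodes = sorted({n for e in edges for n in e})

original = components({frozenset(e) for e in edges})
noisy = rewire(edges, nodes, fraction=0.01)
perturbed = components(noisy)

# Reproducibility score per original cluster after one perturbation.
scores = {c: best_jaccard(c, perturbed) for c in original}
```

In practice the perturbation would be repeated many times and the scores averaged, so that each cluster receives a stability estimate instead of a single draw.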
Affiliation(s)
- R Greg Stacey
- Michael Smith Laboratories, University of British Columbia, Vancouver, Canada.
- Michael A Skinnider
- Michael Smith Laboratories, University of British Columbia, Vancouver, Canada
- Leonard J Foster
- Michael Smith Laboratories, University of British Columbia, Vancouver, Canada; Department of Biochemistry, University of British Columbia, Vancouver, Canada
|
21
|
Pang CNI, Ballouz S, Weissberger D, Thibaut LM, Hamey JJ, Gillis J, Wilkins MR, Hart-Smith G. Analytical Guidelines for co-fractionation Mass Spectrometry Obtained through Global Profiling of Gold Standard Saccharomyces cerevisiae Protein Complexes. Mol Cell Proteomics 2020; 19:1876-1895. [PMID: 32817346 PMCID: PMC7664123 DOI: 10.1074/mcp.ra120.002154] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 06/02/2020] [Revised: 07/14/2020] [Indexed: 11/06/2022]
Abstract
Co-fractionation MS (CF-MS) is a technique with potential to characterize endogenous and unmanipulated protein complexes on an unprecedented scale. However, this potential has been offset by a lack of guidelines for best-practice CF-MS data collection and analysis. To obtain such guidelines, this study thoroughly evaluates novel and published Saccharomyces cerevisiae CF-MS data sets using very high proteome coverage libraries of yeast gold standard complexes. A new method for identifying gold standard complexes in CF-MS data, Reference Complex Profiling, and the Extending 'Guilt-by-Association' by Degree (EGAD) R package are used for these evaluations, which are verified with concurrent analyses of published human data. By evaluating data collection designs, which involve fractionation of cell lysates, it is found that near-maximum recall of complexes can be achieved with fewer samples than in published studies. Distributing sample collection across orthogonal fractionation methods, rather than a single high resolution data set, leads to particularly efficient recall. By evaluating 17 different similarity scoring metrics, which are central to CF-MS data analysis, it is found that two metrics rarely used in past CF-MS studies (Spearman and Kendall correlations) and the recently introduced Co-apex metric frequently maximize recall, whereas a popular metric, Euclidean distance, delivers poor recall. The common practice of integrating external genomic data into CF-MS data analysis is also evaluated, revealing that this practice may improve the precision and recall of known complexes but is generally unsuitable for predicting novel complexes in model organisms. If studying nonmodel organisms using orthologous genomic data, it is found that particular subsets of fractionation profiles (e.g., the lowest abundance quartile) should be excluded to minimize false discovery. These assessments are summarized in a series of universally applicable guidelines for precise, sensitive and efficient CF-MS studies of known complexes, and effective predictions of novel complexes for orthogonal experimental validation.
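To make the notion of a similarity scoring metric concrete, the sketch below compares Spearman correlation with Euclidean distance on three made-up elution profiles. It is a toy example under stated assumptions, not the evaluation performed in the study: the profiles are invented, the helper names are hypothetical, and the contrast it shows (rank correlation ignores abundance scale, Euclidean distance does not) is only one possible intuition for why the two metrics can rank protein pairs differently.

```python
from math import sqrt

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def euclidean(x, y):
    """Euclidean distance between two profiles (smaller means more similar)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Invented intensity profiles across eight fractions.
a = [0, 1, 5, 20, 5, 1, 0, 0]       # peak in fraction 4
b = [0, 10, 50, 200, 50, 10, 0, 0]  # same peak as `a`, ten times more abundant
c = [0, 0, 0, 1, 5, 20, 5, 1]       # abundance like `a`, peak two fractions later

print("spearman  a-b:", spearman(a, b), " a-c:", spearman(a, c))
print("euclidean a-b:", euclidean(a, b), " a-c:", euclidean(a, c))
```

Here the co-eluting pair (a, b) has the higher Spearman correlation, while Euclidean distance places `a` closer to the non-co-eluting profile `c` simply because their abundances are similar.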
Affiliation(s)
- Chi Nam Ignatius Pang
- Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Sara Ballouz
- Garvan Institute of Medical Research, Darlinghurst, Sydney, New South Wales, Australia
- Daniel Weissberger
- School of Chemistry, University of New South Wales, Sydney, New South Wales, Australia
- Loïc M Thibaut
- School of Mathematics and Statistics, University of New South Wales, Sydney, New South Wales, Australia
- Joshua J Hamey
- Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Jesse Gillis
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Woodbury, New York, USA
- Marc R Wilkins
- Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Gene Hart-Smith
- Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia; Department of Molecular Sciences, Macquarie University, Sydney, New South Wales, Australia.
|
22
|
Mazandu GK, Hooper C, Opap K, Makinde F, Nembaware V, Thomford NE, Chimusa ER, Wonkam A, Mulder NJ. IHP-PING: generating integrated human protein-protein interaction networks on-the-fly. Brief Bioinform 2020; 22:5943797. [PMID: 33129201 DOI: 10.1093/bib/bbaa277] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/22/2020] [Revised: 09/12/2020] [Accepted: 09/21/2020] [Indexed: 01/04/2023]
Abstract
Advances in high-throughput sequencing technologies have resulted in an exponential growth of publicly accessible biological datasets. In the 'big data' driven 'post-genomic' context, much work is being done to explore human protein-protein interactions (PPIs) for a systems-level analysis to uncover useful signals, gain more insights to advance current knowledge, and answer specific biological and health questions. These PPIs are experimentally determined or computationally predicted and stored in different online databases, and some PPI resources are updated regularly. As with many biological datasets, such regular updates continuously render older PPI datasets potentially outdated. Moreover, while many of these interactions are shared between these online resources, each resource includes its own identified PPIs and none of these databases exhaustively contains all existing human PPI maps. In this context, it is essential to enable the integration of interaction datasets from different resources, to generate a PPI map with increased coverage and confidence. To allow researchers to produce integrated human PPI datasets in real time, we introduce the integrated human protein-protein interaction network generator (IHP-PING) tool. IHP-PING is a flexible Python package which generates a human PPI network from freely available online resources. This tool extracts and integrates heterogeneous PPI datasets to generate a unified PPI network, which is stored locally for further applications.
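The integration step described here, combining interaction lists from several resources into one network, can be illustrated with a minimal Python sketch. It does not use or reproduce the IHP-PING API; the function name `integrate`, the source names `db_one` and `db_two`, and the listed pairs are placeholders chosen for illustration. The sketch treats interactions as undirected and records which sources support each pair, so the number of supporting sources could serve as a crude confidence indicator.

```python
from collections import defaultdict

def integrate(sources):
    """Merge {source_name: [(a, b), ...]} into {pair: set of supporting sources}."""
    merged = defaultdict(set)
    for name, pairs in sources.items():
        for a, b in pairs:
            if a == b:
                continue  # self-interactions are skipped in this sketch
            # frozenset makes (a, b) and (b, a) the same undirected interaction
            merged[frozenset((a, b))].add(name)
    return dict(merged)

# Placeholder inputs standing in for two downloaded PPI resources.
sources = {
    "db_one": [("STAT1", "STAT2"), ("TP53", "MDM2")],
    "db_two": [("STAT2", "STAT1"), ("BRCA1", "BARD1")],
}

network = integrate(sources)
support = {pair: len(names) for pair, names in network.items()}

for pair, count in support.items():
    print(sorted(pair), "supported by", count, "source(s)")
```

A real pipeline would additionally need to map identifiers from each resource onto a common namespace before merging, which this sketch assumes has already been done.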
Affiliation(s)
- Gaston K Mazandu
- Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, CIDRI-Africa WT Centre, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory, 7925, South Africa; African Institute for Mathematical Sciences, 5-7 Melrose Road, Muizenberg, 7945, Cape Town, South Africa; Division of Human Genetics, Department of Pathology, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory, 7925, South Africa
- Christopher Hooper
- Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, CIDRI-Africa WT Centre, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory, 7925, South Africa
- Kenneth Opap
- Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, CIDRI-Africa WT Centre, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory, 7925, South Africa
- Funmilayo Makinde
- Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, CIDRI-Africa WT Centre, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory, 7925, South Africa; African Institute for Mathematical Sciences, 5-7 Melrose Road, Muizenberg, 7945, Cape Town, South Africa
- Victoria Nembaware
- Division of Human Genetics, Department of Pathology, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory, 7925, South Africa
- Nicholas E Thomford
- Division of Human Genetics, Department of Pathology, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory, 7925, South Africa; School of Medical Sciences, University of Cape Coast, PMB, Cape Coast, Ghana
- Emile R Chimusa
- Division of Human Genetics, Department of Pathology, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory, 7925, South Africa
- Ambroise Wonkam
- Division of Human Genetics, Department of Pathology, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory, 7925, South Africa
- Nicola J Mulder
- Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, CIDRI-Africa WT Centre, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory, 7925, South Africa
|
23
|
Kerr CH, Skinnider MA, Andrews DDT, Madero AM, Chan QWT, Stacey RG, Stoynov N, Jan E, Foster LJ. Dynamic rewiring of the human interactome by interferon signaling. Genome Biol 2020; 21:140. [PMID: 32539747 PMCID: PMC7294662 DOI: 10.1186/s13059-020-02050-y] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 08/30/2019] [Accepted: 05/20/2020] [Indexed: 12/12/2022]
Abstract
BACKGROUND: The type I interferon (IFN) response is an ancient pathway that protects cells against viral pathogens by inducing the transcription of hundreds of IFN-stimulated genes. Comprehensive catalogs of IFN-stimulated genes have been established across species and cell types by transcriptomic and biochemical approaches, but their antiviral mechanisms remain incompletely characterized. Here, we apply a combination of quantitative proteomic approaches to describe the effects of IFN signaling on the human proteome, and apply protein correlation profiling to map IFN-induced rearrangements in the human protein-protein interaction network. RESULTS: We identify >26,000 protein interactions in IFN-stimulated and unstimulated cells, many of which involve proteins associated with human disease and are observed exclusively within the IFN-stimulated network. Differential network analysis reveals interaction rewiring across a surprisingly broad spectrum of cellular pathways in the antiviral response. We identify IFN-dependent protein-protein interactions mediating novel regulatory mechanisms at the transcriptional and translational levels, with one such interaction modulating the transcriptional activity of STAT1. Moreover, we reveal IFN-dependent changes in ribosomal composition that act to buffer IFN-stimulated gene protein synthesis. CONCLUSIONS: Our map of the IFN interactome provides a global view of the complex cellular networks activated during the antiviral response, placing IFN-stimulated genes in a functional context, and serves as a framework to understand how these networks are dysregulated in autoimmune or inflammatory disease.
Affiliation(s)
- Craig H Kerr
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada
- Current Address: Department of Genetics, Stanford University, Stanford, CA, 94305, USA
- Michael A Skinnider
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Daniel D T Andrews
- Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada
- Angel M Madero
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Queenie W T Chan
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- R Greg Stacey
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Nikolay Stoynov
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Eric Jan
- Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada
- Leonard J Foster
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada.
- Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada.
|
24
|
Dwivedi SK, Tjärnberg A, Tegnér J, Gustafsson M. Deriving disease modules from the compressed transcriptional space embedded in a deep autoencoder. Nat Commun 2020; 11:856. [PMID: 32051402 PMCID: PMC7016183 DOI: 10.1038/s41467-020-14666-6] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 07/11/2019] [Accepted: 01/22/2020] [Indexed: 01/05/2023]
Abstract
Disease modules in molecular interaction maps have been useful for characterizing diseases. Yet the biological networks that commonly define such modules are incomplete and biased toward some well-studied disease genes. Here we ask whether disease-relevant modules of genes can be discovered without prior knowledge of a biological network, instead training a deep autoencoder from large transcriptional data. We hypothesize that modules could be discovered within the autoencoder representations. We find a statistically significant enrichment of genome-wide association studies (GWAS) relevant genes in the last layer, and to a successively lesser degree in the middle and first layers, respectively. In contrast, we find an opposite gradient where a modular protein-protein interaction signal is strongest in the first layer but then vanishes smoothly deeper in the network. We conclude that a data-driven discovery approach is sufficient to discover groups of disease-related genes.
Affiliation(s)
- Sanjiv K Dwivedi
  - Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden
- Andreas Tjärnberg
  - Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden
  - Department of Biology, Center For Genomics and Systems Biology, New York University, New York, NY, 10008, USA
  - Center for Developmental Genetics, Department of Biology, New York University, New York, NY, USA
- Jesper Tegnér
  - Biological and Environmental Sciences and Engineering Division, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
  - Unit of Computational Medicine, Department of Medicine, Solna, Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
  - Science for Life Laboratory, Solna, Sweden
- Mika Gustafsson
  - Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden

25
Konopka T, Smedley D. Incremental data integration for tracking genotype-disease associations. PLoS Comput Biol 2020; 16:e1007586. [PMID: 31986132 PMCID: PMC7004389 DOI: 10.1371/journal.pcbi.1007586] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/05/2019] [Revised: 02/06/2020] [Accepted: 12/03/2019] [Indexed: 12/30/2022] Open
Abstract
Functional annotation of genes remains a challenge in fundamental biology and is a limiting factor for translational medicine. Computational approaches have been developed to process heterogeneous data into meaningful metrics, but often do not address how findings might be updated when new evidence comes to light. To address this challenge, we describe requirements for a framework for incremental data integration and propose an implementation based on phenotype ontologies and Bayesian probability updates. We apply the framework to quantify similarities between gene annotations and disease profiles. Within this scope, we categorize human diseases according to how well they can be recapitulated by animal models and quantify similarities between human diseases and mouse models produced by the International Mouse Phenotyping Consortium. The flexibility of the approach allows us to incorporate negative phenotypic data to better prioritize candidate genes, and to stratify disease mapping using sex-dependent phenotypes. All our association scores can be updated and we exploit this feature to showcase integration with curated annotations from high-precision assays. Incremental integration is thus a suitable framework for tracking functional annotations and linking to complex human pathology.
Human diseases are often caused or influenced by genetic factors. The link between a particular gene and a specific disease is well-established in some cases. However, the roles of many genes are still unclear and many diseases do not have an understood genetic mechanism. Dissecting such interactions requires using a range of experimental approaches and assessing the results in a holistic manner. Computational methods already exist for comparing phenotypes observed in models and patients, and they work well when the phenotypes are detailed. In this work we argue that algorithms should also be able to report meaningful assessments based on preliminary data, and to update reports in a coherent manner when new information comes to light. These requirements lead to specific mathematical properties, which define incremental integration. We implement these requirements in a computational framework. We study the extent to which individual rare human diseases might be recapitulated by animal models. We compute gene-disease associations using data from public resources, including previously unused negative data. Altogether, these examples illustrate that the framework can use observations in model systems to track gene-disease associations in the human context.
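The Bayesian updating idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the prior, and the likelihood values are hypothetical, and each piece of evidence is assumed to be conditionally independent given the hypothesis. The sketch shows only why sequential updates yield the same score whatever order the evidence arrives in, and how a negative observation lowers the score.

```python
def bayes_update(prior, p_obs_if_true, p_obs_if_false, observed=True):
    """Return P(association | one new piece of evidence).

    prior          -- current probability that the gene-disease association holds
    p_obs_if_true  -- P(phenotype is observed | association holds)   (hypothetical value)
    p_obs_if_false -- P(phenotype is observed | no association)      (hypothetical value)
    observed       -- False encodes negative data (phenotype tested but absent)
    """
    if observed:
        like_true, like_false = p_obs_if_true, p_obs_if_false
    else:
        like_true, like_false = 1.0 - p_obs_if_true, 1.0 - p_obs_if_false
    numerator = prior * like_true
    return numerator / (numerator + (1.0 - prior) * like_false)


def integrate(prior, evidence):
    """Fold a sequence of (p_obs_if_true, p_obs_if_false, observed) items into one score."""
    score = prior
    for p_true, p_false, observed in evidence:
        score = bayes_update(score, p_true, p_false, observed)
    return score


# Illustrative evidence only: two matching phenotypes and one negative result.
evidence = [
    (0.8, 0.1, True),
    (0.6, 0.2, True),
    (0.7, 0.3, False),
]

score_forward = integrate(0.01, evidence)
score_reverse = integrate(0.01, list(reversed(evidence)))
```

Because each update multiplies the odds by a likelihood ratio, `score_forward` and `score_reverse` agree up to floating-point error, which is the property that lets a score be revised as new annotations arrive.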
Affiliation(s)
- Tomasz Konopka
  - William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
- Damian Smedley
  - William Harvey Research Institute, Queen Mary University of London, London, United Kingdom

26
Salas D, Stacey RG, Akinlaja M, Foster LJ. Next-generation Interactomics: Considerations for the Use of Co-elution to Measure Protein Interaction Networks. Mol Cell Proteomics 2020; 19:1-10. [PMID: 31792070 PMCID: PMC6944233 DOI: 10.1074/mcp.r119.001803] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 10/02/2019] [Revised: 11/26/2019] [Indexed: 12/26/2022] Open
Abstract
Understanding how proteins interact is crucial to understanding cellular processes. Among the available interactome mapping methods, co-elution stands out as a method that is simultaneous in nature and capable of identifying interactions between all the proteins detected in a sample. The general workflow in co-elution methods involves the mild extraction of protein complexes and their separation into several fractions, across which proteins bound together in the same complex will show similar co-elution profiles when analyzed appropriately. In this review we discuss the different separation, quantification and bioinformatic strategies used in co-elution studies, and the important considerations in designing these studies. The benefits of co-elution versus other methods make it a valuable starting point when asking questions that involve the perturbation of the interactome.
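The notion of "similar co-elution profiles" can be made concrete with a small sketch. This is an illustration under stated assumptions, not a method taken from the review: Pearson correlation is used as one possible similarity measure, and the protein names, intensities, and the 0.9 threshold are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length elution profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2 fractions)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-elution information
    return cov / sqrt(var_x * var_y)


def candidate_pairs(profiles, threshold=0.9):
    """Return (protein_a, protein_b, r) for pairs whose profiles correlate at or above threshold."""
    names = sorted(profiles)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(profiles[a], profiles[b])
            if r >= threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical intensities across six chromatographic fractions.
profiles = {
    "protA": [0, 1, 8, 3, 0, 0],
    "protB": [0, 2, 9, 4, 0, 0],
    "protC": [5, 1, 0, 0, 2, 7],
}

pairs = candidate_pairs(profiles)
```

Here `protA` and `protB` peak in the same fraction and are reported as a candidate pair, while `protC` elutes elsewhere and is not. Real pipelines typically combine several similarity features and score them against a reference set rather than applying a single fixed cutoff.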
Affiliation(s)
- Daniela Salas
  - Michael Smith Laboratories and Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada; Department of Chemistry, Simon Fraser University, Burnaby, BC, Canada
- R Greg Stacey
  - Michael Smith Laboratories and Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Mopelola Akinlaja
  - Michael Smith Laboratories and Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Leonard J Foster
  - Michael Smith Laboratories and Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada

27
Hu LZ, Goebels F, Tan JH, Wolf E, Kuzmanov U, Wan C, Phanse S, Xu C, Schertzberg M, Fraser AG, Bader GD, Emili A. EPIC: software toolkit for elution profile-based inference of protein complexes. Nat Methods 2019; 16:737-742. [PMID: 31308550 PMCID: PMC7995176 DOI: 10.1038/s41592-019-0461-4] [Citation(s) in RCA: 64] [Impact Index Per Article: 10.7] [Received: 02/16/2018] [Accepted: 05/15/2019] [Indexed: 11/08/2022]
Abstract
Protein complexes are key macromolecular machines of the cell, but their description remains incomplete. We and others previously reported an experimental strategy for global characterization of native protein assemblies based on chromatographic fractionation of biological extracts coupled to precision mass spectrometry analysis (chromatographic fractionation-mass spectrometry, CF-MS), but the resulting data are challenging to process and interpret. Here, we describe EPIC (elution profile-based inference of complexes), a software toolkit for automated scoring of large-scale CF-MS data to define high-confidence multi-component macromolecules from diverse biological specimens. As a case study, we used EPIC to map the global interactome of Caenorhabditis elegans, defining 612 putative worm protein complexes linked to diverse biological processes. These included novel subunits and assemblies unique to nematodes that we validated using orthogonal methods. The open source EPIC software is freely available as a Jupyter notebook packaged in a Docker container (https://hub.docker.com/r/baderlab/bio-epic/).
Affiliation(s)
- Lucas ZhongMing Hu
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Florian Goebels
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- June H Tan
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Eric Wolf
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Uros Kuzmanov
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Cuihong Wan
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
  - School of Life Science, Central China Normal University, Wuhan, China
- Sadhna Phanse
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Changjiang Xu
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Mike Schertzberg
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Andrew G Fraser
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Gary D Bader
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Andrew Emili
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
  - Departments of Biochemistry and Biology, Boston University, Boston, MA, USA

28
Carlson ML, Stacey RG, Young JW, Wason IS, Zhao Z, Rattray DG, Scott N, Kerr CH, Babu M, Foster LJ, Duong Van Hoa F. Profiling the Escherichia coli membrane protein interactome captured in Peptidisc libraries. eLife 2019; 8:46615. [PMID: 31364989 PMCID: PMC6697469 DOI: 10.7554/elife.46615] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 03/07/2019] [Accepted: 07/30/2019] [Indexed: 12/23/2022] Open
Abstract
Protein-correlation-profiling (PCP), in combination with quantitative proteomics, has emerged as a high-throughput method for the rapid identification of dynamic protein complexes in native conditions. While PCP has been successfully applied to soluble proteomes, characterization of the membrane interactome has lagged, partly due to the necessary use of detergents to maintain protein solubility. Here, we apply the peptidisc, a ‘one-size fits all’ membrane mimetic, for the capture of the Escherichia coli cell envelope proteome and its high-resolution fractionation in the absence of detergent. Analysis of the SILAC-labeled peptidisc library via PCP allows generation of over 4900 possible binary interactions out of >700,000 random associations. Using well-characterized membrane protein systems such as the SecY translocon, the Bam complex and the MetNI transporter, we demonstrate that our dataset is a useful resource for identifying transient and surprisingly novel protein interactions. For example, we discover a trans-periplasmic supercomplex comprising subunits of the Bam and Sec machineries, including membrane-bound chaperones YfgM and PpiD. We identify RcsF and OmpA as bona fide interactors of BamA, and we show that MetQ association with the ABC transporter MetNI depends on its N-terminal lipid anchor. We also discover NlpA as a novel interactor of the MetNI complex. Most of these interactions are largely undetected by standard detergent-based purification. Together, the peptidisc workflow applied to the proteomic field is emerging as a promising novel approach to characterize membrane protein interactions under native expression conditions and without genetic manipulation.
Affiliation(s)
- Michael Luke Carlson
  - Life Sciences Institute, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of British Columbia, Vancouver, Canada
- R Greg Stacey
  - Michael Smith Laboratory, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of British Columbia, Vancouver, Canada
- John William Young
  - Life Sciences Institute, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of British Columbia, Vancouver, Canada
- Irvinder Singh Wason
  - Life Sciences Institute, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of British Columbia, Vancouver, Canada
- Zhiyu Zhao
  - Life Sciences Institute, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of British Columbia, Vancouver, Canada
- David G Rattray
  - Michael Smith Laboratory, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of British Columbia, Vancouver, Canada
- Nichollas Scott
  - Michael Smith Laboratory, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of British Columbia, Vancouver, Canada
- Craig H Kerr
  - Michael Smith Laboratory, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of British Columbia, Vancouver, Canada
- Mohan Babu
  - Department of Biochemistry, Faculty of Science, University of Regina, Regina, Canada
- Leonard J Foster
  - Michael Smith Laboratory, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of British Columbia, Vancouver, Canada
- Franck Duong Van Hoa
  - Life Sciences Institute, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of British Columbia, Vancouver, Canada

29
Stacey RG, Skinnider MA, Chik JHL, Foster LJ. Context-specific interactions in literature-curated protein interaction databases. BMC Genomics 2018; 19:758. [PMID: 30340458 PMCID: PMC6194712 DOI: 10.1186/s12864-018-5139-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 07/05/2018] [Accepted: 10/03/2018] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND Databases of literature-curated protein-protein interactions (PPIs) are often used to interpret high-throughput interactome mapping studies and estimate error rates. These databases combine interactions across thousands of published studies and experimental techniques. Because the tendency for two proteins to interact depends on the local conditions, this heterogeneity of conditions means that only a subset of database PPIs are interacting during any given experiment. A typical use of these databases as gold standards in interactome mapping projects, however, assumes that PPIs included in the database are indeed interacting under the experimental conditions of the study. RESULTS Using raw data from 20 co-fractionation experiments and six published interactomes, we demonstrate that this assumption is often false, with up to 55% of purported gold standard interactions showing no evidence of interaction, on average. We identify a subset of CORUM database complexes that do show consistent evidence of interaction in co-fractionation studies, and we use this subset as gold standards to dramatically improve interactome mapping as judged by the number of predicted interactions at a given error rate. CONCLUSIONS We recommend using this CORUM subset as the gold standard set in future co-fractionation studies. More generally, we recommend using the subset of literature-curated PPIs that are specific to the experimental context whenever possible.
Affiliation(s)
- R. Greg Stacey
  - Michael Smith Laboratories, University of British Columbia, Vancouver, V6T 1Z4 Canada
- Michael A. Skinnider
  - Michael Smith Laboratories, University of British Columbia, Vancouver, V6T 1Z4 Canada
- Jenny H. L. Chik
  - Current Address: International Collaboration On Repair Discoveries (ICORD), Vancouver Coastal Health Research Institute and Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC Canada
- Leonard J. Foster
  - Michael Smith Laboratories, University of British Columbia, Vancouver, V6T 1Z4 Canada
  - Department of Biochemistry, University of British Columbia, Vancouver, V6T 1Z3 Canada