1. Samulevich ML, Carman LE, Aneskievich BJ. Investigating Protein-Protein Interactions of Autophagy-Involved TNIP1. Methods Mol Biol 2025; 2879:63-82. [PMID: 38441723 DOI: 10.1007/7651_2024_525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/19/2025]
Abstract
Myriad proteins are involved in the process of autophagy, which they participate in via their protein-protein interactions (PPI). Herein we outline a methodology for examining such interactions utilizing the case of intrinsically disordered protein (IDP) TNIP1 and its interaction with linear M1-linked polyubiquitin. This includes methods for recombinant production, purification, immuno-identification, and analysis of an IDP associated with autophagy, its ordered binding partner, and means of quantitatively analyzing their interaction.
Affiliation(s)
- Michael L Samulevich
- Graduate Program in Pharmacology & Toxicology, University of Connecticut, Storrs, CT, USA
- Liam E Carman
- Graduate Program in Pharmacology & Toxicology, University of Connecticut, Storrs, CT, USA
- Brian J Aneskievich
- Department of Pharmaceutical Sciences, School of Pharmacy, University of Connecticut, Storrs, CT, USA.
2. Wu S, Zhang S, Liu CM, Fernie AR, Yan S. Recent Advances in Mass Spectrometry-Based Protein Interactome Studies. Mol Cell Proteomics 2025; 24:100887. [PMID: 39608603 PMCID: PMC11745815 DOI: 10.1016/j.mcpro.2024.100887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2024] [Revised: 11/09/2024] [Accepted: 11/25/2024] [Indexed: 11/30/2024] Open
Abstract
The foundation of all biological processes is the network of diverse and dynamic protein interactions with other molecules in cells known as the interactome. Understanding the interactome is crucial for elucidating molecular mechanisms but has been a longstanding challenge. Recent developments in mass spectrometry (MS)-based techniques, including affinity purification, proximity labeling, cross-linking, and co-fractionation mass spectrometry (MS), have significantly enhanced our abilities to study the interactome. They do so by identifying and quantifying protein interactions yielding profound insights into protein organizations and functions. This review summarizes recent advances in MS-based interactomics, focusing on the development of techniques that capture protein-protein, protein-metabolite, and protein-nucleic acid interactions. Additionally, we discuss how integrated MS-based approaches have been applied to diverse biological samples, focusing on significant discoveries that have leveraged our understanding of cellular functions. Finally, we highlight state-of-the-art bioinformatic approaches for predictions of interactome and complex modeling, as well as strategies for combining experimental interactome data with computation methods, thereby enhancing the ability of MS-based techniques to identify protein interactomes. Indeed, advances in MS technologies and their integrations with computational biology provide new directions and avenues for interactome research, leveraging new insights into mechanisms that govern the molecular architecture of living cells and, thereby, our comprehension of biological processes.
Affiliation(s)
- Shaowen Wu
- State Key Laboratory of Swine and Poultry Breeding Industry, Guangdong Key Laboratory of Crop Germplasm Resources Preservation and Utilization, Agro-biological Gene Research Center, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Sheng Zhang
- Proteomics and Metabolomics Facility, Institute of Biotechnology, Cornell University, Ithaca, New York, USA
- Chun-Ming Liu
- Key Laboratory of Plant Molecular Physiology Institute of Botany, Chinese Academy of Sciences, Beijing, China
- Alisdair R Fernie
- Root Biology and Symbiosis, Max Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany
- Shijuan Yan
- State Key Laboratory of Swine and Poultry Breeding Industry, Guangdong Key Laboratory of Crop Germplasm Resources Preservation and Utilization, Agro-biological Gene Research Center, Guangdong Academy of Agricultural Sciences, Guangzhou, China.
3. Breckels LM, Hutchings C, Ingole KD, Kim S, Lilley KS, Makwana MV, McCaskie KJA, Villanueva E. Advances in spatial proteomics: Mapping proteome architecture from protein complexes to subcellular localizations. Cell Chem Biol 2024; 31:1665-1687. [PMID: 39303701 DOI: 10.1016/j.chembiol.2024.08.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Revised: 08/12/2024] [Accepted: 08/20/2024] [Indexed: 09/22/2024]
Abstract
Proteins are responsible for most intracellular functions, which they perform as part of higher-order molecular complexes, located within defined subcellular niches. Localization is both dynamic and context specific and mislocalization underlies a multitude of diseases. It is thus vital to be able to measure the components of higher-order protein complexes and their subcellular location dynamically in order to fully understand cell biological processes. Here, we review the current range of highly complementary approaches that determine the subcellular organization of the proteome. We discuss the scale and resolution at which these approaches are best employed and the caveats that should be taken into consideration when applying them. We also look to the future and emerging technologies that are paving the way for a more comprehensive understanding of the functional roles of protein isoforms, which is essential for unraveling the complexities of cell biology and the development of disease treatments.
Affiliation(s)
- Lisa M Breckels
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK
- Charlotte Hutchings
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK
- Kishor D Ingole
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK
- Suyeon Kim
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK
- Kathryn S Lilley
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK.
- Mehul V Makwana
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK
- Kieran J A McCaskie
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK
- Eneko Villanueva
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK
4. Wehrhan L, Keller BG. Prebound State Discovered in the Unbinding Pathway of Fluorinated Variants of the Trypsin-BPTI Complex Using Random Acceleration Molecular Dynamics Simulations. J Chem Inf Model 2024; 64:5194-5206. [PMID: 38870039 PMCID: PMC11234359 DOI: 10.1021/acs.jcim.4c00338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2024]
Abstract
The serine protease trypsin forms a tightly bound inhibitor complex with the bovine pancreatic trypsin inhibitor (BPTI). The complex is stabilized by the P1 residue Lys15, which interacts with negatively charged amino acids at the bottom of the S1 pocket. Truncating the P1 residue of wildtype BPTI to α-aminobutyric acid (Abu) leaves a complex with moderate inhibitor strength, which is held in place by additional hydrogen bonds at the protein-protein interface. Fluorination of the Abu residue partially restores the inhibitor strength. The mechanism with which fluorination can restore the inhibitor strength is unknown, and accurate computational investigation requires knowledge of the binding and unbinding pathways. The preferred unbinding pathway is likely to be complex, as encounter states have been described before, and unrestrained umbrella sampling simulations of these complexes suggest additional energetic minima. Here, we use random acceleration molecular dynamics to find a new metastable state in the unbinding pathway of Abu-BPTI variants and wildtype BPTI from trypsin, which we call the prebound state. The prebound state and the fully bound state differ by a substantial shift in the position, a slight shift in the orientation of the BPTI variants, and changes in the interaction pattern. Particularly important is the breaking of three hydrogen bonds around Arg17. Fluorination of the P1 residue lowers the energy barrier of the transition between the fully bound state and prebound state and also lowers the energy minimum of the prebound state. While the effect of fluorination is in general difficult to quantify, here, it is in part caused by favorable stabilization of a hydrogen bond between Gln194 and Cys14. The interaction pattern of the prebound state offers insights into the inhibitory mechanism of BPTI and might add valuable information for the design of serine protease inhibitors.
Affiliation(s)
- Leon Wehrhan
- Department of Biology, Chemistry, and Pharmacy, Freie Universität Berlin, Arnimallee 22, Berlin 14195, Germany
- Bettina G Keller
- Department of Biology, Chemistry, and Pharmacy, Freie Universität Berlin, Arnimallee 22, Berlin 14195, Germany
5. Grassmann G, Miotto M, Desantis F, Di Rienzo L, Tartaglia GG, Pastore A, Ruocco G, Monti M, Milanetti E. Computational Approaches to Predict Protein-Protein Interactions in Crowded Cellular Environments. Chem Rev 2024; 124:3932-3977. [PMID: 38535831 PMCID: PMC11009965 DOI: 10.1021/acs.chemrev.3c00550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 02/20/2024] [Accepted: 02/21/2024] [Indexed: 04/11/2024]
Abstract
Investigating protein-protein interactions is crucial for understanding cellular biological processes because proteins often function within molecular complexes rather than in isolation. While experimental and computational methods have provided valuable insights into these interactions, they often overlook a critical factor: the crowded cellular environment. This environment significantly impacts protein behavior, including structural stability, diffusion, and ultimately the nature of binding. In this review, we discuss theoretical and computational approaches that allow the modeling of biological systems to guide and complement experiments and can thus significantly advance the investigation, and possibly the predictions, of protein-protein interactions in the crowded environment of cell cytoplasm. We explore topics such as statistical mechanics for lattice simulations, hydrodynamic interactions, diffusion processes in high-viscosity environments, and several methods based on molecular dynamics simulations. By synergistically leveraging methods from biophysics and computational biology, we review the state of the art of computational methods to study the impact of molecular crowding on protein-protein interactions and discuss its potential revolutionizing effects on the characterization of the human interactome.
Affiliation(s)
- Greta Grassmann
- Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Sapienza University of Rome, Rome 00185, Italy
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
- Mattia Miotto
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
- Fausta Desantis
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
- The Open University Affiliated Research Centre at Istituto Italiano di Tecnologia, Genoa 16163, Italy
- Lorenzo Di Rienzo
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
- Gian Gaetano Tartaglia
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
- Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Genoa 16163, Italy
- Center for Human Technologies, Genoa 16152, Italy
- Annalisa Pastore
- Experiment Division, European Synchrotron Radiation Facility, Grenoble 38043, France
- Giancarlo Ruocco
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
- Department of Physics, Sapienza University, Rome 00185, Italy
- Michele Monti
- RNA System Biology Lab, Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Genoa 16163, Italy
- Edoardo Milanetti
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Rome 00161, Italy
- Department of Physics, Sapienza University, Rome 00185, Italy
6. Pan Y, Wang Y, Guan J, Zhou S. PCGAN: a generative approach for protein complex identification from protein interaction networks. Bioinformatics 2023; 39:btad473. [PMID: 37531266 PMCID: PMC10457665 DOI: 10.1093/bioinformatics/btad473] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/20/2022] [Revised: 07/23/2023] [Accepted: 08/01/2023] [Indexed: 08/04/2023] Open
Abstract
MOTIVATION: Protein complexes are groups of polypeptide chains linked by non-covalent protein-protein interactions, which play important roles in biological systems and perform numerous functions, including DNA transcription, mRNA translation, and signal transduction. In the past decade, a number of computational methods have been developed to identify protein complexes from protein interaction networks by mining dense subnetworks or subgraphs. RESULTS: In this article, different from the existing works, we propose a novel approach for this task based on generative adversarial networks, which is called PCGAN, meaning identifying Protein Complexes by GAN. With the help of some real complexes as training samples, our method can learn a model to generate new complexes from a protein interaction network. To effectively support model training and testing, we construct two more comprehensive and reliable protein interaction networks and a larger gold standard complex set by merging existing ones of the same organism (including human and yeast). Extensive comparison studies indicate that our method is superior to existing protein complex identification methods in terms of various performance metrics. Furthermore, functional enrichment analysis shows that the identified complexes are of high biological significance, which indicates that these generated protein complexes are very possibly real complexes. AVAILABILITY AND IMPLEMENTATION: https://github.com/yul-pan/PCGAN.
Affiliation(s)
- Yuliang Pan
- Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
- Yang Wang
- Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
- Jihong Guan
- Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
- Shuigeng Zhou
- Shanghai Key Laboratory of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200438, China
7. Dai S, Liu S, Zhou C, Yu F, Zhu G, Zhang W, Deng H, Burlingame A, Yu W, Wang T, Li N. Capturing the hierarchically assorted modules of protein-protein interactions in the organized nucleome. Mol Plant 2023; 16:930-961. [PMID: 36960533 DOI: 10.1016/j.molp.2023.03.013] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/14/2022] [Revised: 02/16/2023] [Accepted: 03/21/2023] [Indexed: 05/04/2023]
Abstract
Nuclear proteins are major constituents and key regulators of nucleome topological organization and manipulators of nuclear events. To decipher the global connectivity of nuclear proteins and the hierarchically organized modules of their interactions, we conducted two rounds of cross-linking mass spectrometry (XL-MS) analysis, one of which followed a quantitative double chemical cross-linking mass spectrometry (in vivo qXL-MS) workflow, and identified 24,140 unique crosslinks in total from the nuclei of soybean seedlings. This in vivo quantitative interactomics enabled the identification of 5340 crosslinks that can be converted into 1297 nuclear protein-protein interactions (PPIs), 1220 (94%) of which were non-confirmative (or novel) nuclear PPIs compared with those in repositories. There were 250 and 26 novel interactors of histones and the nucleolar box C/D small nucleolar ribonucleoprotein complex, respectively. Modulomic analysis of orthologous Arabidopsis PPIs produced 27 and 24 master nuclear PPI modules (NPIMs) that contain the condensate-forming protein(s) and the intrinsically disordered region-containing proteins, respectively. These NPIMs successfully captured previously reported nuclear protein complexes and nuclear bodies in the nucleus. Surprisingly, these NPIMs were hierarchically assorted into four higher-order communities in a nucleomic graph, including genome and nucleolus communities. This combinatorial pipeline of 4C quantitative interactomics and PPI network modularization revealed 17 ethylene-specific module variants that participate in a broad range of nuclear events. The pipeline was able to capture both nuclear protein complexes and nuclear bodies, construct the topological architectures of PPI modules and module variants in the nucleome, and probably map the protein compositions of biomolecular condensates.
Affiliation(s)
- Shuaijian Dai
- Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
- Shichang Liu
- Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong SAR, China
- Chen Zhou
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China
- Fengchao Yu
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China
- Guang Zhu
- Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong SAR, China
- Wenhao Zhang
- Tsinghua-Peking Joint Centre for Life Sciences, Centre for Structural Biology, School of Life Sciences and School of Medicine, Tsinghua University, Beijing 100084, China
- Haiteng Deng
- Tsinghua-Peking Joint Centre for Life Sciences, Centre for Structural Biology, School of Life Sciences and School of Medicine, Tsinghua University, Beijing 100084, China
- Al Burlingame
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, USA
- Weichuan Yu
- The HKUST Shenzhen-Hong Kong Collaborative Innovation Research Institute, Futian, Shenzhen, Guangdong 518057, China; Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China.
- Tingliang Wang
- Tsinghua-Peking Joint Centre for Life Sciences, Centre for Structural Biology, School of Life Sciences and School of Medicine, Tsinghua University, Beijing 100084, China.
- Ning Li
- Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong SAR, China; Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Hong Kong, China; The HKUST Shenzhen-Hong Kong Collaborative Innovation Research Institute, Futian, Shenzhen, Guangdong 518057, China.
8. Li B, Altelaar M, van Breukelen B. Identification of Protein Complexes by Integrating Protein Abundance and Interaction Features Using a Deep Learning Strategy. Int J Mol Sci 2023; 24:ijms24097884. [PMID: 37175590 PMCID: PMC10178578 DOI: 10.3390/ijms24097884] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/23/2023] [Revised: 04/23/2023] [Accepted: 04/24/2023] [Indexed: 05/15/2023] Open
Abstract
Many essential cellular functions are carried out by multi-protein complexes that can be characterized by their protein-protein interactions. The interactions between protein subunits are critically dependent on the strengths of their interactions and their cellular abundances, both of which span orders of magnitude. Despite many efforts devoted to the global discovery of protein complexes by integrating large-scale protein abundance and interaction features, there is still room for improvement. Here, we integrated >7000 quantitative proteomic samples with three published affinity purification/co-fractionation mass spectrometry datasets into a deep learning framework to predict protein-protein interactions (PPIs), followed by the identification of protein complexes using a two-stage clustering strategy. Our deep-learning-technique-based classifier significantly outperformed recently published machine learning prediction models and in the process captured 5010 complexes containing over 9000 unique proteins. The vast majority of proteins in our predicted complexes exhibited low or no tissue specificity, which is an indication that the observed complexes tend to be ubiquitously expressed throughout all cell types and tissues. Interestingly, our combined approach increased the model sensitivity for low abundant proteins, which amongst other things allowed us to detect the interaction of MCM10, which connects to the replicative helicase complex via the MCM6 protein. The integration of protein abundances and their interaction features using a deep learning approach provided a comprehensive map of protein-protein interactions and a unique perspective on possible novel protein complexes.
Affiliation(s)
- Bohui Li
- Biomolecular Mass Spectrometry and Proteomics, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Universiteitsweg 99, 3584 CG Utrecht, The Netherlands
- Maarten Altelaar
- Biomolecular Mass Spectrometry and Proteomics, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Universiteitsweg 99, 3584 CG Utrecht, The Netherlands
- Mass Spectrometry and Proteomics Facility, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
- Bas van Breukelen
- Biomolecular Mass Spectrometry and Proteomics, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Universiteitsweg 99, 3584 CG Utrecht, The Netherlands
9. Zhan Y, Liu J, Wu M, Tan CSH, Li X, Ou-Yang L. A partially shared joint clustering framework for detecting protein complexes from multiple state-specific signed interaction networks. Comput Biol Med 2023; 159:106936. [PMID: 37105110 DOI: 10.1016/j.compbiomed.2023.106936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2022] [Revised: 03/27/2023] [Accepted: 04/13/2023] [Indexed: 04/29/2023]
Abstract
Detecting protein complexes is critical for studying cellular organizations and functions. The accumulation of protein-protein interaction (PPI) data enables the identification of protein complexes computationally. Although a great number of computational methods have been proposed to identify protein complexes from PPI networks, most of them ignore the signs of PPIs that reflect the ways proteins interact (activation or inhibition). As not all PPIs imply co-complex relationships, taking into account the signs of PPIs can benefit the identification of protein complexes. Moreover, PPI networks are not static, but vary with the change of cell states or environments. However, existing methods are primarily designed for single-network clustering, and rarely consider joint clustering of multiple PPI networks. In this study, we propose a novel partially shared signed network clustering (PS-SNC) model for identifying protein complexes from multiple state-specific signed PPI networks jointly. PS-SNC can not only consider the signs of PPIs, but also identify the common and unique protein complexes in different states. Experimental results on synthetic and real datasets show that our PS-SNC model can achieve better performance than other state-of-the-art protein complex detection methods. Extensive analysis on real datasets demonstrate the effectiveness of PS-SNC in revealing novel insights about the underlying patterns of different cell lines.
Affiliation(s)
- Youlin Zhan
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy(SZ), College of Electronics and Information Engineering, Shenzhen University, Shenzhen, 518060, China
- Jiahan Liu
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy(SZ), College of Electronics and Information Engineering, Shenzhen University, Shenzhen, 518060, China
- Min Wu
- Institute for Infocomm Research (I2R), Agency of Science, Technology, and Research (A*STAR), 138632, Singapore
- Chris Soon Heng Tan
- Department of Chemistry, College of Science, Southern University of Science and Technology, Shenzhen, 518055, China
- Xiaoli Li
- Institute for Infocomm Research (I2R), Agency of Science, Technology, and Research (A*STAR), 138632, Singapore
- Le Ou-Yang
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy(SZ), College of Electronics and Information Engineering, Shenzhen University, Shenzhen, 518060, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, 518129, China.
10. Paquet E, Viktor HL, Madi K, Wu J. Deformable Protein Shape Classification Based on Deep Learning, and the Fractional Fokker-Planck and Kähler-Dirac Equations. IEEE Trans Pattern Anal Mach Intell 2023; 45:391-407. [PMID: 35085073 DOI: 10.1109/tpami.2022.3146796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/06/2023]
Abstract
The classification of deformable protein shapes, based solely on their macromolecular surfaces, is a challenging problem in protein-protein interaction prediction and protein design. Shape classification is made difficult by the fact that proteins are dynamic, flexible entities with high geometrical complexity. In this paper, we introduce a novel description for such deformable shapes. This description is based on the bifractional Fokker-Planck and Dirac-Kähler equations. These equations analyse and probe protein shapes in terms of a scalar, vectorial and non-commuting quaternionic field, allowing for a more comprehensive description of the protein shapes. An underlying non-Markovian Lévy random walk establishes geometrical relationships between distant regions while recalling previous analyses. Classification is performed with a multiobjective deep hierarchical pyramidal neural network, thus performing a multilevel analysis of the description. Our approach is applied to the SHREC'19 dataset for deformable protein shapes classification and to the SHREC'16 dataset for deformable partial shapes classification, demonstrating the effectiveness and generality of our approach.
11. Manipur I, Giordano M, Piccirillo M, Parashuraman S, Maddalena L. Community Detection in Protein-Protein Interaction Networks and Applications. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:217-237. [PMID: 34951849 DOI: 10.1109/tcbb.2021.3138142] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 05/04/2023]
Abstract
The ability to identify and characterize not only the protein-protein interactions but also their internal modular organization through network analysis is fundamental for understanding the mechanisms of biological processes at the molecular level. Indeed, the detection of the network communities can enhance our understanding of the molecular basis of disease pathology, and promote drug discovery and disease treatment in personalized medicine. This work gives an overview of recent computational methods for the detection of protein complexes and functional modules in protein-protein interaction networks, also providing a focus on some of its applications. We propose a systematic reformulation of frequently adopted taxonomies for these methods, also proposing new categories to keep up with the most recent research. We review the literature of the last five years (2017-2021) and provide links to existing data and software resources. Finally, we survey recent works exploiting module identification and analysis, in the context of a variety of disease processes for biomarker identification and therapeutic target detection. Our review provides the interested reader with an up-to-date and self-contained view of the existing research, with links to state-of-the-art literature and resources, as well as hints on open issues and future research directions in complex detection and its applications.
12. Lim H, Tsai CJ, Keskin O, Nussinov R, Gursoy A. HMI-PRED 2.0: a biologist-oriented web application for prediction of host-microbe protein-protein interaction by interface mimicry. Bioinformatics 2022; 38:4962-4965. [PMID: 36124958 PMCID: PMC9620825 DOI: 10.1093/bioinformatics/btac633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2022] [Revised: 08/05/2022] [Accepted: 09/15/2022] [Indexed: 11/19/2022] Open
Abstract
SUMMARY: HMI-PRED 2.0 is a publicly available web service for the prediction of host-microbe protein-protein interaction by interface mimicry that is intended to be used without extensive computational experience. A microbial protein structure is screened against a database covering the entire available structural space of complexes of known human proteins. AVAILABILITY AND IMPLEMENTATION: HMI-PRED 2.0 provides user-friendly graphic interfaces for predicting, visualizing and analyzing host-microbe interactions. HMI-PRED 2.0 is available at https://hmipred.org/.
Affiliation(s)
- Hansaim Lim
- Computational Structural Biology Section, Frederick National Laboratory for Cancer Research in the Cancer Innovation Laboratory, NCI-Frederick, Frederick, MD 21702, USA
| | - Chung-Jung Tsai
- Computational Structural Biology Section, Frederick National Laboratory for Cancer Research in the Cancer Innovation Laboratory, NCI-Frederick, Frederick, MD 21702, USA
| | - Ozlem Keskin
- Department of Chemical and Biological Engineering, Koç University, Istanbul 34450, Turkey
| | - Ruth Nussinov
- Computational Structural Biology Section, Frederick National Laboratory for Cancer Research in the Cancer Innovation Laboratory, NCI-Frederick, Frederick, MD 21702, USA
- Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
| | - Attila Gursoy
- Department of Computer Engineering, Koç University, Istanbul 34450, Turkey
| |
|
13
|
Wang R, Wang C, Ma H. Detecting protein complexes with multiple properties by an adaptive harmony search algorithm. BMC Bioinformatics 2022; 23:414. [PMID: 36207692 PMCID: PMC9541083 DOI: 10.1186/s12859-022-04923-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2022] [Accepted: 09/12/2022] [Indexed: 11/27/2022] Open
Abstract
Background: Accurate identification of protein complexes in protein-protein interaction (PPI) networks is crucial for understanding the principles of cellular organization. Most computational methods ignore the fact that proteins in a protein complex have a functional similarity and are co-localized and co-expressed at the same place and time, respectively. Meanwhile, the parameters of the current methods are specified by users, so these methods cannot effectively deal with different input PPI networks. Results: To address these issues, this study proposes a new method called MP-AHSA to detect protein complexes with Multiple Properties (MP), and an Adaptation Harmony Search Algorithm is developed to optimize the parameters of the MP algorithm. First, a weighted PPI network is constructed using functional annotations, and multiple biological properties and the Markov cluster algorithm (MCL) are used to mine protein complex cores. Then, a fitness function is defined, and a protein complex forming strategy is designed to detect attachment proteins and form protein complexes. Next, a protein complex filtering strategy is formulated to filter out the protein complexes. Finally, an adaptation harmony search algorithm is developed to determine the MP algorithm’s parameters automatically. Conclusions: Experimental results show that the proposed MP-AHSA method outperforms 14 state-of-the-art methods for identifying protein complexes. Also, the functional enrichment analyses reveal that the protein complexes identified by the MP-AHSA algorithm have significant biological relevance. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04923-4.
Affiliation(s)
- Rongquan Wang
- School of Computer and Communication Engineering, University of Science and Technology Beijing, No. 30 Xueyuan Road, Haidian District, Beijing, 100083, China
| | - Caixia Wang
- School of International Economics, China Foreign Affairs University, 24 Zhanlanguan Road, Xicheng District, Beijing, 100037, China
| | - Huimin Ma
- School of Computer and Communication Engineering, University of Science and Technology Beijing, No. 30 Xueyuan Road, Haidian District, Beijing, 100083, China.
| |
|
14
|
Hesami M, Alizadeh M, Jones AMP, Torkamaneh D. Machine learning: its challenges and opportunities in plant system biology. Appl Microbiol Biotechnol 2022; 106:3507-3530. [PMID: 35575915 DOI: 10.1007/s00253-022-11963-6] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 03/14/2022] [Revised: 03/14/2022] [Accepted: 05/07/2022] [Indexed: 12/25/2022]
Abstract
Sequencing technologies are evolving at a rapid pace, enabling the generation of massive amounts of data in multiple dimensions (e.g., genomics, epigenomics, transcriptomics, metabolomics, proteomics, and single-cell omics) in plants. To provide comprehensive insights into the complexity of plant biological systems, it is important to integrate different omics datasets. Although recent advances in computational analytical pipelines have enabled efficient and high-quality exploration and exploitation of single omics data, the integration of multidimensional, heterogeneous, and large datasets (i.e., multi-omics) remains a challenge. In this regard, machine learning (ML) offers promising approaches to integrate large datasets and to recognize fine-grained patterns and relationships. Nevertheless, these approaches require rigorous optimization to process multi-omics-derived datasets. In this review, we discuss the main concepts of machine learning as well as the key challenges and solutions related to the big data derived from plant system biology. We also provide in-depth insight into the principles of data integration using ML, as well as challenges and opportunities in different contexts including multi-omics, single-cell omics, protein function, and protein-protein interaction. KEY POINTS: • The key challenges and solutions related to the big data derived from plant system biology have been highlighted. • Different methods of data integration have been discussed. • Challenges and opportunities of the application of machine learning in plant system biology have been highlighted and discussed.
Affiliation(s)
- Mohsen Hesami
- Department of Plant Agriculture, University of Guelph, Guelph, ON, N1G 2W1, Canada
| | - Milad Alizadeh
- Department of Botany, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
| | | | - Davoud Torkamaneh
- Département de Phytologie, Université Laval, Québec City, QC, G1V 0A6, Canada; Institut de Biologie Intégrative Et Des Systèmes (IBIS), Université Laval, Québec City, QC, G1V 0A6, Canada.
| |
|
15
|
Inference of Molecular Regulatory Systems Using Statistical Path-Consistency Algorithm. ENTROPY 2022; 24:e24050693. [PMID: 35626576 PMCID: PMC9142129 DOI: 10.3390/e24050693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Revised: 05/12/2022] [Accepted: 05/12/2022] [Indexed: 11/16/2022]
Abstract
One of the key challenges in systems biology and molecular sciences is how to infer regulatory relationships between genes and proteins using high-throughput omics datasets. Although a wide range of methods have been designed to reverse engineer the regulatory networks, recent studies show that the inferred network may depend on the variable order in the dataset. In this work, we develop a new algorithm, called the statistical path-consistency algorithm (SPCA), to solve the problem of the dependence on variable order. This method generates a number of different variable orders using random samples, and then infers a network by using the path-consistent algorithm based on each variable order. We propose measures to determine the edge weights using the corresponding edge weights in the inferred networks, and choose the edges with the largest weights as the putative regulations between genes or proteins. The developed method is rigorously assessed by the six benchmark networks in DREAM challenges, the mitogen-activated protein (MAP) kinase pathway, and a cancer-specific gene regulatory network. The inferred networks are compared with those obtained by using two up-to-date inference methods. The accuracy of the inferred networks shows that the developed method is effective for discovering molecular regulatory systems.
|
16
|
Omranian S, Nikoloski Z, Grimm DG. Computational identification of protein complexes from network interactions: Present state, challenges, and the way forward. Comput Struct Biotechnol J 2022; 20:2699-2712. [PMID: 35685359 PMCID: PMC9166428 DOI: 10.1016/j.csbj.2022.05.049] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/21/2022] [Revised: 05/25/2022] [Accepted: 05/25/2022] [Indexed: 01/05/2023] Open
|
17
|
Rogawski R, Sharon M. Characterizing Endogenous Protein Complexes with Biological Mass Spectrometry. Chem Rev 2022; 122:7386-7414. [PMID: 34406752 PMCID: PMC9052418 DOI: 10.1021/acs.chemrev.1c00217] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 03/15/2021] [Indexed: 01/11/2023]
Abstract
Biological mass spectrometry (MS) encompasses a range of methods for characterizing proteins and other biomolecules. MS is uniquely powerful for the structural analysis of endogenous protein complexes, which are often heterogeneous, poorly abundant, and refractive to characterization by other methods. Here, we focus on how biological MS can contribute to the study of endogenous protein complexes, which we define as complexes expressed in the physiological host and purified intact, as opposed to reconstituted complexes assembled from heterologously expressed components. Biological MS can yield information on complex stoichiometry, heterogeneity, topology, stability, activity, modes of regulation, and even structural dynamics. We begin with a review of methods for isolating endogenous complexes. We then describe the various biological MS approaches, focusing on the type of information that each method yields. We end with future directions and challenges for these MS-based methods.
Affiliation(s)
- Rivkah Rogawski
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
| | - Michal Sharon
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot 7610001, Israel
| |
|
18
|
Xiang J, Meng X, Zhao Y, Wu FX, Li M. HyMM: hybrid method for disease-gene prediction by integrating multiscale module structure. Brief Bioinform 2022; 23:6547263. [PMID: 35275996 DOI: 10.1093/bib/bbac072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2021] [Revised: 01/18/2022] [Accepted: 02/13/2022] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Identifying disease-related genes is an important issue in computational biology. Module structure widely exists in biomolecule networks, and complex diseases are usually thought to be caused by perturbations of local neighborhoods in the networks, which can provide useful insights for the study of disease-related genes. However, the mining and effective utilization of the module structure is still challenging in tasks such as disease-gene prediction. RESULTS: We propose a hybrid disease-gene prediction method integrating multiscale module structure (HyMM), which can utilize multiscale information from local to global structure to more effectively predict disease-related genes. HyMM extracts module partitions from local to global scales by multiscale modularity optimization with exponential sampling, and estimates the disease relatedness of genes in partitions by the abundance of disease-related genes within modules. Then, a probabilistic model for integration of gene rankings is designed in order to integrate multiple predictions derived from multiscale module partitions and network propagation, and a parameter estimation strategy based on functional information is proposed to further enhance HyMM's predictive power. By a series of experiments, we reveal the importance of module partitions at different scales, and verify the stable and good performance of HyMM compared with eight other state-of-the-art methods and its further performance improvement derived from the parameter estimation. CONCLUSIONS: The results confirm that HyMM is an effective framework for integrating multiscale module structure to enhance the ability to predict disease-related genes, which may provide useful insights for the study of the multiscale module structure and its application in tasks such as disease-gene prediction.
Affiliation(s)
- Ju Xiang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China; Department of Basic Medical Sciences & Academician Workstation, Changsha Medical University, Changsha, Hunan 410219, China
| | - Xiangmao Meng
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Yichao Zhao
- School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
| | - Fang-Xiang Wu
- Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, S7N 5A9, Canada
| | - Min Li
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
| |
|
19
|
Xiang J, Zhang J, Zhao Y, Wu FX, Li M. Biomedical data, computational methods and tools for evaluating disease-disease associations. Brief Bioinform 2022; 23:6522999. [PMID: 35136949 DOI: 10.1093/bib/bbac006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/19/2021] [Revised: 01/04/2022] [Accepted: 01/05/2022] [Indexed: 12/12/2022] Open
Abstract
In recent decades, exploring potential relationships between diseases has been an active research field. With the rapid accumulation of disease-related biomedical data, many computational methods and tools/platforms have been developed to reveal intrinsic relationships between diseases, which can provide useful insights into the study of complex diseases, e.g. understanding molecular mechanisms of diseases and discovering new treatments for diseases. Human complex diseases involve both external phenotypic abnormalities and complex internal molecular mechanisms in organisms. Computational methods with different types of biomedical data from phenotype to genotype can evaluate disease-disease associations at different levels, providing a comprehensive perspective for understanding diseases. In this review, available biomedical data and databases for evaluating disease-disease associations are first summarized. Then, existing computational methods for disease-disease associations are reviewed and classified into five groups in terms of their use of biomedical data, including disease semantic-based, phenotype-based, function-based, representation learning-based and text mining-based methods. Further, we summarize software tools/platforms for computation and analysis of disease-disease associations. Finally, we give a discussion and summary on the research of disease-disease associations. This review provides a systematic overview of current disease association research, which could promote the development and applications of computational methods and tools/platforms for disease-disease associations.
Affiliation(s)
- Ju Xiang
- School of Computer Science and Engineering, Central South University, China
| | - Jiashuai Zhang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
| | - Yichao Zhao
- School of Computer Science and Engineering, Central South University, China
| | - Fang-Xiang Wu
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
| | - Min Li
- Division of Biomedical Engineering and Department of Mechanical Engineering at University of Saskatchewan, Saskatoon, Canada
| |
|
20
|
Redhu N, Thakur Z. Network biology and applications. Bioinformatics 2022. [DOI: 10.1016/b978-0-323-89775-4.00024-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022] Open
|
21
|
Zhou J, Xiong W, Wang Y, Guan J. Protein Function Prediction Based on PPI Networks: Network Reconstruction vs Edge Enrichment. Front Genet 2022; 12:758131. [PMID: 34970299 PMCID: PMC8712557 DOI: 10.3389/fgene.2021.758131] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 08/13/2021] [Accepted: 11/11/2021] [Indexed: 01/21/2023] Open
Abstract
Over the past decades, massive amounts of protein-protein interaction (PPI) data have been accumulated due to the advancement of high-throughput technologies, but data quality issues (noise or incompleteness) in PPI data still affect the accuracy of protein function prediction based on PPI networks. Although two main strategies, network reconstruction and edge enrichment, have been reported in numerous studies to boost prediction performance, comparative studies of the performance differences between network reconstruction and edge enrichment are still lacking. Inspired by this question, this study first uses three protein similarity metrics (local, global and sequence) for network reconstruction and edge enrichment in PPI networks, and then evaluates the performance differences of network reconstruction, edge enrichment and the original networks on two real PPI datasets. The experimental results demonstrate that edge enrichment works better than both network reconstruction and the original networks. Moreover, for the edge enrichment of PPI networks, sequence similarity outperforms both local and global similarity. In summary, our study can help biologists select suitable pre-processing schemes and achieve better protein function prediction for PPI networks.
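As a rough illustration of the edge-enrichment idea described in this abstract, the sketch below adds an edge between two non-adjacent proteins when their neighbourhoods are similar enough. It is a minimal sketch under stated assumptions, not the authors' implementation: the Jaccard index stands in for the "local" similarity metric, and the function names and the `threshold` parameter are hypothetical.

```python
from itertools import combinations


def jaccard(neigh, u, v):
    """Jaccard similarity of the neighbour sets of u and v (assumed local metric)."""
    union = neigh[u] | neigh[v]
    return len(neigh[u] & neigh[v]) / len(union) if union else 0.0


def enrich_edges(edges, threshold=0.5):
    """Return the original edges plus similarity-supported edges.

    Edge enrichment is read here as keeping every original edge and adding an
    edge for each non-adjacent pair whose similarity reaches `threshold`
    (a hypothetical cut-off chosen for illustration).
    """
    # Build neighbour sets from the original, undirected edge list.
    neigh = {}
    for u, v in edges:
        neigh.setdefault(u, set()).add(v)
        neigh.setdefault(v, set()).add(u)

    enriched = {frozenset(e) for e in edges}
    for u, v in combinations(sorted(neigh), 2):
        pair = frozenset((u, v))
        # Similarity is always computed on the original neighbourhoods.
        if pair not in enriched and jaccard(neigh, u, v) >= threshold:
            enriched.add(pair)
    return enriched


if __name__ == "__main__":
    toy_edges = [("A", "C"), ("A", "D"), ("B", "C"), ("B", "D")]
    for edge in sorted(tuple(sorted(e)) for e in enrich_edges(toy_edges)):
        print(edge)
```

Network reconstruction, by contrast, would discard the original edges and keep only the similarity-derived ones; that variant is not shown here.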
Affiliation(s)
- Jiaogen Zhou
- Jiangsu Provincial Engineering Research Center for Intelligent Monitoring and Ecological Management of Pond and Reservoir Water Environment, Huaiyin Normal University, Huian, China
| | - Wei Xiong
- Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai, China
| | - Yang Wang
- Department of Computer Science and Technology, Tongji University, Shanghai, China
| | - Jihong Guan
- Department of Computer Science and Technology, Tongji University, Shanghai, China
| |
|
22
|
Sensitivity of family GH11 Bacillus amyloliquefaciens xylanase A (BaxA) and the T33I mutant to Oryza sativa xylanase inhibitor protein (OsXIP): An experimental and computational study. Enzyme Microb Technol 2022; 156:109998. [DOI: 10.1016/j.enzmictec.2022.109998] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/27/2021] [Revised: 01/17/2022] [Accepted: 01/27/2022] [Indexed: 11/22/2022]
|
23
|
Omranian S, Angeleska A, Nikoloski Z. Efficient and accurate identification of protein complexes from protein-protein interaction networks based on the clustering coefficient. Comput Struct Biotechnol J 2021; 19:5255-5263. [PMID: 34630943 PMCID: PMC8479235 DOI: 10.1016/j.csbj.2021.09.014] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/14/2021] [Revised: 09/13/2021] [Accepted: 09/13/2021] [Indexed: 12/23/2022] Open
Abstract
Provided a family of efficient network algorithms for protein complex identification. The parameter-free family outperforms existing approaches on different networks. It exactly recovered ~35% of protein complexes in a pan-plant PPI network. We examined the effect of network perturbations on predicted protein complexes.
Identification of protein complexes from protein-protein interaction (PPI) networks is a key problem in PPI mining, solved by parameter-dependent approaches that suffer from small recall rates. Here we introduce GCC-v, a family of efficient, parameter-free algorithms to accurately predict protein complexes using the (weighted) clustering coefficient of proteins in PPI networks. Through comparative analyses with gold standards and PPI networks from Escherichia coli, Saccharomyces cerevisiae, and Homo sapiens, we demonstrate that GCC-v outperforms twelve state-of-the-art approaches for identification of protein complexes with respect to twelve performance measures in at least 85.71% of scenarios. We also show that GCC-v results in the exact recovery of ∼35% of protein complexes in a pan-plant PPI network and discover 144 new protein complexes in Arabidopsis thaliana, with high support from GO semantic similarity. Our results indicate that findings from GCC-v are robust to network perturbations, which has direct implications to assess the impact of the PPI network quality on the predicted protein complexes.
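The quantity this abstract builds on, the clustering coefficient of a protein, can be computed in a few lines. The sketch below shows only the standard unweighted local clustering coefficient on a toy network; it is not the GCC-v algorithm, and the function name and toy data are assumptions made for illustration.

```python
def clustering_coefficient(adj, node):
    """Local clustering coefficient of `node` in an undirected graph.

    `adj` maps each node to the set of its neighbours. The coefficient is the
    fraction of neighbour pairs that are themselves connected.
    """
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pairs to close
    # Count each connected neighbour pair once (u < v).
    links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
    return 2.0 * links / (k * (k - 1))


if __name__ == "__main__":
    # Toy network: triangle A-B-C with a pendant protein D attached to A.
    toy = {
        "A": {"B", "C", "D"},
        "B": {"A", "C"},
        "C": {"A", "B"},
        "D": {"A"},
    }
    for protein in sorted(toy):
        print(protein, round(clustering_coefficient(toy, protein), 3))
```

A protein embedded in a tightly connected neighbourhood (B or C above) scores close to 1, which is the kind of local density signal a clustering-coefficient-based method can exploit.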
Affiliation(s)
- Sara Omranian
- Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, 14476 Potsdam, Germany; Systems Biology and Mathematical Modeling, Max Planck Institute of Molecular Plant Physiology, 14476 Potsdam, Germany
| | | | - Zoran Nikoloski
- Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, 14476 Potsdam, Germany; Systems Biology and Mathematical Modeling, Max Planck Institute of Molecular Plant Physiology, 14476 Potsdam, Germany
| |
|
24
|
Omranian S, Angeleska A, Nikoloski Z. PC2P: Parameter-free network-based prediction of protein complexes. Bioinformatics 2021; 37:73-81. [PMID: 33416831 PMCID: PMC8034538 DOI: 10.1093/bioinformatics/btaa1089] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/06/2020] [Revised: 12/17/2020] [Accepted: 12/30/2020] [Indexed: 11/12/2022] Open
Abstract
Motivation: Prediction of protein complexes from protein–protein interaction (PPI) networks is an important problem in systems biology, as they control different cellular functions. The existing solutions employ algorithms for network community detection that identify dense subgraphs in PPI networks. However, gold standards in yeast and human indicate that protein complexes can also induce sparse subgraphs, introducing further challenges in protein complex prediction. Results: To address this issue, we formalize protein complexes as biclique spanned subgraphs, which include both sparse and dense subgraphs. We then cast the problem of protein complex prediction as a network partitioning into biclique spanned subgraphs with removal of minimum number of edges, called coherent partition. Since finding a coherent partition is a computationally intractable problem, we devise a parameter-free greedy approximation algorithm, termed Protein Complexes from Coherent Partition (PC2P), based on key properties of biclique spanned subgraphs. Through comparison with nine contenders, we demonstrate that PC2P: (i) successfully identifies modular structure in networks, as a prerequisite for protein complex prediction, (ii) outperforms the existing solutions with respect to a composite score of five performance measures on 75% and 100% of the analyzed PPI networks and gold standards in yeast and human, respectively, and (iii,iv) does not compromise GO semantic similarity and enrichment score of the predicted protein complexes. Therefore, our study demonstrates that clustering of networks in terms of biclique spanned subgraphs is a promising framework for detection of complexes in PPI networks. Availability and implementation: https://github.com/SaraOmranian/PC2P. Supplementary information: Supplementary data are available at Bioinformatics online.
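To make the notion of a biclique spanned subgraph concrete, the sketch below tests whether a small graph contains a spanning complete bipartite subgraph. It assumes the reading that a graph is biclique spanned when its vertices can be split into two non-empty sets with every cross pair connected, and it relies on the equivalent condition that the complement graph is disconnected. This is only a membership check on a toy graph, not the PC2P partitioning algorithm; the function name is hypothetical.

```python
def is_biclique_spanned(nodes, edges):
    """Check whether the graph has a spanning biclique.

    Assumed definition: the node set splits into two non-empty parts A and B
    such that every pair (a, b) with a in A and b in B is an edge. This holds
    exactly when the complement graph is disconnected, which is what is tested.
    """
    nodes = list(nodes)
    if len(nodes) < 2:
        return False
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Depth-first search over the complement graph (non-edges) from one node.
    start = nodes[0]
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in nodes:
            if w != u and w not in seen and w not in adj[u]:
                seen.add(w)
                stack.append(w)
    # Unreached nodes mean the complement is disconnected.
    return len(seen) < len(nodes)


if __name__ == "__main__":
    star = [("c", "a"), ("c", "b"), ("c", "d")]      # sparse
    clique = [("a", "b"), ("a", "c"), ("b", "c")]    # dense
    path = [("a", "b"), ("b", "c"), ("c", "d")]
    print(is_biclique_spanned("abcd", star))
    print(is_biclique_spanned("abc", clique))
    print(is_biclique_spanned("abcd", path))
```

The star and the clique both pass the check while the four-node path does not, which matches the abstract's point that this class covers sparse as well as dense subgraphs.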
Affiliation(s)
- Sara Omranian
- Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, 14476, Potsdam, Germany; Systems Biology and Mathematical Modeling, Max Planck Institute of Molecular Plant Physiology, 14476, Potsdam, Germany
| | | | - Zoran Nikoloski
- Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, 14476, Potsdam, Germany; Systems Biology and Mathematical Modeling, Max Planck Institute of Molecular Plant Physiology, 14476, Potsdam, Germany; Centre of Plant Systems Biology and Biotechnology (CPSBB), Plovdiv, Bulgaria
| |
|
25
|
Li Y, Yao W, Lin J, Gao G, Huang C, Wu Y. Design, synthesis, and biological evaluation of phenyloxadiazole derivatives as potential antifungal agents against phytopathogenic fungi. MONATSHEFTE FUR CHEMIE 2021. [DOI: 10.1007/s00706-020-02717-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/24/2022]
|
26
|
Massoud TF, Paulmurugan R. Molecular Imaging of Protein–Protein Interactions and Protein Folding. Mol Imaging 2021. [DOI: 10.1016/b978-0-12-816386-3.00071-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022] Open
|
27
|
Wu Z, Liao Q, Fan S, Liu B. idenPC-CAP: Identify protein complexes from weighted RNA-protein heterogeneous interaction networks using co-assemble partner relation. Brief Bioinform 2020; 22:6041167. [PMID: 33333549 DOI: 10.1093/bib/bbaa372] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/30/2020] [Revised: 11/07/2020] [Accepted: 11/20/2020] [Indexed: 12/18/2022] Open
Abstract
Protein complexes play important roles in most cellular processes. The available genome-wide protein-protein interaction (PPI) data make it possible for computational methods to identify protein complexes from PPI networks. However, PPI datasets usually contain a large ratio of false positive noise. Moreover, different types of biomolecules in a living cell cooperate to form a union interaction network. Because previous computational methods focus only on PPIs and ignore other types of biomolecule interactions, their predicted protein complexes often contain many false positive proteins. In this study, we develop a novel computational method, idenPC-CAP, to identify protein complexes from the RNA-protein heterogeneous interaction network consisting of RNA-RNA interactions, RNA-protein interactions and PPIs. By considering interactions among proteins and RNAs, the new method reduces the ratio of false positive proteins in predicted protein complexes. The experimental results demonstrate that idenPC-CAP outperforms the other state-of-the-art methods in this field.
Affiliation(s)
- Zhourun Wu
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
| | - Qing Liao
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
| | - Shixi Fan
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
| | - Bin Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China
| |
|
28
|
Mazandu GK, Hooper C, Opap K, Makinde F, Nembaware V, Thomford NE, Chimusa ER, Wonkam A, Mulder NJ. IHP-PING-generating integrated human protein-protein interaction networks on-the-fly. Brief Bioinform 2020; 22:5943797. [PMID: 33129201 DOI: 10.1093/bib/bbaa277] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/22/2020] [Revised: 09/12/2020] [Accepted: 09/21/2020] [Indexed: 01/04/2023] Open
Abstract
Advances in high-throughput sequencing technologies have resulted in an exponential growth of publicly accessible biological datasets. In the 'big data' driven 'post-genomic' context, much work is being done to explore human protein-protein interactions (PPIs) for a systems level based analysis to uncover useful signals and gain more insights to advance current knowledge and answer specific biological and health questions. These PPIs are experimentally or computationally predicted and stored in different online databases, and some of these PPI resources are updated regularly. As with many biological datasets, such regular updates continuously render older PPI datasets potentially outdated. Moreover, while many of these interactions are shared between these online resources, each resource includes its own identified PPIs and none of these databases exhaustively contains all existing human PPI maps. In this context, it is essential to enable the integration of interaction datasets from different resources, to generate a PPI map with increased coverage and confidence. To allow researchers to produce an integrated human PPI dataset in real-time, we introduce the integrated human protein-protein interaction network generator (IHP-PING) tool. IHP-PING is a flexible python package which generates a human PPI network from freely available online resources. This tool extracts and integrates heterogeneous PPI datasets to generate a unified PPI network, which is stored locally for further applications.
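The integration step described in this abstract can be pictured as merging undirected interaction lists from several resources while recording which resources support each pair. The sketch below is a generic illustration under that assumption; it does not use or reproduce the IHP-PING package API, and the function name, source labels, and protein identifiers are hypothetical.

```python
def integrate_ppi(sources):
    """Merge PPI lists from several resources into one undirected network.

    `sources` maps a resource name to an iterable of (protein_a, protein_b)
    pairs. The result maps each normalised pair to the set of resources that
    report it, so the set size can serve as a naive support count.
    """
    merged = {}
    for name, pairs in sources.items():
        for a, b in pairs:
            if a == b:
                continue  # self-interactions are dropped here (an assumption)
            key = tuple(sorted((a, b)))  # (A, B) and (B, A) are the same edge
            merged.setdefault(key, set()).add(name)
    return merged


if __name__ == "__main__":
    toy_sources = {
        "db1": [("P1", "P2"), ("P2", "P3")],
        "db2": [("P2", "P1"), ("P3", "P4")],
    }
    for pair, support in sorted(integrate_ppi(toy_sources).items()):
        print(pair, sorted(support))
```

In practice the resources would also need identifier mapping to a common namespace before merging; that step is outside this sketch.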
Affiliation(s)
- Gaston K Mazandu
- Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, CIDRI-Africa WT Centre, University of Cape Town, Health Sciences Campus. Anzio Rd, Observatory, 7925, South Africa; African Institute for Mathematical Sciences, 5-7 Melrose Road, Muizenberg, 7945, Cape Town, South Africa; Division of Human Genetics, Department of Pathology, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory, 7925, South Africa
| | - Christopher Hooper
- Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, CIDRI-Africa WT Centre, University of Cape Town, Health Sciences Campus. Anzio Rd, Observatory, 7925, South Africa
| | - Kenneth Opap
- Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, CIDRI-Africa WT Centre, University of Cape Town, Health Sciences Campus. Anzio Rd, Observatory, 7925, South Africa
| | - Funmilayo Makinde
- Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, CIDRI-Africa WT Centre, University of Cape Town, Health Sciences Campus. Anzio Rd, Observatory, 7925, South Africa; African Institute for Mathematical Sciences, 5-7 Melrose Road, Muizenberg, 7945, Cape Town, South Africa
| | - Victoria Nembaware
- Division of Human Genetics, Department of Pathology, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory, 7925, South Africa
| | - Nicholas E Thomford
- Division of Human Genetics, Department of Pathology, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory, 7925, South Africa; School of Medical Sciences, University of Cape Coast, PMB, Cape Coast, Ghana
| | - Emile R Chimusa
- Division of Human Genetics, Department of Pathology, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory, 7925, South Africa
| | - Ambroise Wonkam
- Division of Human Genetics, Department of Pathology, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory, 7925, South Africa
| | - Nicola J Mulder
- Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, CIDRI-Africa WT Centre, University of Cape Town, Health Sciences Campus. Anzio Rd, Observatory, 7925, South Africa
| |
|
29
|
Ivanov S, Lagunin A, Filimonov D, Tarasova O. Network-Based Analysis of OMICs Data to Understand the HIV-Host Interaction. Front Microbiol 2020; 11:1314. [PMID: 32625189 PMCID: PMC7311653 DOI: 10.3389/fmicb.2020.01314] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 04/09/2020] [Accepted: 05/25/2020] [Indexed: 12/22/2022]
Abstract
The interaction of human immunodeficiency virus with human cells is responsible for all stages of the viral life cycle, from the infection of CD4+ cells to reverse transcription, integration, and the assembly of new viral particles. To date, a large amount of OMICs data as well as information from functional genomics screenings regarding the HIV–host interaction has been accumulated in the literature and in public databases. We processed databases containing HIV–host interactions and found 2910 HIV-1-human protein-protein interactions, mostly related to viral group M subtype B, 137 interactions between human and HIV-1 coding and non-coding RNAs, essential for viral lifecycle and cell defense mechanisms, 232 transcriptomics, 27 proteomics, and 34 epigenomics HIV-related experiments. Numerous studies regarding network-based analysis of corresponding OMICs data have been published in recent years. We overview various types of molecular networks, which can be created using OMICs data, including HIV–human protein–protein interaction networks, co-expression networks, gene regulatory and signaling networks, and approaches for the analysis of their topology and dynamics. The network-based analysis can be used to determine the critical pathways and key proteins involved in the HIV life cycle, cellular and immune responses to infection, viral escape from host defense mechanisms, and mechanisms mediating different susceptibility of humans to infection. The proteins and pathways identified in these studies represent a basis for developing new anti-HIV therapeutic strategies such as new drugs preventing infection of CD4+ cells and viral replication, effective vaccines, “shock and kill” and “block and lock” approaches to cure latent infection.
Affiliation(s)
- Sergey Ivanov
- Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russia; Department of Bioinformatics, Pirogov Russian National Research Medical University, Moscow, Russia
| | - Alexey Lagunin
- Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russia; Department of Bioinformatics, Pirogov Russian National Research Medical University, Moscow, Russia
| | - Dmitry Filimonov
- Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russia
| | - Olga Tarasova
- Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russia
| |
|
30
|
Liu B, Gao X, Zhang H. BioSeq-Analysis2.0: an updated platform for analyzing DNA, RNA and protein sequences at sequence level and residue level based on machine learning approaches. Nucleic Acids Res 2020; 47:e127. [PMID: 31504851 PMCID: PMC6847461 DOI: 10.1093/nar/gkz740] [Citation(s) in RCA: 269] [Impact Index Per Article: 53.8] [Received: 06/25/2019] [Revised: 08/07/2019] [Accepted: 08/17/2019] [Indexed: 12/14/2022]
Abstract
As the first web server to analyze various biological sequences at sequence level based on machine learning approaches, BioSeq-Analysis has assisted the development of many powerful predictors in the field of computational biology. However, BioSeq-Analysis can only be applied to sequence-level analysis tasks, which prevents its application to residue-level analysis tasks, and an intelligent tool that can automatically generate various predictors for biological sequence analysis at both residue level and sequence level is highly desired. In this regard, we decided to publish an important updated server, BioSeq-Analysis2.0 (http://bliulab.net/BioSeq-Analysis2.0/), covering a total of 26 features at the residue level and 90 features at the sequence level. Users only need to upload the benchmark dataset, and BioSeq-Analysis2.0 can generate the predictors for both residue-level and sequence-level analysis tasks. Furthermore, the corresponding stand-alone tool is also provided and can be downloaded from http://bliulab.net/BioSeq-Analysis2.0/download/. To the best of our knowledge, BioSeq-Analysis2.0 is the first tool for generating predictors for biological sequence analysis tasks at residue level. Specifically, the experimental results indicated that the predictors developed by BioSeq-Analysis2.0 can achieve comparable or even better performance than the existing state-of-the-art predictors.
Affiliation(s)
- Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China; Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, China
| | - Xin Gao
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
| | - Hanyu Zhang
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
| |
|
31
|
Wu Z, Liao Q, Liu B. idenPC-MIIP: identify protein complexes from weighted PPI networks using mutual important interacting partner relation. Brief Bioinform 2020; 22:1972-1983. [PMID: 32065215 DOI: 10.1093/bib/bbaa016] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 12/16/2019] [Revised: 01/15/2020] [Accepted: 01/27/2020] [Indexed: 12/28/2022]
Abstract
Protein complexes are key units for studying a cell system. During the past decades, genome-scale protein-protein interaction (PPI) data have been determined by high-throughput approaches, which enables the identification of protein complexes from PPI networks. However, the high-throughput approaches often produce a considerable fraction of false positive and false negative samples. In this study, we propose the mutual important interacting partner relation to reflect the co-complex relationship of two proteins based on their interaction neighborhoods. In addition, a new algorithm called idenPC-MIIP is developed to identify protein complexes from weighted PPI networks. The experimental results on two widely used datasets show that idenPC-MIIP outperforms 17 state-of-the-art methods, especially for the identification of small protein complexes with only two or three proteins.
Affiliation(s)
- Zhourun Wu
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
| | - Qing Liao
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
| | - Bin Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China, and School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
| |
|
32
|
Liu B, Zhu Y, Yan K. Fold-LTR-TCP: protein fold recognition based on triadic closure principle. Brief Bioinform 2019; 21:2185-2193. [DOI: 10.1093/bib/bbz139] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Received: 09/03/2019] [Revised: 10/01/2019] [Accepted: 10/09/2019] [Indexed: 11/13/2022]
Abstract
As an important task in protein structure and function studies, protein fold recognition has attracted more and more attention. The existing computational predictors in this field treat this task as a multi-classification problem, ignoring the relationship among proteins in the dataset. However, previous studies showed that this relationship is critical for protein homology analysis. In this study, protein fold recognition is treated as an information retrieval task. A Learning to Rank (LTR) model was employed, in a supervised manner, to rank the template proteins against the query protein and so find the templates in the same fold as the query. The triadic closure principle (TCP) was then applied to the ranking list generated by the LTR to improve its accuracy by considering the relationship among the query protein and the template proteins in the list. Finally, a predictor called Fold-LTR-TCP was proposed. A rigorous test on the LE benchmark dataset showed that the Fold-LTR-TCP predictor achieved an accuracy of 73.2%, outperforming all the other competing methods.
Affiliation(s)
- Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, China
| | - Yulin Zhu
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong 518055, China
| | - Ke Yan
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong 518055, China
| |
|