1. Anyaegbunam UA, Vagiona AC, ten Cate V, Bauer K, Schmidlin T, Distler U, Tenzer S, Araldi E, Bindila L, Wild P, Andrade-Navarro MA. A Map of the Lipid-Metabolite-Protein Network to Aid Multi-Omics Integration. Biomolecules 2025; 15:484. [PMID: 40305217 PMCID: PMC12024871 DOI: 10.3390/biom15040484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2025] [Revised: 03/13/2025] [Accepted: 03/20/2025] [Indexed: 05/02/2025] Open
Abstract
The integration of multi-omics data offers transformative potential for elucidating complex molecular mechanisms underlying biological processes and diseases. In this study, we developed a lipid-metabolite-protein network that combines a protein-protein interaction network and enzymatic and genetic interactions of proteins with metabolites and lipids to provide a unified framework for multi-omics integration. Using hyperbolic embedding, the network visualizes connections across omics layers, accessible through a user-friendly Shiny R (version 1.10.0) software package. This framework ranks molecules across omics layers based on functional proximity, enabling intuitive exploration. Application in a cardiovascular disease (CVD) case study identified lipids and metabolites associated with CVD-related proteins. The analysis confirmed known associations, like cholesterol esters and sphingomyelin, and highlighted potential novel biomarkers, such as 4-imidazoleacetate and indoleacetaldehyde. Furthermore, we used the network to analyze empagliflozin's temporal effects on lipid metabolism. Functional enrichment analysis of proteins associated with lipid signatures revealed dynamic shifts in biological processes, with early effects impacting phospholipid metabolism and long-term effects affecting sphingolipid biosynthesis. Our framework offers a versatile tool for hypothesis generation, functional analysis, and biomarker discovery. By bridging molecular layers, this approach advances our understanding of disease mechanisms and therapeutic effects, with broad applications in computational biology and precision medicine.
Affiliation(s)
- Uchenna Alex Anyaegbunam
  - Computational Biology and Data Mining Group (CBDM), Institute of Organismic and Molecular Evolution (iOME), Johannes Gutenberg University, 55122 Mainz, Germany
- Aimilia-Christina Vagiona
  - Computational Biology and Data Mining Group (CBDM), Institute of Organismic and Molecular Evolution (iOME), Johannes Gutenberg University, 55122 Mainz, Germany
- Vincent ten Cate
  - Preventive Cardiology and Preventive Medicine, Department of Cardiology, University Medical Center, Johannes-Gutenberg University Mainz, Langenbeckstr. 1, 55131 Mainz, Germany
  - Clinical Epidemiology and Systems Medicine, Center for Thrombosis and Hemostasis (CTH), University Medical Center, 55131 Mainz, Germany
  - German Center for Cardiovascular Research (DZHK), Partner Site Rhine Main, University Medical Center, Johannes-Gutenberg University Mainz, 55131 Mainz, Germany
- Katrin Bauer
  - Preventive Cardiology and Preventive Medicine, Department of Cardiology, University Medical Center, Johannes-Gutenberg University Mainz, Langenbeckstr. 1, 55131 Mainz, Germany
  - German Center for Cardiovascular Research (DZHK), Partner Site Rhine Main, University Medical Center, Johannes-Gutenberg University Mainz, 55131 Mainz, Germany
  - Computational Systems Medicine, Center for Thrombosis and Hemostasis (CTH), 55131 Mainz, Germany
- Thierry Schmidlin
  - Institute of Immunology, University Medical Center, Johannes-Gutenberg University Mainz, 55131 Mainz, Germany
  - Research Centre for Immunotherapy (FZI), University Medical Center, Johannes-Gutenberg University Mainz, 55131 Mainz, Germany
- Ute Distler
  - Institute of Immunology, University Medical Center, Johannes-Gutenberg University Mainz, 55131 Mainz, Germany
  - Research Centre for Immunotherapy (FZI), University Medical Center, Johannes-Gutenberg University Mainz, 55131 Mainz, Germany
- Stefan Tenzer
  - Institute of Immunology, University Medical Center, Johannes-Gutenberg University Mainz, 55131 Mainz, Germany
  - Research Centre for Immunotherapy (FZI), University Medical Center, Johannes-Gutenberg University Mainz, 55131 Mainz, Germany
- Elisa Araldi
  - Preventive Cardiology and Preventive Medicine, Department of Cardiology, University Medical Center, Johannes-Gutenberg University Mainz, Langenbeckstr. 1, 55131 Mainz, Germany
  - German Center for Cardiovascular Research (DZHK), Partner Site Rhine Main, University Medical Center, Johannes-Gutenberg University Mainz, 55131 Mainz, Germany
  - Computational Systems Medicine, Center for Thrombosis and Hemostasis (CTH), 55131 Mainz, Germany
  - Systems Medicine Laboratory, Department of Medicine and Surgery, University of Parma, 43121 Parma, Italy
- Laura Bindila
  - Institute of Physiological Chemistry, University Medical Center, 55131 Mainz, Germany
- Philipp Wild
  - Preventive Cardiology and Preventive Medicine, Department of Cardiology, University Medical Center, Johannes-Gutenberg University Mainz, Langenbeckstr. 1, 55131 Mainz, Germany
  - Clinical Epidemiology and Systems Medicine, Center for Thrombosis and Hemostasis (CTH), University Medical Center, 55131 Mainz, Germany
  - German Center for Cardiovascular Research (DZHK), Partner Site Rhine Main, University Medical Center, Johannes-Gutenberg University Mainz, 55131 Mainz, Germany
- Miguel A. Andrade-Navarro
  - Computational Biology and Data Mining Group (CBDM), Institute of Organismic and Molecular Evolution (iOME), Johannes Gutenberg University, 55122 Mainz, Germany

2. Vagiona AC, Notopoulou S, Zdráhal Z, Gonçalves-Kulik M, Petrakis S, Andrade-Navarro MA. Prediction of protein interactions with function in protein (de-)phosphorylation. PLoS One 2025; 20:e0319084. [PMID: 40029919 PMCID: PMC11875375 DOI: 10.1371/journal.pone.0319084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2024] [Accepted: 01/28/2025] [Indexed: 03/06/2025] Open
Abstract
Protein-protein interactions (PPIs) form a complex network called the "interactome" that regulates many functions in the cell. In recent years, there has been an increasing accumulation of evidence supporting the existence of a hyperbolic geometry underlying the network representation of complex systems such as the interactome. In particular, it has been shown that the embedding of the human Protein-Interaction Network (hPIN) in hyperbolic space (H2) captures biologically relevant information. Here we explore whether this mapping contains information that would allow us to predict the function of PPIs, more specifically interactions related to post-translational modification (PTM). We used a random forest algorithm to predict PTM-related directed PPIs, specifically protein phosphorylation and dephosphorylation, based on hyperbolic properties and centrality measures of the hPIN mapped in H2. To evaluate the efficacy of our algorithm, we predicted PTM-related PPIs of ataxin-1, the protein responsible for Spinocerebellar Ataxia type 1 (SCA1). Proteomics analysis in a cellular model revealed that several of the predicted PTM-PPIs were indeed dysregulated in a SCA1-related disease network. A compact cluster composed of ataxin-1, its dysregulated PTM-PPIs and their common upstream regulators may represent critical interactions for disease pathology. Thus, our algorithm may infer phosphorylation activity on proteins through directed PPIs.
Affiliation(s)
- Aimilia-Christina Vagiona
  - Faculty of Biology, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University, Biozentrum I, Mainz, Germany
- Sofia Notopoulou
  - Institute of Applied Biosciences/Centre for Research and Technology Hellas, Thessaloniki, Greece
- Zbyněk Zdráhal
  - Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Mariane Gonçalves-Kulik
  - Faculty of Biology, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University, Biozentrum I, Mainz, Germany
- Spyros Petrakis
  - Institute of Applied Biosciences/Centre for Research and Technology Hellas, Thessaloniki, Greece
- Miguel A. Andrade-Navarro
  - Faculty of Biology, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University, Biozentrum I, Mainz, Germany

3. Sulyok B, Palla G. Greedy routing optimisation in hyperbolic networks. Sci Rep 2023; 13:23026. [PMID: 38155205 PMCID: PMC10754836 DOI: 10.1038/s41598-023-50244-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Accepted: 12/17/2023] [Indexed: 12/30/2023] Open
Abstract
Finding the optimal embedding of networks into low-dimensional hyperbolic spaces is a challenge that has received considerable interest in recent years, with several different approaches proposed in the literature. In general, these methods take advantage of the exponentially growing volume of the hyperbolic space as a function of the radius from the origin, allowing a (roughly) uniform spatial distribution of the nodes even for scale-free small-world networks, where the connection probability between pairs decays with hyperbolic distance. One of the motivations behind hyperbolic embedding is that optimal placement of the nodes in a hyperbolic space is widely thought to enable efficient navigation on top of the network. Accordingly, one measure that can be used to quantify the quality of different embeddings is the fraction of successful greedy paths following a simple navigation protocol based on the hyperbolic coordinates. In the present work, we develop an optimisation scheme for this score in the native disk representation of the hyperbolic space. This optimisation algorithm can either be used as an embedding method on its own, or be applied to improve this score for embeddings obtained from other methods. According to our tests on synthetic and real networks, the proposed optimisation can considerably enhance the success rate of greedy paths in several cases, improving the given embedding from the point of view of navigability.
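The greedy routing score mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' optimisation scheme: it only evaluates the score for a given embedding, assuming polar coordinates (r, theta) in the native disk with curvature -1 and the common convention that a greedy path fails as soon as it would revisit a node. The function names, the adjacency-dictionary input, and the all-pairs evaluation are illustrative assumptions.

```python
import math


def hyperbolic_distance(p, q):
    """Distance between two points (r, theta) in the native disk, curvature -1.

    Uses the hyperbolic law of cosines; the angular difference is folded
    into [0, pi] first.
    """
    r1, t1 = p
    r2, t2 = q
    dtheta = math.pi - abs(math.pi - abs(t1 - t2) % (2 * math.pi))
    if dtheta == 0.0:
        return abs(r1 - r2)  # points on the same ray
    arg = (math.cosh(r1) * math.cosh(r2)
           - math.sinh(r1) * math.sinh(r2) * math.cos(dtheta))
    return math.acosh(max(arg, 1.0))  # clamp guards against rounding below 1


def greedy_route(adj, coords, source, target):
    """Return True if greedy forwarding from source reaches target.

    adj maps each node to a list of neighbours; coords maps each node to
    (r, theta). Both are hypothetical input formats chosen for this sketch.
    """
    current = source
    visited = {source}
    while current != target:
        # Forward to the neighbour that is hyperbolically closest to the target.
        nxt = min(adj[current],
                  key=lambda v: hyperbolic_distance(coords[v], coords[target]),
                  default=None)
        if nxt is None or nxt in visited:
            return False  # dead end or loop: the greedy path fails
        visited.add(nxt)
        current = nxt
    return True


def greedy_success_rate(adj, coords):
    """Fraction of ordered node pairs connected by a successful greedy path."""
    nodes = list(adj)
    pairs = [(s, t) for s in nodes for t in nodes if s != t]
    successes = sum(greedy_route(adj, coords, s, t) for s, t in pairs)
    return successes / len(pairs)
```

Evaluating all ordered pairs is quadratic in the number of nodes, so for large networks a random sample of source-target pairs would be the more practical choice.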
Affiliation(s)
- Bendegúz Sulyok
  - Department of Biological Physics, Eötvös Loránd University, Pázmány P. stny. 1/A, 1117, Budapest, Hungary
- Gergely Palla
  - Department of Biological Physics, Eötvös Loránd University, Pázmány P. stny. 1/A, 1117, Budapest, Hungary
  - Data-Driven Health Division of National Laboratory for Health Security, Health Services Management Training Centre, Semmelweis University, Kútvölgyi út 2, 1125, Budapest, Hungary

4. Gkekas I, Vagiona AC, Pechlivanis N, Kastrinaki G, Pliatsika K, Iben S, Xanthopoulos K, Psomopoulos FE, Andrade-Navarro MA, Petrakis S. Intranuclear inclusions of polyQ-expanded ATXN1 sequester RNA molecules. Front Mol Neurosci 2023; 16:1280546. [PMID: 38125008 PMCID: PMC10730666 DOI: 10.3389/fnmol.2023.1280546] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/22/2023] [Accepted: 11/16/2023] [Indexed: 12/23/2023] Open
Abstract
Spinocerebellar ataxia type 1 (SCA1) is an autosomal dominant neurodegenerative disease caused by a trinucleotide (CAG) repeat expansion in the ATXN1 gene. It is characterized by the presence of polyglutamine (polyQ) intranuclear inclusion bodies (IIBs) within affected neurons. In order to investigate the impact of polyQ IIBs in SCA1 pathogenesis, we generated a novel protein aggregation model by inducible overexpression of the mutant ATXN1(Q82) isoform in human neuroblastoma SH-SY5Y cells. Moreover, we developed a simple and reproducible protocol for the efficient isolation of insoluble IIBs. Biophysical characterization showed that polyQ IIBs are enriched in RNA molecules which were further identified by next-generation sequencing. Finally, a protein interaction network analysis indicated that sequestration of essential RNA transcripts within ATXN1(Q82) IIBs may affect the ribosome resulting in error-prone protein synthesis and global proteome instability. These findings provide novel insights into the molecular pathogenesis of SCA1, highlighting the role of polyQ IIBs and their impact on critical cellular processes.
Affiliation(s)
- Ioannis Gkekas
  - Centre for Research and Technology Hellas, Institute of Applied Biosciences, Thessaloniki, Greece
  - Laboratory of Pharmacology, School of Pharmacy, Faculty of Health Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Nikolaos Pechlivanis
  - Centre for Research and Technology Hellas, Institute of Applied Biosciences, Thessaloniki, Greece
- Georgia Kastrinaki
  - Aerosol and Particle Technology Laboratory, Centre for Research and Technology Hellas, Chemical Process and Energy Resources Institute, Thessaloniki, Greece
- Katerina Pliatsika
  - Centre for Research and Technology Hellas, Institute of Applied Biosciences, Thessaloniki, Greece
  - Laboratory of Pharmacology, School of Pharmacy, Faculty of Health Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Sebastian Iben
  - Department of Dermatology and Allergic Diseases, University of Ulm, Ulm, Germany
- Konstantinos Xanthopoulos
  - Centre for Research and Technology Hellas, Institute of Applied Biosciences, Thessaloniki, Greece
  - Laboratory of Pharmacology, School of Pharmacy, Faculty of Health Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Fotis E. Psomopoulos
  - Centre for Research and Technology Hellas, Institute of Applied Biosciences, Thessaloniki, Greece
- Spyros Petrakis
  - Centre for Research and Technology Hellas, Institute of Applied Biosciences, Thessaloniki, Greece

5. Jankowski R, Allard A, Boguñá M, Serrano MÁ. The D-Mercator method for the multidimensional hyperbolic embedding of real networks. Nat Commun 2023; 14:7585. [PMID: 37990019 PMCID: PMC10663512 DOI: 10.1038/s41467-023-43337-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/11/2023] [Accepted: 11/07/2023] [Indexed: 11/23/2023] Open
Abstract
One of the pillars of the geometric approach to networks has been the development of model-based mapping tools that embed real networks in their latent geometry. In particular, the tool Mercator embeds networks into the hyperbolic plane. However, some real networks are better described by the multidimensional formulation of the underlying geometric model. Here, we introduce D-Mercator, a model-based embedding method that produces multidimensional maps of real networks into the (D + 1)-hyperbolic space, where the similarity subspace is represented as a D-sphere. We used D-Mercator to produce multidimensional hyperbolic maps of real networks and estimated their intrinsic dimensionality in terms of navigability and community structure. Multidimensional representations of real networks are instrumental in the identification of factors that determine connectivity and in elucidating fundamental issues that hinge on dimensionality, such as the presence of universality in critical behavior.
Affiliation(s)
- Robert Jankowski
  - Departament de Física de la Matèria Condensada, Universitat de Barcelona, Martí i Franquès 1, 08028, Barcelona, Spain
  - Universitat de Barcelona Institute of Complex Systems (UBICS), Universitat de Barcelona, Barcelona, Spain
- Antoine Allard
  - Département de Physique, de Génie Physique et d'optique, Université Laval, Québec, Québec, G1V 0A6, Canada
  - Centre Interdisciplinaire en Modélisation Mathématique, Université Laval, Québec, Québec, G1V 0A6, Canada
- Marián Boguñá
  - Departament de Física de la Matèria Condensada, Universitat de Barcelona, Martí i Franquès 1, 08028, Barcelona, Spain
  - Universitat de Barcelona Institute of Complex Systems (UBICS), Universitat de Barcelona, Barcelona, Spain
- M Ángeles Serrano
  - Departament de Física de la Matèria Condensada, Universitat de Barcelona, Martí i Franquès 1, 08028, Barcelona, Spain
  - Universitat de Barcelona Institute of Complex Systems (UBICS), Universitat de Barcelona, Barcelona, Spain
  - ICREA, Pg. Lluís Companys 23, E-08010, Barcelona, Spain

6. Zahra NUA, Vagiona AC, Uddin R, Andrade-Navarro MA. Selection of Multi-Drug Targets against Drug-Resistant Mycobacterium tuberculosis XDR1219 Using the Hyperbolic Mapping of the Protein Interaction Network. Int J Mol Sci 2023; 24:14050. [PMID: 37762354 PMCID: PMC10530867 DOI: 10.3390/ijms241814050] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/19/2023] [Revised: 09/06/2023] [Accepted: 09/12/2023] [Indexed: 09/29/2023] Open
Abstract
Tuberculosis remains the leading cause of death from a single pathogen, and antimicrobial resistance (AMR) makes it increasingly difficult to deal with this disease. We present the hyperbolic embedding of the Mycobacterium tuberculosis protein interaction network (mtbPIN) of a resistant strain (MTB XDR1219) to determine the biological relevance of its latent geometry. In this hypermap, proteins with similar interacting partners occupy close positions. An analysis of the hypermap of available drug targets (DTs) and their direct and intermediate interactors was used to identify potentially useful drug combinations and drug targets. We identify rpsA and rpsL as close DTs targeted by different drugs (pyrazinamide and aminoglycosides, respectively) and propose that the combination of these drugs could have a synergistic effect. We also used the hypermap to explain the effects of drugs that affect multiple DTs and thereby force the bacteria to deal with multiple stresses, such as ethambutol, which affects the synthesis of both arabinogalactan and lipoarabinomannan. Our strategy uncovers novel potential DTs, such as dprE1 and dnaK proteins, which interact with two close DT pairs: arabinosyltransferases (embC and embB), Ser/Thr protein kinase (pknB) and RNA polymerase (rpoB), respectively. Our approach provides mechanistic explanations for existing drugs and suggests new DTs. This strategy can also be applied to the study of other resistant strains.
Affiliation(s)
- Noor ul Ain Zahra
  - Lab 103 PCMD ext., Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi 75270, Pakistan
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Aimilia-Christina Vagiona
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Reaz Uddin
  - Lab 103 PCMD ext., Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi 75270, Pakistan
- Miguel A. Andrade-Navarro
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany

7. Hyperbolic matrix factorization improves prediction of drug-target associations. Sci Rep 2023; 13:959. [PMID: 36653463 PMCID: PMC9849222 DOI: 10.1038/s41598-023-27995-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/24/2022] [Accepted: 01/11/2023] [Indexed: 01/19/2023] Open
Abstract
Past research in computational systems biology has focused more on the development and applications of advanced statistical and numerical optimization techniques and much less on understanding the geometry of the biological space. By representing biological entities as points in a low dimensional Euclidean space, state-of-the-art methods for drug-target interaction (DTI) prediction implicitly assume the flat geometry of the biological space. In contrast, recent theoretical studies suggest that biological systems exhibit tree-like topology with a high degree of clustering. As a consequence, embedding a biological system in a flat space leads to distortion of distances between biological objects. Here, we present a novel matrix factorization methodology for drug-target interaction prediction that uses hyperbolic space as the latent biological space. When benchmarked against classical, Euclidean methods, hyperbolic matrix factorization exhibits superior accuracy while lowering embedding dimension by an order of magnitude. We see this as additional evidence that the hyperbolic geometry underpins large biological networks.

8. Jiang H, Li L, Zeng Y, Fan J, Shen L. Low-Complexity Hyperbolic Embedding Schemes for Temporal Complex Networks. Sensors (Basel, Switzerland) 2022; 22:9306. [PMID: 36502008 PMCID: PMC9736245 DOI: 10.3390/s22239306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2022] [Revised: 11/23/2022] [Accepted: 11/23/2022] [Indexed: 06/17/2023]
Abstract
Hyperbolic embedding can effectively preserve the properties of complex networks. Though some state-of-the-art hyperbolic node embedding approaches have been proposed, most of them are still not well suited to the dynamic evolution process of temporal complex networks. Adapting to the scale of complex networks with moderate variation, and updating the embedding at low complexity, remain challenging problems. To tackle these challenges, we propose hyperbolic embedding schemes for the temporal complex network within two dynamic evolution processes. First, we propose a low-complexity hyperbolic embedding scheme using matrix perturbation, which is well suited to medium-scale complex networks with evolving temporal characteristics. Next, we construct the geometric initialization by merging nodes within the hyperbolic circular domain. To realize fast initialization for a large-scale network, an R tree is used to search the nodes to narrow down the search range. Our evaluations are implemented for both synthetic networks and realistic networks within different downstream applications. The results show that our hyperbolic embedding schemes have low complexity and are adaptable to networks with different scales for different downstream tasks.
Affiliation(s)
- Hao Jiang
  - School of Electronic Information, Wuhan University, Wuhan 430072, China
- Lixia Li
  - School of Electronic Information, Wuhan University, Wuhan 430072, China
  - Wuhan Digital Engineering Institute, Wuhan 430074, China
- Yuanyuan Zeng
  - School of Electronic Information, Wuhan University, Wuhan 430072, China
- Jiajun Fan
  - School of Electronic Information, Wuhan University, Wuhan 430072, China
- Lijuan Shen
  - School of Electronic Information, Wuhan University, Wuhan 430072, China

9. Almagro P, Boguñá M, Serrano MÁ. Detecting the ultra low dimensionality of real networks. Nat Commun 2022; 13:6096. [PMID: 36243754 PMCID: PMC9569339 DOI: 10.1038/s41467-022-33685-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2021] [Accepted: 09/27/2022] [Indexed: 12/24/2022] Open
Abstract
Reducing dimension redundancy to find simplifying patterns in high-dimensional datasets and complex networks has become a major endeavor in many scientific fields. However, detecting the dimensionality of their latent space is challenging but necessary to generate efficient embeddings to be used in a multitude of downstream tasks. Here, we propose a method to infer the dimensionality of networks without the need for any a priori spatial embedding. Due to the ability of hyperbolic geometry to capture the complex connectivity of real networks, we detect ultra low dimensionality far below values reported using other approaches. We applied our method to real networks from different domains and found unexpected regularities, including: tissue-specific biomolecular networks being extremely low dimensional; brain connectomes being close to the three dimensions of their anatomical embedding; and social networks and the Internet requiring slightly higher dimensionality. Beyond paving the way towards an ultra efficient dimensional reduction, our findings help address fundamental issues that hinge on dimensionality, such as universality in critical behavior.
Affiliation(s)
- Pedro Almagro
  - Departamento de Ciencias de la Computación e Inteligencia Artificial, Universidad de Sevilla, Sevilla, Spain
- Marián Boguñá
  - Departament de Física de la Matèria Condensada, Universitat de Barcelona, Martí i Franquès 1, 08028, Barcelona, Spain
  - Universitat de Barcelona Institute of Complex Systems (UBICS), Universitat de Barcelona, Barcelona, Spain
- M Ángeles Serrano
  - Departament de Física de la Matèria Condensada, Universitat de Barcelona, Martí i Franquès 1, 08028, Barcelona, Spain
  - Universitat de Barcelona Institute of Complex Systems (UBICS), Universitat de Barcelona, Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Pg. Lluís Companys 23, 08010, Barcelona, Spain

10. Joint Detection of Community and Structural Hole Spanner of Networks in Hyperbolic Space. Entropy 2022; 24:e24070894. [PMID: 35885117 PMCID: PMC9319712 DOI: 10.3390/e24070894] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2022] [Revised: 06/24/2022] [Accepted: 06/24/2022] [Indexed: 02/04/2023]
Abstract
Community detection and structural hole spanner (the node bridging different communities) identification, revealing the mesoscopic and microscopic structural properties of complex networks, have drawn much attention in recent years. As determinants of mesoscopic structure, communities and structural hole spanners reveal the clustering and hierarchy of networks, which have a key impact on transmission phenomena such as epidemic transmission and information diffusion. However, most existing studies address the two tasks independently, which ignores the structural correlation between mesoscale and microscale and suffers from high computational costs. In this article, we propose an algorithm for simultaneously detecting communities and structural hole spanners via hyperbolic embedding (SDHE). Specifically, we first embed networks into a hyperbolic plane, in which the angular distribution of the nodes reveals community structures of the embedded network. Then, we analyze the critical gap to detect communities and the angular region where structural hole spanners may exist. Finally, we identify structural hole spanners via two-step connectivity. Experimental results on synthetic networks and real networks demonstrate the effectiveness of our proposed algorithm compared with several state-of-the-art methods.
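The idea that angular gaps in a hyperbolic embedding separate communities can be sketched as follows. This is a simplified illustration, not the SDHE algorithm itself: it assumes the angular coordinates are already available, and it uses a fixed, hand-picked threshold (the hypothetical `gap_threshold` parameter) in place of the critical-gap analysis the abstract refers to. Structural hole spanner detection is not covered.

```python
import math


def communities_from_angles(angles, gap_threshold):
    """Group nodes by cutting the circle wherever an angular gap is large.

    angles maps each node to its angular coordinate in [0, 2*pi).
    gap_threshold is an assumed, user-chosen cut-off in radians.
    Returns a list of groups, each a list of nodes in angular order.
    """
    order = sorted(angles, key=angles.get)
    n = len(order)
    if n == 0:
        return []
    # gaps[i] is the gap from order[i] to the next node, wrapping around.
    gaps = [(angles[order[(i + 1) % n]] - angles[order[i]]) % (2 * math.pi)
            for i in range(n)]
    cuts = [i for i, gap in enumerate(gaps) if gap > gap_threshold]
    if not cuts:
        return [order]  # no gap is wide enough: a single community
    groups = []
    for k, cut in enumerate(cuts):
        # Each group runs from just after the previous cut up to this cut.
        i = (cuts[k - 1] + 1) % n
        group = [order[i]]
        while i != cut:
            i = (i + 1) % n
            group.append(order[i])
        groups.append(group)
    return groups
```

Because the circle wraps around, a group may span the 0 / 2π boundary, which is why the walk between cuts uses modular indexing rather than plain slicing.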

11. Vagiona AC, Mier P, Petrakis S, Andrade-Navarro MA. Analysis of Huntington's Disease Modifiers Using the Hyperbolic Mapping of the Protein Interaction Network. Int J Mol Sci 2022; 23:5853. [PMID: 35628660 PMCID: PMC9144261 DOI: 10.3390/ijms23105853] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/28/2022] [Revised: 05/19/2022] [Accepted: 05/19/2022] [Indexed: 02/05/2023] Open
Abstract
Huntington's disease (HD) is caused by the production of a mutant huntingtin (HTT) with an abnormally long poly-glutamine (polyQ) tract, forming aggregates and inclusions in neurons. Previous work by us and others has shown that an increase or decrease in polyQ-triggered aggregates can be passive simply due to the interaction of proteins with the aggregates. To search for proteins with active (functional) effects, which might be more effective in finding therapies and mechanisms of HD, we selected among the proteins that interact with HTT a total of 49 pairs of proteins that, while being paralogous to each other (and thus expected to have similar passive interaction with HTT), are located in different regions of the protein interaction network (suggesting participation in different pathways or complexes). Three of these 49 pairs contained members with opposite effects on HD, according to the literature. The negative members of the three pairs, MID1, IKBKG, and IKBKB, interact with PPP2CA and TUBB, which are known negative factors in HD, as well as with HSP90AA1 and RPS3. The positive members of the three pairs interact with HSPA9. Our results provide potential HD modifiers of functional relevance and reveal the dynamic aspect of paralog evolution within the interaction network.
Affiliation(s)
- Aimilia-Christina Vagiona
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Pablo Mier
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Spyros Petrakis
  - Institute of Applied Biosciences/Centre for Research and Technology Hellas, 57001 Thessaloniki, Greece
- Miguel A. Andrade-Navarro
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany

12. Ye D, Jiang H, Jiang Y, Wang Q, Hu Y. Community preserving mapping for network hyperbolic embedding. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]

13. Zhou M, Jin H, Wu Q, Xie H, Han Q. Betweenness centrality-based community adaptive network representation for link prediction. Appl Intell 2022. [DOI: 10.1007/s10489-021-02633-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]

14. Kovács B, Balogh SG, Palla G. Generalised popularity-similarity optimisation model for growing hyperbolic networks beyond two dimensions. Sci Rep 2022; 12:968. [PMID: 35046448 PMCID: PMC8770586 DOI: 10.1038/s41598-021-04379-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/25/2021] [Accepted: 12/14/2021] [Indexed: 11/22/2022] Open
Abstract
Hyperbolic network models have gained considerable attention in recent years, mainly due to their capability of explaining many peculiar features of real-world networks. One of the most widely known models of this type is the popularity-similarity optimisation (PSO) model, working in the native disk representation of the two-dimensional hyperbolic space and generating networks with small-world property, scale-free degree distribution, high clustering and strong community structure at the same time. With the motivation of better understanding hyperbolic random graphs, we hereby introduce the dPSO model, a generalisation of the PSO model to any arbitrary integer dimension [Formula: see text]. The analysis of the obtained networks shows that their major structural properties can be affected by the dimension of the underlying hyperbolic space in a non-trivial way. Our extended framework is not only interesting from a theoretical point of view but can also serve as a starting point for the generalisation of already existing two-dimensional hyperbolic embedding techniques.
Affiliation(s)
- Bianka Kovács
- Department of Biological Physics, Eötvös Loránd University, Pázmány P. stny. 1/A, 1117, Budapest, Hungary
| | - Sámuel G Balogh
- Department of Biological Physics, Eötvös Loránd University, Pázmány P. stny. 1/A, 1117, Budapest, Hungary
| | - Gergely Palla
- Department of Biological Physics, Eötvös Loránd University, Pázmány P. stny. 1/A, 1117, Budapest, Hungary
- MTA-ELTE Statistical and Biological Physics Research Group, Pázmány P. stny. 1/A, 1117, Budapest, Hungary
- Health Services Management Training Centre, Semmelweis University, 1125, Kútvölgyi út 2, Budapest, Hungary
|
15
|
Passino FS, Heard NA, Rubin-Delanchy P. Spectral Clustering on Spherical Coordinates Under the Degree-Corrected Stochastic Blockmodel. Technometrics 2022. [DOI: 10.1080/00401706.2021.2008503] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/19/2022]
|
16
|
Jhun B. Topological analysis of the latent geometry of a complex network. Chaos 2022; 32:013116. [PMID: 35105131 DOI: 10.1063/5.0073107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2021] [Accepted: 12/16/2021] [Indexed: 06/14/2023]
Abstract
Most real-world networks are embedded in latent geometries. If a node in a network is found in the vicinity of another node in the latent geometry, the two nodes have a disproportionately high probability of being connected by a link. The latent geometry of a complex network is a central topic of research in network science, which has an expansive range of practical applications, such as efficient navigation, missing link prediction, and brain mapping. Despite the important role of topology in the structures and functions of complex systems, little to no study has been conducted to develop a method to estimate the general unknown latent geometry of complex networks. Topological data analysis, which has attracted extensive attention in the research community owing to its convincing performance, can be directly implemented into complex networks; however, even a small fraction (0.1%) of long-range links can completely erase the topological signature of the latent geometry. Inspired by the fact that long-range links in a network have disproportionately high loads, we develop a set of methods that can analyze the latent geometry of a complex network: the modified persistent homology diagram and the map of the latent geometry. These methods successfully reveal the topological properties of the synthetic and empirical networks used to validate the proposed methods.
Affiliation(s)
- Bukyoung Jhun
- CCSS, CTP, and Department of Physics and Astronomy, Seoul National University, Seoul 08826, South Korea and Department of Physics, The University of Texas at Austin, Austin, Texas 78712, USA
|
17
|
Sandini C, Zöller D, Schneider M, Tarun A, Armondo M, Nelson B, Amminger PG, Yuen HP, Markulev C, Schäffer MR, Mossaheb N, Schlögelhofer M, Smesny S, Hickie IB, Berger GE, Chen EY, de Haan L, Nieman DH, Nordentoft M, Riecher-Rössler A, Verma S, Thompson A, Yung AR, McGorry PD, Van De Ville D, Eliez S. Characterization and prediction of clinical pathways of vulnerability to psychosis through graph signal processing. eLife 2021; 10:59811. [PMID: 34569937 PMCID: PMC8476129 DOI: 10.7554/elife.59811] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/09/2020] [Accepted: 09/09/2021] [Indexed: 11/21/2022] Open
Abstract
Causal interactions between specific psychiatric symptoms could contribute to the heterogeneous clinical trajectories observed in early psychopathology. Current diagnostic approaches merge clinical manifestations that co-occur across subjects and could significantly hinder our understanding of clinical pathways connecting individual symptoms. Network analysis techniques have emerged as alternative approaches that could help shed light on the complex dynamics of early psychopathology. The present study attempts to address the two main limitations that have, in our opinion, hindered the application of network approaches in the clinical setting. Firstly, we show that a multi-layer network analysis approach can move beyond a static view of psychopathology by providing an intuitive characterization of the role of specific symptoms in contributing to clinical trajectories over time. Secondly, we show that a Graph-Signal-Processing approach can exploit knowledge of longitudinal interactions between symptoms to predict clinical trajectories at the level of the individual. We test our approaches in two independent samples of individuals with genetic and clinical vulnerability for developing psychosis. Novel network approaches can help embrace the dynamic complexity of early psychopathology and pave the way towards a more personalized approach to clinical care.
Affiliation(s)
- Corrado Sandini
- Developmental Imaging and Psychopathology Laboratory, University of Geneva School of Medicine, Geneva, Switzerland
| | - Daniela Zöller
- Developmental Imaging and Psychopathology Laboratory, University of Geneva School of Medicine, Geneva, Switzerland; Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Maude Schneider
- Developmental Imaging and Psychopathology Laboratory, University of Geneva School of Medicine, Geneva, Switzerland; Center for Contextual Psychiatry, Research Group Psychiatry, Department of Neuroscience, KU Leuven, Leuven, Belgium
| | - Anjali Tarun
- Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Marco Armondo
- Developmental Imaging and Psychopathology Laboratory, University of Geneva School of Medicine, Geneva, Switzerland
| | - Barnaby Nelson
- Orygen, Parkville, Australia; The Centre for Youth Mental Health, The University of Melbourne, Melbourne, Australia
| | - Paul G Amminger
- Orygen, Parkville, Australia; The Centre for Youth Mental Health, The University of Melbourne, Melbourne, Australia; Department of Psychiatry and Psychotherapy, Clinical Division of Social Psychiatry, Medical University Vienna, Vienna, Austria
| | - Hok Pan Yuen
- Orygen, Parkville, Australia; The Centre for Youth Mental Health, The University of Melbourne, Melbourne, Australia
| | - Connie Markulev
- Orygen, Parkville, Australia; The Centre for Youth Mental Health, The University of Melbourne, Melbourne, Australia
| | - Monica R Schäffer
- The Centre for Youth Mental Health, The University of Melbourne, Melbourne, Australia; Department of Psychiatry and Psychotherapy, Clinical Division of Social Psychiatry, Medical University Vienna, Vienna, Austria
| | - Nilufar Mossaheb
- Department of Psychiatry and Psychotherapy, Clinical Division of Social Psychiatry, Medical University Vienna, Vienna, Austria
| | - Monika Schlögelhofer
- Department of Psychiatry and Psychotherapy, Clinical Division of Social Psychiatry, Medical University Vienna, Vienna, Austria
| | - Stefan Smesny
- Department of Psychiatry and Psychotherapy, Clinical Division of Social Psychiatry, Medical University Vienna, Vienna, Austria
| | - Ian B Hickie
- Department of Psychiatry, University Hospital Jena, Jena, Germany
| | | | - Eric Yh Chen
- Child and Adolescent Psychiatric Service of the Canton of Zurich, Zurich, Switzerland
| | - Lieuwe de Haan
- Department of Psychiatry, University of Hong Kong, Hong Kong, China
| | - Dorien H Nieman
- Department of Psychiatry, Amsterdam University Medical Centers, Amsterdam, Netherlands
| | | | | | - Swapna Verma
- Institute of Mental Health, Singapore, Singapore
| | - Andrew Thompson
- Orygen, Parkville, Australia; The Centre for Youth Mental Health, The University of Melbourne, Melbourne, Australia; Division of Mental Health and Wellbeing, Warwick Medical School, University of Warwick, Coventry, United Kingdom; North Warwickshire Early Intervention in Psychosis Service, Coventry and Warwickshire National Health Service Partnership Trust, Coventry, United Kingdom
| | - Alison Ruth Yung
- Orygen, Parkville, Australia; The Centre for Youth Mental Health, The University of Melbourne, Melbourne, Australia; Division of Psychology and Mental Health, University of Manchester, Manchester, United Kingdom; Greater Manchester Mental Health NHS Foundation Trust, Manchester, United Kingdom
| | - Patrick D McGorry
- Orygen, Parkville, Australia; The Centre for Youth Mental Health, The University of Melbourne, Melbourne, Australia
| | - Dimitri Van De Ville
- Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
| | - Stephan Eliez
- Developmental Imaging and Psychopathology Laboratory, University of Geneva School of Medicine, Geneva, Switzerland; Department of Genetic Medicine and Development, University of Geneva School of Medicine, Geneva, Switzerland
|
18
|
Abstract
A remarkable approach for grasping the relevant statistical features of real networks with the help of random graphs is offered by hyperbolic models, centred around the idea of placing nodes in a low-dimensional hyperbolic space, and connecting node pairs with a probability depending on the hyperbolic distance. It is widely appreciated that these models can generate random graphs that are small-world, highly clustered and scale-free at the same time; thus, reproducing the most fundamental common features of real networks. In the present work, we focus on a less well-known property of the popularity-similarity optimisation model and the $\mathbb{S}^1/\mathbb{H}^2$ model from this model family, namely that the networks generated by these approaches also contain communities for a wide range of the parameters, which was certainly not an intention at the design of the models. We extracted the communities from the studied networks using well-established community finding methods such as Louvain, Infomap and label propagation. The observed high modularity values indicate that the community structure can become very pronounced under certain conditions. In addition, the modules found by the different algorithms show good consistency, implying that these are indeed relevant and apparent structural units. Since the appearance of communities is rather common in networks representing real systems as well, this feature of hyperbolic models makes them even more suitable for describing real networks than thought before.
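The modularity values mentioned in the abstract above can be illustrated with a minimal sketch. It computes the standard Newman modularity Q = Σ_c [L_c/m − (d_c/2m)²] for a partition that is already given; it does not implement Louvain, Infomap or label propagation. The function name `modularity`, the toy edge list and the community labels are assumptions made for illustration and do not come from the cited work.

```python
def modularity(edges, community):
    """Newman modularity of a given partition of an undirected, unweighted graph.

    edges:     list of (u, v) pairs, each undirected edge listed once
    community: dict mapping node -> community label
    """
    m = len(edges)
    intra = {}       # number of edges with both ends inside community c
    degree_sum = {}  # sum of node degrees inside community c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


# Hypothetical toy graph: two triangles joined by a single bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_partition = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

q = modularity(toy_edges, toy_partition)
print(f"modularity of the two-triangle partition: {q:.4f}")
```

For this toy partition each community holds 3 of the 7 edges and a degree sum of 7, so Q = 2 × (3/7 − 1/4) = 5/14. Placing all nodes in one community gives Q = 0, which is why clearly positive values are read as evidence of community structure.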
|
19
|
Wang L, Huang C, Ma W, Liu R, Vosoughi S. Hyperbolic node embedding for temporal networks. Data Min Knowl Discov 2021. [DOI: 10.1007/s10618-021-00774-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
|
20
|
Montero M. Random Walks with Invariant Loop Probabilities: Stereographic Random Walks. Entropy 2021; 23:e23060729. [PMID: 34201220 PMCID: PMC8228639 DOI: 10.3390/e23060729] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/11/2021] [Revised: 06/01/2021] [Accepted: 06/03/2021] [Indexed: 11/21/2022]
Abstract
Random walks with invariant loop probabilities comprise a wide family of Markov processes with site-dependent, one-step transition probabilities. The whole family, which includes the simple random walk, emerges from geometric considerations related to the stereographic projection of an underlying geometry into a line. After a general introduction, we focus our attention on the elliptic case: random walks on a circle with built-in reflecting boundaries.
Affiliation(s)
- Miquel Montero
- Departament de Física de la Matèria Condensada, Universitat de Barcelona (UB), Martí i Franquès 1, E-08028 Barcelona, Spain
- Universitat de Barcelona Institute of Complex Systems (UBICS), Martí i Franquès 1, E-08028 Barcelona, Spain
|
21
|
Kovács B, Palla G. Optimisation of the coalescent hyperbolic embedding of complex networks. Sci Rep 2021; 11:8350. [PMID: 33863973 PMCID: PMC8052422 DOI: 10.1038/s41598-021-87333-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/08/2020] [Accepted: 03/04/2021] [Indexed: 12/18/2022] Open
Abstract
Several observations indicate the existence of a latent hyperbolic space behind real networks that makes their structure very intuitive in the sense that the probability of a connection decreases with the hyperbolic distance between the nodes. A remarkable network model generating random graphs along this line is the popularity-similarity optimisation (PSO) model, offering a scale-free degree distribution, high clustering and the small-world property at the same time. These results provide a strong motivation for the development of hyperbolic embedding algorithms that tackle the problem of finding the optimal hyperbolic coordinates of the nodes based on the network structure. A very promising recent approach for hyperbolic embedding is provided by the noncentered minimum curvilinear embedding (ncMCE) method, belonging to the family of coalescent embedding algorithms. This approach offers a high-quality embedding at a low running time. In the present work we propose a further optimisation of the angular coordinates in this framework that seems to reduce the logarithmic loss and increase the greedy routing score of the embedding compared to the original version, thereby adding an extra improvement to the quality of the inferred hyperbolic coordinates.
Affiliation(s)
- Bianka Kovács
- Department of Biological Physics, Eötvös Loránd University, Pázmány P. stny. 1/A, Budapest, 1117, Hungary
| | - Gergely Palla
- Department of Biological Physics, Eötvös Loránd University, Pázmány P. stny. 1/A, Budapest, 1117, Hungary
- MTA-ELTE Statistical and Biological Physics Research Group, Pázmány P. stny. 1/A, Budapest, 1117, Hungary
- Health Services Management Training Centre, Semmelweis University, Kútvölgyi út 2, Budapest, 1125, Hungary
|
22
|
Makarov I, Kiselev D, Nikitinsky N, Subelj L. Survey on graph embeddings and their applications to machine learning problems on graphs. PeerJ Comput Sci 2021; 7:e357. [PMID: 33817007 PMCID: PMC7959646 DOI: 10.7717/peerj-cs.357] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 07/21/2020] [Accepted: 12/18/2020] [Indexed: 05/13/2023]
Abstract
Dealing with relational data has always required significant computational resources, domain expertise and task-dependent feature engineering to incorporate structural information into a predictive model. Nowadays, a family of automated graph feature engineering techniques has been proposed in different streams of literature. So-called graph embeddings provide a powerful tool to construct vectorized feature spaces for graphs and their components, such as nodes, edges and subgraphs, while preserving inherent graph properties. Using the constructed feature spaces, many machine learning problems on graphs can be solved via standard frameworks suitable for vectorized feature representation. Our survey aims to describe the core concepts of graph embeddings and provide several taxonomies for their description. First, we start with the methodological approach and extract three types of graph embedding models based on matrix factorization, random walks and deep learning approaches. Next, we describe how different types of networks impact the ability of models to incorporate structural and attributed data into a unified embedding. Going further, we perform a thorough evaluation of graph embedding applications to machine learning problems on graphs, among which are node classification, link prediction, clustering, visualization, compression, and a family of whole-graph embedding algorithms suitable for graph classification, similarity and alignment problems. Finally, we overview the existing applications of graph embeddings to computer science domains, formulate open problems and provide experimental results, explaining how different network properties affect graph embedding quality in four classic machine learning problems on graphs: node classification, link prediction, clustering and graph visualization.
As a result, our survey covers a new, rapidly growing field of network feature engineering, presents an in-depth analysis of models based on network types, and overviews a wide range of applications to machine learning problems on graphs.
Affiliation(s)
- Ilya Makarov
- HSE University, Moscow, Russia
- Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
| | | | - Nikita Nikitinsky
- Big Data Research Center, National University of Science and Technology MISIS, Moscow, Russia
| | - Lovro Subelj
- Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
|
23
|
Reducing the complexity of financial networks using network embeddings. Sci Rep 2020; 10:17045. [PMID: 33046815 PMCID: PMC7550348 DOI: 10.1038/s41598-020-74010-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/31/2020] [Accepted: 09/23/2020] [Indexed: 01/31/2023] Open
Abstract
Accounting scandals like Enron (2001) and Petrobras (2014) remind us that untrustworthy financial information has an adverse effect on the stability of the economy and can ultimately be a source of systemic risk. This financial information is derived from processes and their related monetary flows within a business. But as the flows are becoming larger and more complex, it becomes increasingly difficult to distill the primary processes for large amounts of transaction data. However, by extracting the primary processes we will be able to detect possible inconsistencies in the information efficiently. We use recent advances in network embedding techniques that have demonstrated promising results regarding node classification problems in domains like biology and sociology. We learned a useful continuous vector representation of the nodes in the network which can be used for the clustering task, such that the clusters represent the meaningful primary processes. The results show that we can extract the relevant primary processes, which are similar to the clusters created by a financial expert. Moreover, we construct better predictive models using the flows from the extracted primary processes, which can be used to detect inconsistencies. Our work will pave the way towards a more modern, technology- and data-driven financial audit discipline.
|
24
|
Abstract
Maritime transport accounts for over 80% of the world trade volume and is the backbone of the global economy. Global supply chains create a complex network of trade flows. The structure of this network impacts not only the socioeconomic development of the concerned regions but also their ecosystems. The movements of ships are a considerable source of CO2 emissions and contribute to climate change. In the wake of the announced development of Arctic shipping, the need to understand the behavior of the maritime trade network and to predict future trade flows becomes pressing. We use a unique database of daily movements of the world fleet over the period 1977-2008 and apply machine learning techniques on network data to develop models for predicting the opening of new shipping lines and for forecasting trade volume on links. We find that the evolution of this system is governed by a simple rule from network science, relying on the number of common neighbors between pairs of ports. This finding is consistent over all three decades of temporal data. We further confirm it with a natural experiment, involving traffic redirection from the port of Kobe after the 1995 earthquake. Our forecasting method enables researchers and industry to easily model effects of potential future scenarios at the level of ports, regions, and the world. Our results also indicate that maritime trade flows follow a form of random walk on the underlying network structure of sea connections, highlighting its pivotal role in the development of maritime trade.
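The common-neighbour rule described in the abstract above can be sketched as a simple link-prediction score: every pair of nodes that is not yet connected is ranked by the number of neighbours the two nodes share. The function name `common_neighbor_scores` and the five-port toy network with ports labelled A to E are hypothetical and chosen only for illustration; they are not the authors' data or code.

```python
from itertools import combinations


def common_neighbor_scores(adjacency):
    """Rank unconnected node pairs by their number of shared neighbours.

    adjacency: dict mapping node -> set of neighbouring nodes (undirected)
    Returns a list of ((u, v), score), highest score first, ties by name.
    """
    scores = {}
    for u, v in combinations(sorted(adjacency), 2):
        if v in adjacency[u]:
            continue  # already linked, nothing to predict
        scores[(u, v)] = len(adjacency[u] & adjacency[v])
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy network of five ports (illustrative only).
toy_network = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"A", "C", "E"},
    "E": {"D"},
}

ranked = common_neighbor_scores(toy_network)
for pair, score in ranked:
    print(pair, score)
```

In this toy network the pair (B, D) shares two neighbours (A and C) and is therefore ranked as the most likely new link, ahead of (A, E) and (C, E) with one shared neighbour each.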
|
25
|
Cho H, DeMeo B, Peng J, Berger B. Large-Margin Classification in Hyperbolic Space. Proc Mach Learn Res 2019; 89:1832-1840. [PMID: 32832915 PMCID: PMC7434093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Representing data in hyperbolic space can effectively capture latent hierarchical relationships. To enable accurate classification of points in hyperbolic space while respecting their hyperbolic geometry, we introduce hyperbolic SVM, a hyperbolic formulation of support vector machine classifiers, and describe its theoretical connection to the Euclidean counterpart. We also generalize Euclidean kernel SVM to hyperbolic space, allowing nonlinear hyperbolic decision boundaries and providing a geometric interpretation for a certain class of indefinite kernels. Hyperbolic SVM improves classification accuracy in simulation and in real-world problems involving complex networks and word embeddings. Our work enables end-to-end analyses based on the inherent hyperbolic geometry of the data without resorting to ill-fitting tools developed for Euclidean space.
Affiliation(s)
| | | | - Jian Peng
- University of Illinois at Urbana-Champaign
|
26
|
Faqeeh A, Osat S, Radicchi F. Characterizing the Analogy Between Hyperbolic Embedding and Community Structure of Complex Networks. Phys Rev Lett 2018; 121:098301. [PMID: 30230906 DOI: 10.1103/physrevlett.121.098301] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 04/09/2018] [Revised: 05/21/2018] [Indexed: 06/08/2023]
Abstract
We show that the community structure of a network can be used as a coarse version of its embedding in a hidden space with hyperbolic geometry. The finding emerges from a systematic analysis of several real-world and synthetic networks. We take advantage of the analogy for reinterpreting results originally obtained through network hyperbolic embedding in terms of community structure only. First, we show that the robustness of a multiplex network can be controlled by tuning the correlation between the community structures across different layers. Second, we deploy an efficient greedy protocol for network navigability that makes use of routing tables based on community structure.
Affiliation(s)
- Ali Faqeeh
- MACSI, Department of Mathematics and Statistics, University of Limerick, Limerick V94 T9PX, Ireland
- Center for Complex Networks and Systems Research, School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana 47408, USA
| | - Saeed Osat
- Quantum Complexity Science Initiative, Skolkovo Institute of Science and Technology, Skoltech Building 3, Moscow 143026, Russia
| | - Filippo Radicchi
- Center for Complex Networks and Systems Research, School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana 47408, USA
|
27
|
Alanis-Lobato G, Mier P, Andrade-Navarro M. The latent geometry of the human protein interaction network. Bioinformatics 2018; 34:2826-2834. [PMID: 29635317 PMCID: PMC6084611 DOI: 10.1093/bioinformatics/bty206] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 12/08/2017] [Revised: 02/16/2018] [Accepted: 04/03/2018] [Indexed: 11/21/2022] Open
Abstract
Motivation: A series of recently introduced algorithms and models advocates for the existence of a hyperbolic geometry underlying the network representation of complex systems. Since the human protein interaction network (hPIN) has a complex architecture, we hypothesized that uncovering its latent geometry could ease challenging problems in systems biology, translating them into measuring distances between proteins. Results: We embedded the hPIN into hyperbolic space and found that the inferred coordinates of nodes capture biologically relevant features, like protein age, function and cellular localization. This means that the representation of the hPIN in the two-dimensional hyperbolic plane offers a novel and informative way to visualize proteins and their interactions. We then used these coordinates to compute hyperbolic distances between proteins, which served as likelihood scores for the prediction of plausible protein interactions. Finally, we observed that proteins can efficiently communicate with each other via a greedy routing process, guided by the latent geometry of the hPIN. We show that these efficient communication channels can be used to determine the core members of signal transduction pathways and to study how system perturbations impact their efficiency. Availability and implementation: An R implementation of our network embedder is available at https://github.com/galanisl/NetHypGeom. Also, a web tool for the geometric analysis of the hPIN accompanies this text at http://cbdm-01.zdv.uni-mainz.de/~galanisl/gapi. Supplementary information: Supplementary data are available at Bioinformatics online.
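The two geometric operations named in the abstract above, hyperbolic distances between embedded nodes and greedy routing guided by them, can be sketched as follows. The distance uses the standard hyperbolic law of cosines for polar coordinates (r, θ) at curvature −1. The function names, the coordinates and the five-node toy graph are assumptions for illustration only; this is not the NetHypGeom API, and the coordinates are not inferred from any real network.

```python
import math


def hyperbolic_distance(p, q):
    """Distance between points p=(r1, theta1) and q=(r2, theta2) in the
    native disk representation of the hyperbolic plane (curvature -1)."""
    (r1, t1), (r2, t2) = p, q
    # Angular separation folded into [0, pi].
    dtheta = math.pi - abs(math.pi - abs(t1 - t2) % (2 * math.pi))
    if dtheta == 0.0:
        return abs(r1 - r2)  # same ray: purely radial distance
    arg = (math.cosh(r1) * math.cosh(r2)
           - math.sinh(r1) * math.sinh(r2) * math.cos(dtheta))
    return math.acosh(max(arg, 1.0))  # clamp guards against rounding below 1


def greedy_route(graph, coords, source, target, max_hops=50):
    """Forward to the neighbour closest to the target in hyperbolic space.
    Returns the path as a list, or None if the walk revisits a node
    or exceeds max_hops."""
    path = [source]
    current = source
    while current != target and len(path) <= max_hops:
        nxt = min(graph[current],
                  key=lambda n: hyperbolic_distance(coords[n], coords[target]))
        if nxt in path:
            return None  # loop detected: routing fails
        path.append(nxt)
        current = nxt
    return path if current == target else None


# Hypothetical toy embedding: a central hub A and two peripheral branches.
toy_coords = {"A": (0.5, 0.0), "B": (2.0, 0.3), "C": (2.0, 2.5),
              "D": (3.0, 0.4), "E": (3.0, 2.6)}
toy_graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "E"],
             "D": ["B"], "E": ["C"]}

print(greedy_route(toy_graph, toy_coords, "D", "E"))
```

In the toy example a message from D to E travels through the central node A, since at each hop the neighbour nearest to E in hyperbolic distance is selected. Smaller distances between unconnected nodes could likewise be read as higher likelihood scores for a candidate link, in the spirit of the abstract.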
Affiliation(s)
- Gregorio Alanis-Lobato
- Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg Universität, Mainz, Germany
- Institute of Molecular Biology, Mainz, Germany
| | - Pablo Mier
- Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg Universität, Mainz, Germany
- Institute of Molecular Biology, Mainz, Germany
| | - Miguel Andrade-Navarro
- Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg Universität, Mainz, Germany
- Institute of Molecular Biology, Mainz, Germany
|
28
|
Härtner F, Andrade-Navarro MA, Alanis-Lobato G. Geometric characterisation of disease modules. Appl Netw Sci 2018; 3:10. [PMID: 30839777 PMCID: PMC6214295 DOI: 10.1007/s41109-018-0066-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 03/30/2018] [Accepted: 05/28/2018] [Indexed: 05/07/2023]
Abstract
There is an increasing accumulation of evidence supporting the existence of a hyperbolic geometry underlying the network representation of complex systems. In particular, it has been shown that the latent geometry of the human protein interaction network (hPIN) captures biologically relevant information, leading to a meaningful visual representation of protein-protein interactions and translating challenging systems biology problems into measuring distances between proteins. Moreover, proteins can efficiently communicate with each other, without global knowledge of the hPIN structure, via a greedy routing (GR) process in which hyperbolic distances guide biological signals from source to target proteins. It is thanks to this effective information routing throughout the hPIN that the cell operates, communicates with other cells and reacts to environmental changes. As a result, the malfunction of one or a few members of this intricate system can disturb its dynamics and lead to disease phenotypes. In fact, it is known that the proteins associated with a single disease agglomerate non-randomly in the same region of the hPIN, forming one or several connected components known as the disease module (DM). Here, we present a geometric characterisation of DMs. First, we found that DM positions on the two-dimensional hyperbolic plane reflect their fragmentation and functional heterogeneity, rendering an informative picture of the cellular processes that the disease is affecting. Second, we used a distance-based dissimilarity measure to cluster DMs with shared clinical features. Finally, we took advantage of the GR strategy to study how defective proteins affect the transduction of signals throughout the hPIN.
Affiliation(s)
- Franziska Härtner
- Faculty for Physics, Mathematics and Computer Science, Johannes Gutenberg Universität, Institute of Computer Science, Staudingerweg 7, Mainz, 55128 Germany
| | - Miguel A. Andrade-Navarro
- Faculty of Biology, Johannes Gutenberg Universität, Institute of Molecular Biology, Ackermannweg 4, Mainz, 55128 Germany
| | - Gregorio Alanis-Lobato
- Faculty of Biology, Johannes Gutenberg Universität, Institute of Molecular Biology, Ackermannweg 4, Mainz, 55128 Germany
|
29
|
Studies in the extensively automatic construction of large odds-based inference networks from structured data. Examples from medical, bioinformatics, and health insurance claims data. Comput Biol Med 2018; 95:147-166. [PMID: 29500985 DOI: 10.1016/j.compbiomed.2018.02.013] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 01/01/2018] [Revised: 02/19/2018] [Accepted: 02/19/2018] [Indexed: 12/11/2022]
Abstract
Theoretical and methodological principles are presented for the construction of very large inference nets for odds calculations, composed of hundreds to many thousands of elements or more, generated in this paper by structured data mining. It is argued that the usual small inference nets can sometimes represent rather simple, arbitrary estimates. Examples of applications in clinical and public health data analysis, medical claims data and detection of irregular entries, and bioinformatics data, are presented. Construction of large nets benefits from application of a theory of expected information for sparse data and the Dirac notation and algebra. The extent to which these are important here is briefly discussed. Purposes of the study include (a) exploration of the properties of large inference nets and of perturbation and tacit conditionality models, (b) using these to propose simpler models including one that a physician could use routinely, analogous to a "risk score", (c) examination of the merit of describing optimal performance in a single measure that combines accuracy, specificity, and sensitivity in place of a ROC curve, and (d) relationship to methods for detecting anomalous and potentially fraudulent data.
|
30
|
Weng T, Zhang J, Khajehnejad M, Small M, Zheng R, Hui P. Navigation by anomalous random walks on complex networks. Sci Rep 2016; 6:37547. [PMID: 27876855 PMCID: PMC5120342 DOI: 10.1038/srep37547] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 02/23/2016] [Accepted: 11/01/2016] [Indexed: 11/09/2022] Open
Abstract
Anomalous random walks having long-range jumps are a critical branch of dynamical processes on networks, which can model a number of search and transport processes. However, traditional measurements based on mean first passage time are not useful as they fail to characterize the cost associated with each jump. Here we introduce a new concept of mean first traverse distance (MFTD) to characterize anomalous random walks that represents the expected traverse distance taken by walkers searching from source node to target node, and we provide a procedure for calculating the MFTD between two nodes. We use Lévy walks on networks as an example, and demonstrate that the proposed approach can unravel the interplay between diffusion dynamics of Lévy walks and the underlying network structure. Moreover, applying our framework to the famous PageRank search, we show how to inform the optimality of the PageRank search. The framework for analyzing anomalous random walks on complex networks offers a useful new paradigm to understand the dynamics of anomalous diffusion processes, and provides a unified scheme to characterize search and transport processes on networks.
Affiliation(s)
- Tongfeng Weng
  - HKUST-DT System and Media Laboratory, Hong Kong University of Science and Technology, Hong Kong
- Jie Zhang
  - Centre for Computational Systems Biology, Fudan University, China
- Moein Khajehnejad
  - HKUST-DT System and Media Laboratory, Hong Kong University of Science and Technology, Hong Kong
- Michael Small
  - The University of Western Australia, Crawley, WA 6009, Australia
  - Mineral Resources, CSIRO, Kensington, WA, Australia
- Rui Zheng
  - HKUST-DT System and Media Laboratory, Hong Kong University of Science and Technology, Hong Kong
- Pan Hui
  - HKUST-DT System and Media Laboratory, Hong Kong University of Science and Technology, Hong Kong
|
31
|
Alanis-Lobato G, Mier P, Andrade-Navarro MA. Manifold learning and maximum likelihood estimation for hyperbolic network embedding. Applied Network Science 2016; 1:10. [PMID: 30533502 PMCID: PMC6245200 DOI: 10.1007/s41109-016-0013-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 09/01/2016] [Accepted: 10/25/2016] [Indexed: 05/23/2023]
Abstract
The Popularity-Similarity (PS) model sustains that clustering and hierarchy, properties common to most networks representing complex systems, are the result of an optimisation process in which nodes seek to form ties, not only with the most connected (popular) system components, but also with those that are similar to them. This model has a geometric interpretation in hyperbolic space, where distances between nodes abstract popularity-similarity trade-offs and the formation of scale-free and strongly clustered networks can be accurately described. Current methods for mapping networks to hyperbolic space are based on maximum likelihood estimations or manifold learning. The former approach is very accurate but slow; the latter improves efficiency at the cost of accuracy. Here, we analyse the strengths and limitations of both strategies and assess the advantages of combining them to efficiently embed big networks, allowing for their examination from a geometric perspective. Our evaluations in artificial and real networks support the idea that hyperbolic distance constraints play a significant role in the formation of edges between nodes. This means that challenging problems in network science, like link prediction or community detection, could be more easily addressed under this geometric framework.
Affiliation(s)
- Gregorio Alanis-Lobato
  - Institute of Molecular Biology, Ackermannweg 4, Mainz, 55128 Germany
  - Faculty of Biology, Johannes Gutenberg Universität, Gresemundweg 2, Mainz, 55128 Germany
- Pablo Mier
  - Institute of Molecular Biology, Ackermannweg 4, Mainz, 55128 Germany
  - Faculty of Biology, Johannes Gutenberg Universität, Gresemundweg 2, Mainz, 55128 Germany
- Miguel A. Andrade-Navarro
  - Institute of Molecular Biology, Ackermannweg 4, Mainz, 55128 Germany
  - Faculty of Biology, Johannes Gutenberg Universität, Gresemundweg 2, Mainz, 55128 Germany
|