1
Rocher L, Hendrickx JM, de Montjoye YA. A scaling law to model the effectiveness of identification techniques. Nat Commun 2025; 16:347. [PMID: 39788959 PMCID: PMC11718298 DOI: 10.1038/s41467-024-55296-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2023] [Accepted: 12/06/2024] [Indexed: 01/12/2025] Open
Abstract
AI techniques are increasingly being used to identify individuals both offline and online. However, quantifying their effectiveness at scale and, by extension, the risks they pose remains a significant challenge. Here, we propose a two-parameter Bayesian model for exact matching techniques and derive an analytical expression for correctness (κ), the fraction of people accurately identified in a population. We then generalize the model to forecast how κ scales from small-scale experiments to the real world, for exact, sparse, and machine learning-based robust identification techniques. Despite having only two degrees of freedom, our method closely fits 476 correctness curves and strongly outperforms curve-fitting methods and entropy-based rules of thumb. Our work provides a principled framework for forecasting the privacy risks posed by identification techniques, while also supporting independent accountability efforts for AI-based biometric systems.
Affiliation(s)
- Luc Rocher
  - Oxford Internet Institute, University of Oxford, Oxford, UK.
  - Information and Communication Technologies, Electronics and Applied Mathematics (ICTEAM), Université catholique de Louvain, Louvain-la-Neuve, Belgium.
  - Data Science Institute, Imperial College London, London, UK.
- Julien M Hendrickx
  - Information and Communication Technologies, Electronics and Applied Mathematics (ICTEAM), Université catholique de Louvain, Louvain-la-Neuve, Belgium
- Yves-Alexandre de Montjoye
  - Data Science Institute, Imperial College London, London, UK.
  - Department of Computing, Imperial College London, London, UK.
2
Sun H, Panda RK, Verdel R, Rodriguez A, Dalmonte M, Bianconi G. Network science: Ising states of matter. Phys Rev E 2024; 109:054305. [PMID: 38907445 DOI: 10.1103/physreve.109.054305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 02/26/2024] [Indexed: 06/24/2024]
Abstract
Network science provides very powerful tools for extracting information from interacting data. Although the unsupervised detection of phases of matter using machine learning has recently attracted significant interest, the full predictive power of network science has not yet been systematically explored in this context. Here we fill this gap by providing an in-depth statistical, combinatorial, geometrical, and topological characterization of 2D Ising snapshot networks (IsingNets) extracted from Monte Carlo simulations of the 2D Ising model at different temperatures, going across the phase transition. Our analysis reveals the complex organization properties of IsingNets in both the ferromagnetic and paramagnetic phases and demonstrates the significant deviations of the IsingNets with respect to randomized null models. In particular, percolation properties of the IsingNets reflect the existence of the symmetry between configurations with opposite magnetization below the critical temperature and the very compact nature of the two emerging giant clusters revealed by our persistent homology analysis of the IsingNets. Moreover, the IsingNets display a very broad degree distribution and significant degree-degree correlations and weight-degree correlations, demonstrating that they encode relevant information present in the configuration space of the 2D Ising model. The geometrical organization of the critical IsingNets is reflected in their spectral properties, which deviate from those of the null model. This work reveals the important insights that network science can bring to the characterization of phases of matter. The set of tools described here can also be applied to numerical and experimental data.
Affiliation(s)
- Hanlin Sun
  - School of Mathematical Sciences, Queen Mary University of London, London E1 4NS, United Kingdom
  - Nordita, KTH Royal Institute of Technology and Stockholm University, Hannes Alfvéns väg 12, SE-106 91 Stockholm, Sweden
- Rajat Kumar Panda
  - The Abdus Salam International Centre for Theoretical Physics (ICTP), Strada Costiera 11, 34151 Trieste, Italy
  - SISSA-International School of Advanced Studies, via Bonomea 265, 34136 Trieste, Italy
  - INFN Sezione di Trieste, Via Valerio 2, 34127 Trieste, Italy
  - Department of Physics, University of Trieste, 34127 Trieste, Italy
- Roberto Verdel
  - The Abdus Salam International Centre for Theoretical Physics (ICTP), Strada Costiera 11, 34151 Trieste, Italy
- Alex Rodriguez
  - The Abdus Salam International Centre for Theoretical Physics (ICTP), Strada Costiera 11, 34151 Trieste, Italy
  - Dipartimento di Matematica e Geoscienze, Università degli Studi di Trieste, via Alfonso Valerio 12/1, 34127 Trieste, Italy
- Marcello Dalmonte
  - The Abdus Salam International Centre for Theoretical Physics (ICTP), Strada Costiera 11, 34151 Trieste, Italy
  - SISSA-International School of Advanced Studies, via Bonomea 265, 34136 Trieste, Italy
- Ginestra Bianconi
  - School of Mathematical Sciences, Queen Mary University of London, London E1 4NS, United Kingdom
  - The Alan Turing Institute, 96 Euston Road, London NW1 2DB, United Kingdom
3
Han Z, Liu L, Wang X, Hao Y, Zheng H, Tang S, Zheng Z. Probabilistic activity driven model of temporal simplicial networks and its application on higher-order dynamics. Chaos 2024; 34:023137. [PMID: 38407398 DOI: 10.1063/5.0167123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Accepted: 01/27/2024] [Indexed: 02/27/2024]
Abstract
Network modeling characterizes the underlying principles of structural properties and is of vital significance for simulating dynamical processes in the real world. However, bridging structure and dynamics is always challenging due to the multiple complexities of real systems. Here, by introducing the individual's activity rate and the possibility of group interaction, we propose a probabilistic activity-driven (PAD) model that can generate temporal higher-order networks with both power-law and high-clustering characteristics, which successfully links the two most critical structural features and a basic dynamical pattern in extensive complex systems. Surprisingly, the power-law exponents and the clustering coefficients of the aggregated PAD network can be tuned over a wide range by altering a set of model parameters. We further provide an approximation algorithm to select the proper parameters that can generate networks with given structural properties, the effectiveness of which is verified by fitting various real-world networks. Finally, we construct the co-evolution framework of the PAD model and higher-order contagion dynamics and derive the critical conditions for phase transition and bistable phenomena using theoretical and numerical methods. Results show that the tendency to participate in higher-order interactions can promote the emergence of bistability but delay the outbreak under heterogeneous activity rates. Our model provides a basic tool to reproduce complex structural properties and to study the widespread higher-order dynamics, which has great potential for applications across fields.
Affiliation(s)
- Zhihao Han
  - Institute of Artificial Intelligence, Beihang University, Beijing 100191, China
  - Key laboratory of Mathematics, Informatics and Behavioral Semantics (LMIB), Beihang University, Beijing 100191, China
- Longzhao Liu
  - Institute of Artificial Intelligence, Beihang University, Beijing 100191, China
  - Key laboratory of Mathematics, Informatics and Behavioral Semantics (LMIB), Beihang University, Beijing 100191, China
  - State Key Lab of Software Development Environment (NLSDE), Beihang University, Beijing 100191, China
  - Zhongguancun Laboratory, Beijing 100094, People's Republic of China
  - Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing, Beihang University, Beijing 100191, China
  - PengCheng Laboratory, Shenzhen 518055, China
- Xin Wang
  - Institute of Artificial Intelligence, Beihang University, Beijing 100191, China
  - Key laboratory of Mathematics, Informatics and Behavioral Semantics (LMIB), Beihang University, Beijing 100191, China
  - State Key Lab of Software Development Environment (NLSDE), Beihang University, Beijing 100191, China
  - Zhongguancun Laboratory, Beijing 100094, People's Republic of China
  - Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing, Beihang University, Beijing 100191, China
  - PengCheng Laboratory, Shenzhen 518055, China
- Yajing Hao
  - Key laboratory of Mathematics, Informatics and Behavioral Semantics (LMIB), Beihang University, Beijing 100191, China
  - School of Mathematical Sciences, Beihang University, Beijing 100191, China
- Hongwei Zheng
  - Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing, Beihang University, Beijing 100191, China
  - Beijing Academy of Blockchain and Edge Computing (BABEC), Beijing 100085, China
- Shaoting Tang
  - Institute of Artificial Intelligence, Beihang University, Beijing 100191, China
  - Key laboratory of Mathematics, Informatics and Behavioral Semantics (LMIB), Beihang University, Beijing 100191, China
  - State Key Lab of Software Development Environment (NLSDE), Beihang University, Beijing 100191, China
  - Zhongguancun Laboratory, Beijing 100094, People's Republic of China
  - Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing, Beihang University, Beijing 100191, China
  - PengCheng Laboratory, Shenzhen 518055, China
  - Institute of Medical Artificial Intelligence, Binzhou Medical University, Yantai 264003, China
  - School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China
- Zhiming Zheng
  - Institute of Artificial Intelligence, Beihang University, Beijing 100191, China
  - Key laboratory of Mathematics, Informatics and Behavioral Semantics (LMIB), Beihang University, Beijing 100191, China
  - State Key Lab of Software Development Environment (NLSDE), Beihang University, Beijing 100191, China
  - Zhongguancun Laboratory, Beijing 100094, People's Republic of China
  - Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing, Beihang University, Beijing 100191, China
  - PengCheng Laboratory, Shenzhen 518055, China
  - Institute of Medical Artificial Intelligence, Binzhou Medical University, Yantai 264003, China
  - School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China
4
Liu Y, Wang X, Zhang C. Study on the regional risk classification method for the prevention and control of emerging infectious diseases based on directed graph theory. Front Public Health 2023; 11:1211291. [PMID: 37818307 PMCID: PMC10561095 DOI: 10.3389/fpubh.2023.1211291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Accepted: 09/05/2023] [Indexed: 10/12/2023] Open
Abstract
Background: Emerging infectious diseases are a class of diseases that spread rapidly and are highly contagious. They seriously affect social stability and pose a significant threat to human health, requiring urgent countermeasures. An outbreak can easily lead to large-scale spread of the virus, causing social problems such as work stoppages and traffic control, and thereby social panic and psychological unrest, affecting human activities and social stability and even endangering lives. It is essential to prevent and control the spread of infectious diseases effectively.
Purpose: We aim to propose an effective method to classify the risk level of a new epidemic region by using graph theory and risk classification methods, to provide a theoretical reference for the comprehensive evaluation and determination of epidemic prevention and control, as well as for risk level classification.
Methods: Using graph theory, we first define the network structure of social groups and construct the risk transmission network of the new epidemic region. Then, combined with the risk classification method, the classification of the new epidemic region into high, medium, and low risk levels is discussed for two cases, with common and with looped graph nodes, respectively. Finally, the reasonableness of the classification method is verified with simulation data.
Results: The directed weighted scale-free network can better describe the transmission law of an epidemic. Moreover, the proposed method, which classifies the risk level of a region using the correlation function between two regions and the risk value of the regional nodes, can effectively evaluate the risk level of different regions in the new epidemic region. The experiments show no increasing trend in the number of medium- and high-risk nodes. The number of high-risk regions is relatively small compared to medium-risk regions, and the number of low-risk regions is the largest.
Conclusions: It is necessary to distinguish scientifically between the risk level of the epidemic area and that of the neighboring regions, so that the constructed social network model of the epidemic region's spread risk can better describe the spread of epidemic risk through social network relations.
Affiliation(s)
- Yong Liu
  - School of Science, Xi'an University of Architecture and Technology, Xi'an, China
- Xiao Wang
  - School of Science, Xi'an University of Architecture and Technology, Xi'an, China
- Chongqi Zhang
  - School of Science, Xi'an University of Architecture and Technology, Xi'an, China
  - School of Economics and Statistics, Guangzhou University, Guangzhou, China
5
Nikolaev A, Mneimneh S. Modeling and analysis of affiliation networks with preferential attachment and subsumption. Phys Rev E 2023; 108:014310. [PMID: 37583151 DOI: 10.1103/physreve.108.014310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2022] [Accepted: 05/19/2023] [Indexed: 08/17/2023]
Abstract
Preferential attachment describes a variety of graph-based models in which a network grows incrementally via the sequential addition of new nodes and edges, and where existing nodes acquire new neighbors at a rate proportional to their degree. Some networks, however, are better described as groups of nodes rather than a set of pairwise connections. These groups are called affiliations, and the corresponding networks affiliation networks. When viewed as graphs, affiliation networks do not necessarily exhibit the power law distribution of node degrees that is typically associated with preferential attachment. We propose a preferential attachment mechanism for affiliation networks that highlights the power law characteristic of these networks when presented as hypergraphs and simplicial complexes. The two representations capture affiliations in similar ways, but the latter offers an intrinsic feature of the model called subsumption, where an affiliation cannot be a subset of another. Our model of preferential attachment has interesting features, both algorithmic and analytic, including implicit preferential attachment (node sampling does not require knowledge of node degrees), a locality property where the neighbors of a newly added node are also neighbors, the emergence of a power law distribution of degrees (defined in hypergraphs and simplicial complexes rather than at a graph level), implicit deletion of affiliations (through subsumption in the case of simplicial complexes), and to some extent a control over the affiliation size distribution. By varying the parameters of the model, the generated affiliation networks can resemble different types of real-world examples, so the framework also serves as a synthetic generation algorithm for simulation and experimental studies.
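As a rough illustration of the classical, graph-level preferential attachment that the abstract takes as its starting point, the sketch below grows a tree in which each new node links to one existing node chosen with probability proportional to its degree. It is not the authors' affiliation-network model: the function names `grow_preferential_attachment` and `degrees`, the single-edge seed graph, and the one-link-per-node rule are all assumptions made for this example. Drawing uniformly from a list of edge endpoints is one simple way to sample by degree without ever reading node degrees.

```python
import random


def grow_preferential_attachment(n_nodes, seed=0):
    """Grow a tree by degree-proportional attachment (illustrative only)."""
    rng = random.Random(seed)
    # Flat list of edge endpoints: a node of degree k appears k times,
    # so a uniform draw from this list is a degree-proportional draw.
    endpoints = [0, 1]  # assumed seed graph: a single edge 0-1
    edges = [(0, 1)]
    for new in range(2, n_nodes):
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend((new, target))
    return edges


def degrees(edges):
    """Return a dict mapping each node to its degree."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


if __name__ == "__main__":
    edge_list = grow_preferential_attachment(1000)
    deg = degrees(edge_list)
    print("max degree:", max(deg.values()))
```

Extending such a sketch to affiliations would require replacing edges with groups of nodes (hyperedges or simplices), which is the part the paper itself develops.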
Affiliation(s)
- Alexey Nikolaev
  - Department of Computer Science, The Graduate Center of CUNY, 365 5th Avenue, New York, New York 10016, USA
- Saad Mneimneh
  - Department of Computer Science, The Graduate Center of CUNY, 365 5th Avenue, New York, New York 10016, USA
  - Department of Computer Science, Hunter College of CUNY, 695 Park Avenue, New York, New York 10065, USA
6
Vazquez A. Complex hypergraphs. Phys Rev E 2023; 107:024316. [PMID: 36932522 DOI: 10.1103/physreve.107.024316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Accepted: 02/15/2023] [Indexed: 06/18/2023]
Abstract
Providing an abstract representation of natural and human complex structures is a challenging problem. Accounting for a system's heterogeneous components while allowing for analytical tractability is a difficult balance. Here I introduce complex hypergraphs (chygraphs), bringing together concepts from hypergraphs, multilayer networks, simplicial complexes, and hyperstructures. To illustrate the applicability of this combinatorial structure, I calculate the component size statistics and identify the transition to a giant component. To this end I introduce a vectorization technique that tackles the multilevel nature of chygraphs. I conclude that chygraphs are a unifying representation of complex systems allowing for analytical insight.
Affiliation(s)
- Alexei Vazquez
  - Nodes & Links Ltd, Salisbury House, Station Road, Cambridge CB1 2LA, United Kingdom
7
Majhi S, Perc M, Ghosh D. Dynamics on higher-order networks: a review. J R Soc Interface 2022; 19:20220043. [PMID: 35317647 PMCID: PMC8941407 DOI: 10.1098/rsif.2022.0043] [Citation(s) in RCA: 103] [Impact Index Per Article: 34.3] [Received: 01/16/2022] [Accepted: 02/18/2022] [Indexed: 12/25/2022] Open
Abstract
Network science has evolved into an indispensable platform for studying complex systems. But recent research has identified limits of classical networks, where links connect pairs of nodes, to comprehensively describe group interactions. Higher-order networks, where a link can connect more than two nodes, have therefore emerged as a new frontier in network science. Since group interactions are common in social, biological and technological systems, higher-order networks have recently led to important new discoveries across many fields of research. Here, we review these works, focusing in particular on the novel aspects of the dynamics that emerges on higher-order networks. We cover a variety of dynamical processes that have thus far been studied, including different synchronization phenomena, contagion processes, the evolution of cooperation and consensus formation. We also outline open challenges and promising directions for future research.
Affiliation(s)
- Soumen Majhi
  - Department of Mathematics, Bar-Ilan University, Ramat-Gan 5290002, Israel
- Matjaž Perc
  - Faculty of Natural Sciences and Mathematics, University of Maribor, Koroška cesta 160, 2000 Maribor, Slovenia
  - Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
  - Complexity Science Hub Vienna, Josefstädter Straße 39, 1080 Vienna, Austria
  - Alma Mater Europaea, Slovenska ulica 17, 2000 Maribor, Slovenia
- Dibakar Ghosh
  - Physics and Applied Mathematics Unit, Indian Statistical Institute, 203 B. T. Road, Kolkata 700108, India
8
Freund AJ, Giabbanelli PJ. An Experimental Study on the Scalability of Recent Node Centrality Metrics in Sparse Complex Networks. Front Big Data 2022; 5:797584. [PMID: 35252851 PMCID: PMC8889076 DOI: 10.3389/fdata.2022.797584] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/18/2021] [Accepted: 01/21/2022] [Indexed: 11/20/2022] Open
Abstract
Node centrality measures are among the most commonly used analytical techniques for networks. They have long helped analysts to identify “important” nodes that hold power in a social context, where damages could have dire consequences for transportation applications, or who should be a focus for prevention in epidemiology. Given the ubiquity of network data, new measures have been proposed, occasionally motivated by emerging applications or by the ability to interpolate existing measures. Before analysts use these measures and interpret results, the fundamental question is: are these measures likely to complete within the time window allotted to the analysis? In this paper, we comprehensively examine how the time necessary to run 18 new measures (introduced from 2005 to 2020) scales as a function of the number of nodes in the network. Our focus is on giving analysts a simple and practical estimate for sparse networks. As the time consumption depends on the properties in the network, we nuance our analysis by considering whether the network is scale-free, small-world, or random. Our results identify that several metrics run in the order of O(n log n) and could scale to large networks, whereas others can require O(n²) or O(n³) and may become prime targets in future works for approximation algorithms or distributed implementations.
9
Haruna T, Gunji YP. Analysis and synthesis of a growing network model generating dense scale-free networks via category theory. Sci Rep 2020; 10:22351. [PMID: 33339877 PMCID: PMC7749186 DOI: 10.1038/s41598-020-79318-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/22/2020] [Accepted: 12/08/2020] [Indexed: 11/26/2022] Open
Abstract
We propose a growing network model that can generate dense scale-free networks with an almost neutral degree-degree correlation and a negative scaling of local clustering coefficient. The model is obtained by modifying an existing model in the literature that can also generate dense scale-free networks but with a different higher-order network structure. The modification is mediated by category theory. Category theory can identify a duality structure hidden in the previous model. The proposed model is built so that the identified duality is preserved. This work is a novel application of category theory for designing a network model focusing on a universal algebraic structure.
10
Ma F, Wang X, Wang P. Scale-free networks with invariable diameter and density feature: Counterexamples. Phys Rev E 2020; 101:022315. [PMID: 32168588 DOI: 10.1103/physreve.101.022315] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/26/2019] [Accepted: 01/27/2020] [Indexed: 11/07/2022]
Abstract
Here, we propose a class of scale-free networks G(t;m) with intriguing properties, which cannot be simultaneously held by all the theoretical models with power-law degree distribution in the existing literature, including the following: (i) average degrees 〈k〉 of all the generated networks are no longer constant in the limit of large graph size, implying that they are not sparse but dense; (ii) power-law parameters γ of these networks are calculated to be exactly 2; and (iii) their diameters D are all invariant in the growth process of models. While our models have deterministic structure with clustering coefficients equal to zero, we might be able to obtain various candidates with nonzero clustering coefficients based on original networks using reasonable approaches, for instance, randomly adding new edges under the premise of keeping the three important properties above unchanged. In addition, we study the trapping problem on networks G(t;m) and then obtain a closed-form solution to the mean hitting time 〈H〉_t. As opposed to other previous models, our results show an unexpected phenomenon: the analytic value of 〈H〉_t is approximately equal to the logarithm of the vertex number of networks G(t;m). From the theoretical point of view, the networked models considered here can be thought of as counterexamples for most of the published models obeying power-law distribution in current studies.
Affiliation(s)
- Fei Ma
  - School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China
- Xiaomin Wang
  - School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China
- Ping Wang
  - National Engineering Research Center for Software Engineering, Peking University, Beijing 100871, China
  - School of Software and Microelectronics, Peking University, Beijing 102600, China
  - Key Laboratory of High Confidence Software Technologies, Peking University, Ministry of Education, Beijing 100871, China
11
Metzig C, Colijn C. A Maximum Entropy Method for the Prediction of Size Distributions. Entropy 2020; 22:e22030312. [PMID: 33286086 PMCID: PMC7516768 DOI: 10.3390/e22030312] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/05/2019] [Revised: 02/16/2020] [Accepted: 03/05/2020] [Indexed: 12/04/2022]
Abstract
We propose a method to derive the stationary size distributions of a system, and the degree distributions of networks, using maximisation of the Gibbs-Shannon entropy. We apply this to a preferential attachment-type algorithm for systems of constant size, which includes the exit of balls and urns (or nodes and edges for the network case). Knowing the mean size (degree) and turnover rate, the power-law exponent and exponential cutoff can be derived. Our results are confirmed by simulations and by computation of exact probabilities. We also apply this entropy method to reproduce existing results such as the Maxwell-Boltzmann distribution for the velocity of gas particles, the Barabási-Albert model, and multiplicative noise systems.
Affiliation(s)
- Cornelia Metzig
  - Business School, Imperial College London, London SW7 2AZ, UK
  - School of Electronic Engineering and Computer Science, Queen Mary University, London E1 7NS, UK
- Caroline Colijn
  - Department of Mathematics, Imperial College London, London SW7 2AZ, UK
  - Department of Mathematics, Simon Fraser University, Surrey, BC V3T0A3, Canada
12
Haruna T, Gunji YP. Ordinal Preferential Attachment: A Self-Organizing Principle Generating Dense Scale-Free Networks. Sci Rep 2019; 9:4130. [PMID: 30858504 PMCID: PMC6412141 DOI: 10.1038/s41598-019-40716-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/29/2018] [Accepted: 02/22/2019] [Indexed: 11/28/2022] Open
Abstract
Networks are useful representations for analyzing and modeling real-world complex systems. They are often both scale-free and dense: their degree distribution follows a power-law and their average degree grows over time. So far, it has been argued that producing such networks is difficult without externally imposing a suitable cutoff for the scale-free regime. Here, we propose a new growing network model that produces dense scale-free networks with dynamically generated cutoffs. The link formation rule is based on a weak form of preferential attachment depending only on order relations between the degrees of nodes. By this mechanism, our model yields scale-free networks whose scaling exponents can take arbitrary values greater than 1. In particular, the resulting networks are dense when scaling exponents are 2 or less. We analytically study network properties such as the degree distribution, the degree correlation function, and the local clustering coefficient. All analytical calculations are in good agreement with numerical simulations. These results show that both sparse and dense scale-free networks can emerge through the same self-organizing process.
13
Persistent homology of unweighted complex networks via discrete Morse theory. Sci Rep 2019; 9:13817. [PMID: 31554857 PMCID: PMC6761140 DOI: 10.1038/s41598-019-50202-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 01/17/2019] [Accepted: 09/06/2019] [Indexed: 11/14/2022] Open
Abstract
Topological data analysis can reveal higher-order structure beyond pairwise connections between vertices in complex networks. We present a new method based on discrete Morse theory to study topological properties of unweighted and undirected networks using persistent homology. Leveraging the features of discrete Morse theory, our method not only captures the topology of the clique complex of such graphs via the concept of critical simplices, but also achieves close to the theoretical minimum number of critical simplices in several analyzed model and real networks. This leads to a reduced filtration scheme based on the subsequence of the corresponding critical weights, and thus to a significant increase in computational efficiency. We have employed our filtration scheme to explore the persistent homology of several model and real-world networks. In particular, we show that our method can detect differences in the higher-order structure of networks, and the corresponding persistence diagrams can be used to distinguish between different model networks. In summary, our method based on discrete Morse theory further increases the applicability of persistent homology to investigate the global topology of complex networks.
14
Gerlach M, Peixoto TP, Altmann EG. A network approach to topic models. Sci Adv 2018; 4:eaaq1360. [PMID: 30035215 PMCID: PMC6051742 DOI: 10.1126/sciadv.aaq1360] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Received: 10/05/2017] [Accepted: 06/05/2018] [Indexed: 06/08/2023]
Abstract
One of the main computational and scientific challenges in the modern age is to extract useful information from unstructured texts. Topic models are one popular machine-learning approach that infers the latent topical structure of a collection of documents. Despite their success (particularly of the most widely used variant, called latent Dirichlet allocation, LDA) and numerous applications in sociology, history, and linguistics, topic models are known to suffer from severe conceptual and practical problems, for example, a lack of justification for the Bayesian priors, discrepancies with statistical properties of real texts, and the inability to properly choose the number of topics. We obtain a fresh view of the problem of identifying topical structures by relating it to the problem of finding communities in complex networks. We achieve this by representing text corpora as bipartite networks of documents and words. By adapting existing community-detection methods (using a stochastic block model (SBM) with nonparametric priors), we obtain a more versatile and principled framework for topic modeling (for example, it automatically detects the number of topics and hierarchically clusters both the words and documents). The analysis of artificial and real corpora demonstrates that our SBM approach leads to better topic models than LDA in terms of statistical model selection. Our work shows how to formally relate methods from community detection and topic modeling, opening the possibility of cross-fertilization between these two fields.
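The representation step mentioned in the abstract, a text corpus viewed as a bipartite network of documents and words, can be sketched in a few lines. The snippet below is only an assumed minimal encoding for illustration: the function name `bipartite_edges`, whitespace tokenisation, lower-casing, and the use of word counts as edge weights are choices made here, not details taken from the paper, and the SBM inference itself is not shown.

```python
from collections import Counter


def bipartite_edges(docs):
    """Map (document index, word) pairs to occurrence counts.

    Each key is an edge between a document node and a word node;
    the count acts as the edge weight (or edge multiplicity).
    """
    edges = Counter()
    for doc_id, text in enumerate(docs):
        for word in text.lower().split():
            edges[(doc_id, word)] += 1
    return edges


if __name__ == "__main__":
    corpus = ["networks of words", "words and documents and words"]
    for (doc_id, word), weight in sorted(bipartite_edges(corpus).items()):
        print(doc_id, word, weight)
```

A community-detection method applied to such a network would then group document nodes and word nodes, which is the role the abstract assigns to the stochastic block model.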
Affiliation(s)
- Martin Gerlach
  - Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL 60208, USA
  - Max Planck Institute for the Physics of Complex Systems, D-01187 Dresden, Germany
- Tiago P. Peixoto
  - Department of Mathematical Sciences and Centre for Networks and Collective Behaviour, University of Bath, Claverton Down, Bath BA2 7AY, UK
  - Institute for Scientific Interchange Foundation, Via Alassio 11/c, 10126 Torino, Italy
- Eduardo G. Altmann
  - Max Planck Institute for the Physics of Complex Systems, D-01187 Dresden, Germany
  - School of Mathematics and Statistics, University of Sydney, 2006 New South Wales, Australia