1
Wu XZ, Percus AG, Lerman K. Neighbor-Neighbor Correlations Explain Measurement Bias in Networks. Sci Rep 2017; 7:5576. [PMID: 28717155 PMCID: PMC5514029 DOI: 10.1038/s41598-017-06042-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 04/03/2017] [Accepted: 06/06/2017] [Indexed: 11/15/2022]
Abstract
In numerous physical models on networks, dynamics are based on interactions that exclusively involve properties of a node’s nearest neighbors. However, a node’s local view of its neighbors may systematically bias perceptions of network connectivity or the prevalence of certain traits. We investigate the strong friendship paradox, which occurs when the majority of a node’s neighbors have more neighbors than does the node itself. We develop a model to predict the magnitude of the paradox, showing that it is enhanced by negative correlations between degrees of neighboring nodes. We then show that by including neighbor-neighbor correlations, which are degree correlations one step beyond those of neighboring nodes, we accurately predict the impact of the strong friendship paradox in real-world networks. Understanding how the paradox biases local observations can inform better measurements of network structure and our understanding of collective phenomena.
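The definition in the abstract can be made concrete with a short sketch. A node counts as being in the strong friendship paradox when more than half of its neighbors have strictly higher degree than the node itself. The function name, the edge-list input format, and the star-graph example are illustrative assumptions, not taken from the paper. The sketch only measures the paradox on a given graph; it does not reproduce the authors' predictive model.

```python
from collections import defaultdict


def strong_paradox_fraction(edges):
    """Fraction of nodes whose neighbors mostly have higher degree.

    `edges` is an iterable of undirected (u, v) pairs; self-loops are ignored.
    Only nodes that appear in at least one edge are counted.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    degree = {node: len(nbrs) for node, nbrs in adj.items()}

    in_paradox = 0
    for node, nbrs in adj.items():
        higher = sum(1 for m in nbrs if degree[m] > degree[node])
        # Strict majority: more than half of the neighbors have higher degree.
        if 2 * higher > len(nbrs):
            in_paradox += 1
    return in_paradox / len(adj)


# Hypothetical example: a star with hub 0 and five leaves.
star = [(0, leaf) for leaf in range(1, 6)]
star_fraction = strong_paradox_fraction(star)
print(star_fraction)  # every leaf is in the paradox, the hub is not
```

In a star the hub's degree is far above that of its leaves, so the degrees of neighboring nodes are negatively correlated, which is the situation the abstract says enhances the paradox.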
Affiliation(s)
- Xin-Zeng Wu
- Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292, USA; Department of Physics and Astronomy, University of Southern California, Los Angeles, CA 90089, USA
- Allon G Percus
- Institute of Mathematical Sciences, Claremont Graduate University, Claremont, CA 91711, USA
- Kristina Lerman
- Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292, USA
2
Abstract
Social and biological contagions are influenced by the spatial embeddedness of networks. Historically, many epidemics spread as a wave across part of the Earth’s surface; however, in modern contagions long-range edges—for example, due to airline transportation or communication media—allow clusters of a contagion to appear in distant locations. Here we study the spread of contagions on networks through a methodology grounded in topological data analysis and nonlinear dimension reduction. We construct “contagion maps” that use multiple contagions on a network to map the nodes as a point cloud. By analyzing the topology, geometry, and dimensionality of manifold structure in such point clouds, we reveal insights to aid in the modeling, forecast, and control of spreading processes. Our approach highlights contagion maps also as a viable tool for inferring low-dimensional structure in networks.
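One way to picture a "contagion map" is to run one contagion per seed node and use each node's activation times as its coordinates. The sketch below makes several assumptions that the abstract does not state: a deterministic threshold rule (a node activates once at least a fraction `threshold` of its neighbors is active), synchronous updates, a seed cluster made of a node plus its neighbors, and no isolated nodes. All function names are hypothetical.

```python
def activation_times(adj, seed, threshold):
    """Activation time of each node for one threshold contagion.

    `adj` maps node -> list of neighbors. The contagion starts from `seed`
    and its neighbors at time 0. Nodes that never activate are absent
    from the returned dict.
    """
    active = {seed} | set(adj[seed])
    times = {node: 0 for node in active}
    t = 0
    while True:
        t += 1
        # Synchronous update: decisions use the active set from the previous step.
        newly = [
            node for node in adj
            if node not in active
            and sum(m in active for m in adj[node]) >= threshold * len(adj[node])
        ]
        if not newly:
            return times
        for node in newly:
            times[node] = t
        active.update(newly)


def contagion_map(adj, threshold):
    """Map node i to the vector of its activation times over all seeds."""
    nodes = sorted(adj)
    per_seed = [activation_times(adj, s, threshold) for s in nodes]
    return {i: [ts.get(i, float("inf")) for ts in per_seed] for i in nodes}


# Hypothetical example: a ring of 8 nodes, each linked to its two nearest neighbors.
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
cmap = contagion_map(ring, threshold=0.5)
times_from_seed_0 = [cmap[i][0] for i in range(8)]
print(times_from_seed_0)
```

On this ring the contagion spreads as a wave, so the activation times grow with distance from the seed and the resulting point cloud reflects the ring geometry. Adding long-range edges would let the contagion jump ahead of the wave.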
3
Jeub LGS, Balachandran P, Porter MA, Mucha PJ, Mahoney MW. Think locally, act locally: detection of small, medium-sized, and large communities in large networks. Physical Review E, Statistical, Nonlinear, and Soft Matter Physics 2015; 91:012821. [PMID: 25679670 PMCID: PMC5125638 DOI: 10.1103/physreve.91.012821] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 03/15/2014] [Indexed: 06/04/2023]
Abstract
It is common in the study of networks to investigate intermediate-sized (or "meso-scale") features to try to gain an understanding of network structure and function. For example, numerous algorithms have been developed to try to identify "communities," which are typically construed as sets of nodes with denser connections internally than with the remainder of a network. In this paper, we adopt a complementary perspective that communities are associated with bottlenecks of locally biased dynamical processes that begin at seed sets of nodes, and we employ several different community-identification procedures (using diffusion-based and geodesic-based dynamics) to investigate community quality as a function of community size. Using several empirical and synthetic networks, we identify several distinct scenarios for "size-resolved community structure" that can arise in real (and realistic) networks: (1) the best small groups of nodes can be better than the best large groups (for a given formulation of the idea of a good community); (2) the best small groups can have a quality that is comparable to the best medium-sized and large groups; and (3) the best small groups of nodes can be worse than the best large groups. As we discuss in detail, which of these three cases holds for a given network can make an enormous difference when investigating and making claims about network community structure, and it is important to take this into account to obtain reliable downstream conclusions. Depending on which scenario holds, one may or may not be able to successfully identify "good" communities in a given network (and good communities might not even exist for a given community quality measure), the manner in which different small communities fit together to form meso-scale network structures can be very different, and processes such as viral propagation and information diffusion can exhibit very different dynamics. 
In addition, our results suggest that, for many large realistic networks, the output of locally biased methods that focus on communities that are centered around a given seed node (or set of seed nodes) might have better conceptual grounding and greater practical utility than the output of global community-detection methods. They also illustrate structural properties that are important to consider in the development of better benchmark networks to test methods for community detection.
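A seed-centered, diffusion-based search of the kind described above can be sketched as follows. The abstract does not name a specific procedure or quality measure, so the choices here are assumptions: personalized PageRank stands in for the locally biased diffusion, conductance stands in for community quality, and a sweep over degree-normalized scores picks the best prefix. Function names and the example graph are hypothetical, and the graph is assumed to have no isolated nodes.

```python
def personalized_pagerank(adj, seed, alpha=0.15, iters=200):
    """Power iteration for a random walk that restarts at `seed` with probability alpha."""
    p = {node: 0.0 for node in adj}
    p[seed] = 1.0
    for _ in range(iters):
        nxt = {node: 0.0 for node in adj}
        nxt[seed] += alpha
        for u, nbrs in adj.items():
            share = (1 - alpha) * p[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        p = nxt
    return p


def conductance(adj, nodes):
    """Cut edges divided by the smaller of the two side volumes (lower is better)."""
    inside = set(nodes)
    cut = sum(1 for u in inside for v in adj[u] if v not in inside)
    vol_in = sum(len(adj[u]) for u in inside)
    vol_out = sum(len(nbrs) for nbrs in adj.values()) - vol_in
    smaller = min(vol_in, vol_out)
    return cut / smaller if smaller else float("inf")


def sweep_cut(adj, seed):
    """Best-conductance prefix of nodes ranked by PageRank score per unit degree."""
    p = personalized_pagerank(adj, seed)
    order = sorted(adj, key=lambda n: p[n] / len(adj[n]), reverse=True)
    best, best_phi = None, float("inf")
    for k in range(1, len(order)):
        phi = conductance(adj, order[:k])
        if phi < best_phi:
            best, best_phi = set(order[:k]), phi
    return best, best_phi


# Hypothetical example: two triangles joined by the single edge 2-3.
barbell = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
community, phi = sweep_cut(barbell, seed=0)
print(community, phi)
```

Recording the best conductance found at each prefix size, rather than only the overall minimum, would give a rough picture of community quality as a function of community size around that seed.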
Affiliation(s)
- Lucas G S Jeub
- Oxford Centre for Industrial and Applied Mathematics, Mathematical Institute, University of Oxford, Oxford OX2 6GG, United Kingdom
- Prakash Balachandran
- Morgan Stanley, Montreal, Quebec, H3C 3S4, Canada and Department of Mathematics and Statistics, Boston University, Boston, Massachusetts 02215, USA
- Mason A Porter
- Oxford Centre for Industrial and Applied Mathematics, Mathematical Institute, University of Oxford, Oxford OX2 6GG, United Kingdom and CABDyN Complexity Centre, University of Oxford, Oxford OX1 1HP, United Kingdom
- Peter J Mucha
- Carolina Center for Interdisciplinary Applied Mathematics, Department of Mathematics, University of North Carolina, Chapel Hill, North Carolina 27599-3250, USA
- Michael W Mahoney
- International Computer Science Institute, Berkeley, California 94704, USA and Department of Statistics, University of California at Berkeley, Berkeley, California 94720, USA
4
Smith LM, Lerman K, Garcia-Cardona C, Percus AG, Ghosh R. Spectral clustering with epidemic diffusion. Physical Review E, Statistical, Nonlinear, and Soft Matter Physics 2013; 88:042813. [PMID: 24229231 DOI: 10.1103/physreve.88.042813] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/21/2013] [Indexed: 06/02/2023]
Abstract
Spectral clustering is widely used to partition graphs into distinct modules or communities. Existing methods for spectral clustering use the eigenvalues and eigenvectors of the graph Laplacian, an operator that is closely associated with random walks on graphs. We propose a spectral partitioning method that exploits the properties of epidemic diffusion. An epidemic is a dynamic process that, unlike the random walk, simultaneously transitions to all the neighbors of a given node. We show that the replicator, an operator describing epidemic diffusion, is equivalent to the symmetric normalized Laplacian of a reweighted graph with edges reweighted by the eigenvector centralities of their incident nodes. Thus, more weight is given to edges connecting more central nodes. We describe a method that partitions the nodes based on the componentwise ratio of the replicator's second eigenvector to the first and compare its performance to traditional spectral clustering techniques on synthetic graphs with known community structure. We demonstrate that the replicator gives preference to dense, clique-like structures, enabling it to more effectively discover communities that may be obscured by dense intercommunity linking.
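The reweighting described in the abstract can be illustrated with a small NumPy sketch. It builds the graph whose edge weights are multiplied by the eigenvector centralities of the incident nodes, forms that graph's symmetric normalized Laplacian, and splits the nodes by the sign of the componentwise ratio of its second eigenvector to its first. Several things here are assumptions rather than the paper's stated procedure: the Laplacian of the reweighted graph is used in place of the replicator itself (the abstract says the two are equivalent but does not define the replicator), the split at zero is one simple thresholding choice, and the function names and example graph are hypothetical. The input is assumed to be the symmetric adjacency matrix of a connected graph.

```python
import numpy as np


def reweighted_normalized_laplacian(A):
    """Symmetric normalized Laplacian of the centrality-reweighted graph."""
    A = np.asarray(A, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(A)  # eigenvalues in ascending order
    lam_max = eigvals[-1]
    # Leading eigenvector = eigenvector centrality; abs() fixes the arbitrary sign
    # (its entries all share one sign when the graph is connected).
    centrality = np.abs(eigvecs[:, -1])
    W = A * np.outer(centrality, centrality)  # W_ij = A_ij * v_i * v_j
    strength = W.sum(axis=1)
    L = np.eye(len(A)) - W / np.sqrt(np.outer(strength, strength))
    return L, lam_max


def two_way_partition(L):
    """Label nodes by the sign of (second eigenvector) / (first eigenvector)."""
    _, vecs = np.linalg.eigh(L)
    ratio = vecs[:, 1] / vecs[:, 0]
    return (ratio > 0).astype(int)


# Hypothetical example: two triangles joined by the single edge 2-3.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0

L, lam_max = reweighted_normalized_laplacian(A)
labels = two_way_partition(L)
print(labels)
```

Because the weighted degree of node i in the reweighted graph is `lam_max * v_i**2`, the Laplacian built here reduces to `I - A / lam_max`, so its eigenvectors are those of the adjacency matrix. Which group gets label 0 or 1 depends on eigenvector signs and is arbitrary.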
Affiliation(s)
- Laura M Smith
- California State University, Fullerton, California 92831, USA