51
|
Label propagation with α-degree neighborhood impact for network community detection. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2014; 2014:130689. [PMID: 25525425 PMCID: PMC4265519 DOI: 10.1155/2014/130689] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/09/2014] [Revised: 10/30/2014] [Accepted: 11/12/2014] [Indexed: 11/17/2022]
Abstract
Community detection is an important task for mining the structure and function of complex networks. In this paper, a novel label propagation approach with α-degree neighborhood impact is proposed for efficiently and effectively detecting communities in networks. Firstly, we calculate the neighborhood impact of each node in a network within the scope of its α-degree neighborhood network by using an iterative approach. To mitigate the problems of visiting order correlation and convergence difficulty when updating the node labels asynchronously, our method updates the labels in an ascending order on the α-degree neighborhood impact of all the nodes. The α-degree neighborhood impact is also taken as the updating weight value, where the parameter impact scope α can be set to a positive integer. Experimental results from several real-world and synthetic networks show that our method can reveal the community structure in networks rapidly and accurately. The performance of our method is better than other label propagation based methods.
Collapse
|
52
|
Cheng J, Leng M, Li L, Zhou H, Chen X. Active semi-supervised community detection based on must-link and cannot-link constraints. PLoS One 2014; 9:e110088. [PMID: 25329660 PMCID: PMC4201489 DOI: 10.1371/journal.pone.0110088] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2014] [Accepted: 09/16/2014] [Indexed: 12/04/2022] Open
Abstract
Community structure detection is of great importance because it can help in discovering the relationship between the function and the topology structure of a network. Many community detection algorithms have been proposed, but how to incorporate the prior knowledge in the detection process remains a challenging problem. In this paper, we propose a semi-supervised community detection algorithm, which makes full utilization of the must-link and cannot-link constraints to guide the process of community detection and thereby extracts high-quality community structures from networks. To acquire the high-quality must-link and cannot-link constraints, we also propose a semi-supervised component generation algorithm based on active learning, which actively selects nodes with maximum utility for the proposed semi-supervised community detection algorithm step by step, and then generates the must-link and cannot-link constraints by accessing a noiseless oracle. Extensive experiments were carried out, and the experimental results show that the introduction of active learning into the problem of community detection makes a success. Our proposed method can extract high-quality community structures from networks, and significantly outperforms other comparison methods.
Collapse
Affiliation(s)
- Jianjun Cheng
- School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China
- * E-mail: (JC); (XC)
| | - Mingwei Leng
- School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China
| | - Longjie Li
- School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China
| | - Hanhai Zhou
- School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China
| | - Xiaoyun Chen
- School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China
- * E-mail: (JC); (XC)
| |
Collapse
|
53
|
Detecting community structures in networks by label propagation with prediction of percolation transition. ScientificWorldJournal 2014; 2014:148686. [PMID: 25110725 PMCID: PMC4119666 DOI: 10.1155/2014/148686] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2014] [Revised: 06/17/2014] [Accepted: 06/17/2014] [Indexed: 11/21/2022] Open
Abstract
Though label propagation algorithm (LPA) is one of the fastest algorithms for community detection in complex networks, the problem of trivial solutions frequently occurring in the algorithm affects its performance. We propose a label propagation algorithm with prediction of percolation transition (LPAp). After analyzing the reason for multiple solutions of LPA, by transforming the process of community detection into network construction process, a trivial solution in label propagation is considered as a giant component in the percolation transition. We add a prediction process of percolation transition in label propagation to delay the occurrence of trivial solutions, which makes small communities easier to be found. We also give an incomplete update condition which considers both neighbor purity and the contribution of small degree vertices to community detection to reduce the computation time of LPAp. Numerical tests are conducted. Experimental results on synthetic networks and real-world networks show that the LPAp is more accurate, more sensitive to small community, and has the ability to identify a single community structure. Moreover, LPAp with the incomplete update process can use less computation time than LPA, nearly without modularity loss.
Collapse
|
54
|
Liu X, Murata T, Wakita K. Detecting network communities beyond assortativity-related attributes. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2014; 90:012806. [PMID: 25122341 DOI: 10.1103/physreve.90.012806] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/28/2013] [Indexed: 06/03/2023]
Abstract
In network science, assortativity refers to the tendency of links to exist between nodes with similar attributes. In social networks, for example, links tend to exist between individuals of similar age, nationality, location, race, income, educational level, religious belief, and language. Thus, various attributes jointly affect the network topology. An interesting problem is to detect community structure beyond some specific assortativity-related attributes ρ, i.e., to take out the effect of ρ on network topology and reveal the hidden community structures which are due to other attributes. An approach to this problem is to redefine the null model of the modularity measure, so as to simulate the effect of ρ on network topology. However, a challenge is that we do not know to what extent the network topology is affected by ρ and by other attributes. In this paper, we propose a distance modularity, which allows us to freely choose any suitable function to simulate the effect of ρ. Such freedom can help us probe the effect of ρ and detect the hidden communities which are due to other attributes. We test the effectiveness of distance modularity on synthetic benchmarks and two real-world networks.
Collapse
Affiliation(s)
- Xin Liu
- Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro, Tokyo, 152-8552 Japan and CREST, Japan Science and Technology Agency, K's Gobancho, 7 Gobancho, Chiyoda, Tokyo, 102-0076 Japan and Wuhan University of Technology, 122 Luoshi Road, Wuhan, Hubei, 430070 China
| | - Tsuyoshi Murata
- Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro, Tokyo, 152-8552 Japan
| | - Ken Wakita
- Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro, Tokyo, 152-8552 Japan and CREST, Japan Science and Technology Agency, K's Gobancho, 7 Gobancho, Chiyoda, Tokyo, 102-0076 Japan
| |
Collapse
|
55
|
|
56
|
Decomposition-based multiobjective evolutionary algorithm for community detection in dynamic social networks. ScientificWorldJournal 2014; 2014:402345. [PMID: 24723806 PMCID: PMC3958684 DOI: 10.1155/2014/402345] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2013] [Accepted: 12/22/2013] [Indexed: 11/17/2022] Open
Abstract
Community structure is one of the most important properties in social networks. In dynamic networks, there are two conflicting criteria that need to be considered. One is the snapshot quality, which evaluates the quality of the community partitions at the current time step. The other is the temporal cost, which evaluates the difference between communities at different time steps. In this paper, we propose a decomposition-based multiobjective community detection algorithm to simultaneously optimize these two objectives to reveal community structure and its evolution in dynamic networks. It employs the framework of multiobjective evolutionary algorithm based on decomposition to simultaneously optimize the modularity and normalized mutual information, which quantitatively measure the quality of the community partitions and temporal cost, respectively. A local search strategy dealing with the problem-specific knowledge is incorporated to improve the effectiveness of the new algorithm. Experiments on computer-generated and real-world networks demonstrate that the proposed algorithm can not only find community structure and capture community evolution more accurately, but also be steadier than the two compared algorithms.
Collapse
|
57
|
|
58
|
Exploiting behaviors of communities of twitter users for link prediction. SOCIAL NETWORK ANALYSIS AND MINING 2013. [DOI: 10.1007/s13278-013-0142-8] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
|
59
|
Hu D, Ronhovde P, Nussinov Z. Stability-to-instability transition in the structure of large-scale networks. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2012; 86:066106. [PMID: 23368003 DOI: 10.1103/physreve.86.066106] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/25/2012] [Indexed: 06/01/2023]
Abstract
We examine phase transitions between the "easy," "hard," and "unsolvable" phases when attempting to identify structure in large complex networks ("community detection") in the presence of disorder induced by network "noise" (spurious links that obscure structure), heat bath temperature T, and system size N. The partition of a graph into q optimally disjoint subgraphs or "communities" inherently requires Potts-type variables. In earlier work [Philos. Mag. 92, 406 (2012)], when examining power law and other networks (and general associated Potts models), we illustrated that transitions in the computational complexity of the community detection problem typically correspond to spin-glass-type transitions (and transitions to chaotic dynamics in mechanical analogs) at both high and low temperatures and/or noise. The computationally "hard" phase exhibits spin-glass type behavior including memory effects. The region over which the hard phase extends in the noise and temperature phase diagram decreases as N increases while holding the average number of nodes per community fixed. This suggests that in the thermodynamic limit a direct sharp transition may occur between the easy and unsolvable phases. When present, transitions at low temperature or low noise correspond to entropy driven (or "order by disorder") annealing effects, wherein stability may initially increase as temperature or noise is increased before becoming unsolvable at sufficiently high temperature or noise. Additional transitions between contending viable solutions (such as those at different natural scales) are also possible. Identifying community structure via a dynamical approach where "chaotic-type" transitions were found earlier. The correspondence between the spin-glass-type complexity transitions and transitions into chaos in dynamical analogs might extend to other hard computational problems. In this work, we examine large networks (with a power law distribution in cluster size) that have a large number of communities (q≫1). We infer that large systems at a constant ratio of q to the number of nodes N asymptotically tend towards insolvability in the limit of large N for any positive T. The asymptotic behavior of temperatures below which structure identification might be possible, T_{×}=O[1/lnq], decreases slowly, so for practical system sizes, there remains an accessible, and generally easy, global solvable phase at low temperature. We further employ multivariate Tutte polynomials to show that increasing q emulates increasing T for a general Potts model, leading to a similar stability region at low T. Given the relation between Tutte and Jones polynomials, our results further suggest a link between the above complexity transitions and transitions associated with random knots.
Collapse
Affiliation(s)
- Dandan Hu
- Department of Physics, Washington University in St. Louis, Campus Box 1105, 1 Brookings Drive, St. Louis, Missouri 63130, USA
| | | | | |
Collapse
|
60
|
Kawadia V, Sreenivasan S. Sequential detection of temporal communities by estrangement confinement. Sci Rep 2012; 2:794. [PMID: 23145317 PMCID: PMC3494021 DOI: 10.1038/srep00794] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2012] [Accepted: 09/24/2012] [Indexed: 11/09/2022] Open
Abstract
Temporal communities are the result of a consistent partitioning of nodes across multiple snapshots of an evolving network, and they provide insights into how dense clusters in a network emerge, combine, split and decay over time. To reliably detect temporal communities we need to not only find a good community partition in a given snapshot but also ensure that it bears some similarity to the partition(s) found in the previous snapshot(s), a particularly difficult task given the extreme sensitivity of community structure yielded by current methods to changes in the network structure. Here, motivated by the inertia of inter-node relationships, we present a new measure of partition distance called estrangement, and show that constraining estrangement enables one to find meaningful temporal communities at various degrees of temporal smoothness in diverse real-world datasets. Estrangement confinement thus provides a principled approach to uncovering temporal communities in evolving networks.
Collapse
|
61
|
Wu J, Jiao L, Jin C, Liu F, Gong M, Shang R, Chen W. Overlapping community detection via network dynamics. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2012; 85:016115. [PMID: 22400633 DOI: 10.1103/physreve.85.016115] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/25/2011] [Revised: 12/10/2011] [Indexed: 05/31/2023]
Abstract
The modular structure of a network is closely related to the dynamics toward clustering. In this paper, a method for community detection is proposed via the clustering dynamics of a network. The initial phases of the nodes in the network are given randomly, and then they evolve according to a set of dedicatedly designed differential equations. The phases of the nodes are naturally separated into several clusters after a period of evolution, and each cluster corresponds to a community in the network. For the networks with overlapping communities, the phases of the overlapping nodes will evolve to the interspace of the two communities. The proposed method is illustrated with applications to both synthetically generated and real-world complex networks.
Collapse
Affiliation(s)
- Jianshe Wu
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, Xidian University, Xi'an 710071, China
| | | | | | | | | | | | | |
Collapse
|
62
|
Abstract
Modularity is a widely used quality measure for graph clusterings. Its exact maximization is NP-hard and prohibitively expensive for large graphs. Popular heuristics first perform a coarsening phase, where local search starting from singleton clusters is used to compute a preliminary clustering, and then optionally a refinement phase, where this clustering is improved by moving vertices between clusters. As a generalization, multilevel heuristics coarsen in several stages, and refine by moving entire clusters from each of these stages, not only individual vertices.
This article organizes existing and new single-level and multilevel heuristics into a coherent design space, and compares them experimentally with respect to their effectiveness (achieved modularity) and runtime. For coarsening by iterated cluster joining, it turns out that the most widely used criterion for joining clusters (modularity increase) is outperformed by other simple criteria, that a recent multistep algorithm [Schuetz and Caflisch 2008] is no improvement over simple single-step coarsening for these criteria, and that the recent multilevel coarsening by iterated vertex moving [Blondel et al. 2008] is somewhat faster but slightly less effective (with refinement). The new multilevel refinement is significantly more effective than the conventional single-level refinement or no refinement, in reasonable runtime.
A comparison with published benchmark results and algorithm implementations shows that multilevel local search heuristics, despite their relative simplicity, are competitive with the best algorithms in the literature.
Collapse
Affiliation(s)
- Randolf Rotta
- Brandenburgische Technische Universität Cottbus, Germany
| | - Andreas Noack
- Brandenburgische Technische Universität Cottbus, Germany
| |
Collapse
|
63
|
Subelj L, Bajec M. Unfolding communities in large complex networks: combining defensive and offensive label propagation for core extraction. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2011; 83:036103. [PMID: 21517554 DOI: 10.1103/physreve.83.036103] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/01/2010] [Revised: 11/06/2010] [Indexed: 05/30/2023]
Abstract
Label propagation has proven to be a fast method for detecting communities in large complex networks. Recent developments have also improved the accuracy of the approach; however, a general algorithm is still an open issue. We present an advanced label propagation algorithm that combines two unique strategies of community formation, namely, defensive preservation and offensive expansion of communities. The two strategies are combined in a hierarchical manner to recursively extract the core of the network and to identify whisker communities. The algorithm was evaluated on two classes of benchmark networks with planted partition and on 23 real-world networks ranging from networks with tens of nodes to networks with several tens of millions of edges. It is shown to be comparable to the current state-of-the-art community detection algorithms and superior to all previous label propagation algorithms, with comparable time complexity. In particular, analysis on real-world networks has proven that the algorithm has almost linear complexity, O(m¹·¹⁹), and scales even better than the basic label propagation algorithm (m is the number of edges in the network).
Collapse
Affiliation(s)
- Lovro Subelj
- Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia.
| | | |
Collapse
|
64
|
Liu X, Murata T. An Efficient Algorithm for Optimizing Bipartite Modularity in Bipartite Networks. JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS 2010. [DOI: 10.20965/jaciii.2010.p0408] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Modularity evaluates the quality of a division of network nodes into communities, and modularity optimization is the most widely used class of methods for detecting communities in networks. In bipartite networks, there are correspondingly bipartite modularity and bipartite modularity optimization. LPAb, a very fast label propagation algorithm based on bipartite modularity optimization, tends to become stuck in poor local maxima, yielding suboptimal community divisions with low bipartite modularity. We therefore propose LPAb+, a hybrid algorithm combining modified LPAb, or LPAb’, and MSG, a multistep greedy agglomerative algorithm, with the objective of using MSG to drive LPAb out of local maxima. We use four commonly used real-world bipartite networks to demonstrate LPAb+ capability in detecting community divisions with remarkably higher bipartite modularity than LPAb. We show how LPAb+ outperforms other bipartite modularity optimization algorithms, without compromising speed.
Collapse
|
65
|
Ronhovde P, Nussinov Z. Local resolution-limit-free Potts model for community detection. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2010; 81:046114. [PMID: 20481793 DOI: 10.1103/physreve.81.046114] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/19/2009] [Revised: 02/28/2010] [Indexed: 05/29/2023]
Abstract
We report on an exceptionally accurate spin-glass-type Potts model for community detection. With a simple algorithm, we find that our approach is at least as accurate as the best currently available algorithms and robust to the effects of noise. It is also competitive with the best currently available algorithms in terms of speed and size of solvable systems. We find that the computational demand often exhibits superlinear scaling O(L1.3) where L is the number of edges in the system, and we have applied the algorithm to synthetic systems as large as 40 x 10(6) nodes and over 1 x 10(9) edges. A previous stumbling block encountered by popular community detection methods is the so-called "resolution limit." Being a "local" measure of community structure, our Potts model is free from this resolution-limit effect, and it further remains a local measure on weighted and directed graphs. We also address the mitigation of resolution-limit effects for two other popular Potts models.
Collapse
Affiliation(s)
- Peter Ronhovde
- Department of Physics, Washington University, St Louis, Missouri 63130, USA
| | | |
Collapse
|
66
|
Cafieri S, Hansen P, Liberti L. Edge ratio and community structure in networks. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2010; 81:026105. [PMID: 20365629 DOI: 10.1103/physreve.81.026105] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/28/2009] [Revised: 12/12/2009] [Indexed: 05/29/2023]
Abstract
A hierarchical divisive algorithm is proposed for identifying communities in complex networks. To that effect, the definition of community in the weak sense of Radicchi [Proc. Natl. Acad. Sci. U.S.A. 101, 2658 (2004)] is extended into a criterion for a bipartition to be optimal: one seeks to maximize the minimum for both classes of the bipartition of the ratio of inner edges to cut edges. A mathematical program is used within a dichotomous search to do this in an optimal way for each bipartition. This includes an exact solution of the problem of detecting indivisible communities. The resulting hierarchical divisive algorithm is compared with exact modularity maximization on both artificial and real world data sets. For two problems of the former kind optimal solutions are found; for five problems of the latter kind the edge ratio algorithm always appears to be competitive. Moreover, it provides additional information in several cases, notably through the use of the dendrogram summarizing the resolution. Finally, both algorithms are compared on reduced versions of the data sets of Girvan and Newman [Proc. Natl. Acad. Sci. U.S.A. 99, 7821 (2002)] and of Lancichinetti [Phys. Rev. E 78, 046110 (2008)]. Results for these instances appear to be comparable.
Collapse
Affiliation(s)
- Sonia Cafieri
- LIX, Ecole Polytechnique, F-91128 Palaiseau, France.
| | | | | |
Collapse
|