551
|
Defining and Discovering Communities in Social Networks. HANDBOOK OF OPTIMIZATION IN COMPLEX NETWORKS 2012. [DOI: 10.1007/978-1-4614-0754-6_6] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 12/14/2022]
|
552
|
Kuhl C, Tautenhahn R, Böttcher C, Larson TR, Neumann S. CAMERA: an integrated strategy for compound spectra extraction and annotation of liquid chromatography/mass spectrometry data sets. Anal Chem 2011; 84:283-9. [PMID: 22111785 DOI: 10.1021/ac202450g] [Citation(s) in RCA: 785] [Impact Index Per Article: 56.1] [Indexed: 02/08/2023]
Abstract
Liquid chromatography coupled to mass spectrometry is routinely used for metabolomics experiments. In contrast to the fairly routine and automated data acquisition steps, subsequent compound annotation and identification require extensive manual analysis and thus form a major bottleneck in data interpretation. Here we present CAMERA, a Bioconductor package integrating algorithms to extract compound spectra, annotate isotope and adduct peaks, and propose the accurate compound mass even in highly complex data. To evaluate the algorithms, we compared the annotation of CAMERA against a manually defined annotation for a mixture of known compounds spiked into a complex matrix at different concentrations. CAMERA successfully extracted accurate masses for 89.7% and 90.3% of the annotatable compounds in positive and negative ion modes, respectively. Furthermore, we present a novel annotation approach that combines spectral information of data acquired in opposite ion modes to further improve the annotation rate. We demonstrate the utility of CAMERA in two different, easily adoptable plant metabolomics experiments, where the application of CAMERA drastically reduced the amount of manual analysis.
Affiliation(s)
- Carsten Kuhl
- Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120 Halle (Saale), Germany.
|
553
|
Coscia M, Giannotti F, Pedreschi D. A classification for community discovery methods in complex networks. Stat Anal Data Min 2011. [DOI: 10.1002/sam.10133] [Citation(s) in RCA: 226] [Impact Index Per Article: 16.1] [Indexed: 11/09/2022]
|
554
|
Mateos P, Longley PA, O'Sullivan D. Ethnicity and population structure in personal naming networks. PLoS One 2011; 6:e22943. [PMID: 21909399 PMCID: PMC3167808 DOI: 10.1371/journal.pone.0022943] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.1] [Received: 03/10/2011] [Accepted: 07/01/2011] [Indexed: 11/25/2022]
Abstract
Personal naming practices exist in all human groups and are far from random. Rather, they continue to reflect social norms and ethno-cultural customs that have developed over generations. As a consequence, contemporary name frequency distributions retain distinct geographic, social and ethno-cultural patterning that can be exploited to understand population structure in human biology, public health and social science. Previous attempts to detect and delineate such structure in large populations have entailed extensive empirical analysis of naming conventions in different parts of the world without seeking any general or automated methods of population classification by ethno-cultural origin. Here we show how ‘naming networks’, constructed from forename-surname pairs of a large sample of the contemporary human population in 17 countries, provide a valuable representation of cultural, ethnic and linguistic population structure around the world. This innovative approach enriches and adds value to automated population classification through conventional national data sources such as telephone directories and electoral registers. The method identifies clear social and ethno-cultural clusters in such naming networks that extend far beyond the geographic areas in which particular names originated, and that are preserved even after international migration. Moreover, one of the most striking findings of this approach is that these clusters simply ‘emerge’ from the aggregation of millions of individual decisions on parental naming practices for their children, without any prior knowledge introduced by the researcher. Our probabilistic approach to community assignment, both at city level as well as at a global scale, helps to reveal the degree of isolation, integration or overlap between human populations in our rapidly globalising world. 
As such, this work has important implications for research in population genetics, public health, and social science adding new understandings of migration, identity, integration and social interaction across the world.
Affiliation(s)
- Pablo Mateos
- Department of Geography University College London, London, United Kingdom.
|
556
|
Kim Y, Jeong H. Map equation for link communities. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2011; 84:026110. [PMID: 21929067 DOI: 10.1103/physreve.84.026110] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Received: 05/03/2011] [Indexed: 05/31/2023]
Abstract
Community structure exists in many real-world networks and has been reported being related to several functional properties of the networks. The conventional approach was partitioning nodes into communities, while some recent studies start partitioning links instead of nodes to find overlapping communities of nodes efficiently. We extended the map equation method, which was originally developed for node communities, to find link communities in networks. This method is tested on various kinds of networks and compared with the metadata of the networks, and the results show that our method can identify the overlapping role of nodes effectively. The advantage of this method is that the node community scheme and link community scheme can be compared quantitatively by measuring the unknown information left in the networks besides the community structure. It can be used to decide quantitatively whether or not the link community scheme should be used instead of the node community scheme. Furthermore, this method can be easily extended to the directed and weighted networks since it is based on the random walk.
Affiliation(s)
- Youngdo Kim
- Department of Physics, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Republic of Korea
|
558
|
Traag VA, Van Dooren P, Nesterov Y. Narrow scope for resolution-limit-free community detection. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2011; 84:016114. [PMID: 21867264 DOI: 10.1103/physreve.84.016114] [Citation(s) in RCA: 101] [Impact Index Per Article: 7.2] [Received: 04/29/2011] [Revised: 06/27/2011] [Indexed: 05/12/2023]
Abstract
Detecting communities in large networks has drawn much attention over the years. While modularity remains one of the more popular methods of community detection, the so-called resolution limit remains a significant drawback. To overcome this issue, it was recently suggested that instead of comparing the network to a random null model, as is done in modularity, it should be compared to a constant factor. However, it is unclear what is meant exactly by "resolution-limit-free," that is, not suffering from the resolution limit. Furthermore, the question remains what other methods could be classified as resolution-limit-free. In this paper we suggest a rigorous definition and derive some basic properties of resolution-limit-free methods. More importantly, we are able to prove exactly which class of community detection methods are resolution-limit-free. Furthermore, we analyze which methods are not resolution-limit-free, suggesting there is only a limited scope for resolution-limit-free community detection methods. Finally, we provide such a natural formulation, and show it performs superbly.
Affiliation(s)
- V A Traag
- ICTEAM, Université Catholique de Louvain, Louvain-la Neuve, Belgium.
|
559
|
Papadopoulos S, Kompatsiaris Y, Vakali A, Spyridonos P. Community detection in Social Media. Data Min Knowl Discov 2011. [DOI: 10.1007/s10618-011-0224-z] [Citation(s) in RCA: 197] [Impact Index Per Article: 14.1] [Indexed: 10/18/2022]
|
560
|
Zhan W, Zhang Z, Guan J, Zhou S. Evolutionary method for finding communities in bipartite networks. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2011; 83:066120. [PMID: 21797454 DOI: 10.1103/physreve.83.066120] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 11/17/2010] [Revised: 03/03/2011] [Indexed: 05/31/2023]
Abstract
An important step in unveiling the relation between network structure and dynamics defined on networks is to detect communities, and numerous methods have been developed separately to identify community structure in different classes of networks, such as unipartite networks, bipartite networks, and directed networks. Here, we show that the finding of communities in such networks can be unified in a general framework: detection of community structure in bipartite networks. Moreover, we propose an evolutionary method for efficiently identifying communities in bipartite networks. To this end, we show that both unipartite and directed networks can be represented as bipartite networks, and their modularity is completely consistent with that for bipartite networks, the detection of modular structure on which can be reformulated as modularity maximization. To optimize the bipartite modularity, we develop a modified adaptive genetic algorithm (MAGA), which is shown to be especially efficient for community structure detection. The high efficiency of the MAGA is based on the following three improvements we make. First, we introduce a different measure for the informativeness of a locus instead of the standard deviation, which can exactly determine which loci mutate. This measure is the bias between the distribution of a locus over the current population and the uniform distribution of the locus, i.e., the Kullback-Leibler divergence between them. Second, we develop a reassignment technique for differentiating the informative state a locus has attained from the random state in the initial phase. Third, we present a modified mutation rule which by incorporating related operations can guarantee the convergence of the MAGA to the global optimum and can speed up the convergence process. Experimental results show that the MAGA outperforms existing methods in terms of modularity for both bipartite and unipartite networks.
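The locus informativeness measure mentioned in this abstract (the Kullback-Leibler divergence between a locus's distribution over the population and the uniform distribution) can be illustrated with a minimal Python sketch. The function name `locus_informativeness`, the list-of-labels input, and the `num_labels` parameter are assumptions made for illustration; the abstract does not specify how the MAGA encodes loci or which logarithm base it uses.

```python
import math
from collections import Counter


def locus_informativeness(values, num_labels):
    """KL divergence D(p || u) for one locus (hypothetical helper).

    values     -- the label this locus takes in each individual of the population
    num_labels -- number of possible labels, so the uniform u assigns 1/num_labels
    Natural logarithm is an assumption; the abstract does not fix the base.
    """
    n = len(values)
    kl = 0.0
    for count in Counter(values).values():
        p = count / n
        # p * log(p / (1 / num_labels)) == p * log(p * num_labels)
        kl += p * math.log(p * num_labels)
    return kl


# A locus spread evenly over all labels carries no information (divergence 0),
# while a locus fixed to one label reaches the maximum, log(num_labels).
spread = locus_informativeness([0, 1, 2, 3], num_labels=4)
fixed = locus_informativeness([2, 2, 2, 2], num_labels=4)
```

Under this reading, a locus with divergence near zero is still in a random state, while a high divergence suggests the population has largely agreed on its value.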
Affiliation(s)
- Weihua Zhan
- Department of Computer Science and Technology, Tongji University, 4800 Cao'an Road, Shanghai 201804, China.
|
561
|
Lancichinetti A, Radicchi F, Ramasco JJ, Fortunato S. Finding statistically significant communities in networks. PLoS One 2011; 6:e18961. [PMID: 21559480 PMCID: PMC3084717 DOI: 10.1371/journal.pone.0018961] [Citation(s) in RCA: 252] [Impact Index Per Article: 18.0] [Received: 12/09/2010] [Accepted: 03/14/2011] [Indexed: 11/18/2022]
Abstract
Community structure is one of the main structural features of networks, revealing both their internal organization and the similarity of their elementary units. Despite the large variety of methods proposed to detect communities in graphs, there is a big need for multi-purpose techniques, able to handle different types of datasets and the subtleties of community structure. In this paper we present OSLOM (Order Statistics Local Optimization Method), the first method capable to detect clusters in networks accounting for edge directions, edge weights, overlapping communities, hierarchies and community dynamics. It is based on the local optimization of a fitness function expressing the statistical significance of clusters with respect to random fluctuations, which is estimated with tools of Extreme and Order Statistics. OSLOM can be used alone or as a refinement procedure of partitions/covers delivered by other techniques. We have also implemented sequential algorithms combining OSLOM with other fast techniques, so that the community structure of very large networks can be uncovered. Our method has a comparable performance as the best existing algorithms on artificial benchmark graphs. Several applications on real networks are shown as well. OSLOM is implemented in a freely available software (http://www.oslom.org), and we believe it will be a valuable tool in the analysis of networks.
Affiliation(s)
- Andrea Lancichinetti
- Complex Networks and Systems Lagrange Laboratory, Institute for Scientific Interchange (ISI), Torino, Italy
- Physics Department, Politecnico di Torino, Torino, Italy
- Filippo Radicchi
- Howard Hughes Medical Institute (HHMI), Northwestern University, Evanston, Illinois, United States of America
- José J. Ramasco
- Complex Networks and Systems Lagrange Laboratory, Institute for Scientific Interchange (ISI), Torino, Italy
- Instituto de Física Interdisciplinar y Sistemas Complejos IFISC (CSIC-UIB), Palma de Mallorca, Spain
- Santo Fortunato
- Complex Networks and Systems Lagrange Laboratory, Institute for Scientific Interchange (ISI), Torino, Italy
|
562
|
Su G, Kuchinsky A, Morris JH, States DJ, Meng F. GLay: community structure analysis of biological networks. Bioinformatics 2011; 26:3135-7. [PMID: 21123224 PMCID: PMC2995124 DOI: 10.1093/bioinformatics/btq596] [Citation(s) in RCA: 182] [Impact Index Per Article: 13.0] [Indexed: 11/12/2022]
Abstract
SUMMARY GLay provides Cytoscape users an assorted collection of versatile community structure algorithms and graph layout functions for network clustering and structured visualization. High performance is achieved by dynamically linking highly optimized C functions to the Cytoscape JAVA program, which makes GLay especially suitable for decomposition, display and exploratory analysis of large biological networks. AVAILABILITY http://brainarray.mbni.med.umich.edu/glay/.
Affiliation(s)
- Gang Su
- Bioinformatics Program, University of Michigan, Ann Arbor, MI, USA.
|
563
|
Subelj L, Bajec M. Unfolding communities in large complex networks: combining defensive and offensive label propagation for core extraction. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2011; 83:036103. [PMID: 21517554 DOI: 10.1103/physreve.83.036103] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.8] [Received: 06/01/2010] [Revised: 11/06/2010] [Indexed: 05/30/2023]
Abstract
Label propagation has proven to be a fast method for detecting communities in large complex networks. Recent developments have also improved the accuracy of the approach; however, a general algorithm is still an open issue. We present an advanced label propagation algorithm that combines two unique strategies of community formation, namely, defensive preservation and offensive expansion of communities. The two strategies are combined in a hierarchical manner to recursively extract the core of the network and to identify whisker communities. The algorithm was evaluated on two classes of benchmark networks with planted partition and on 23 real-world networks ranging from networks with tens of nodes to networks with several tens of millions of edges. It is shown to be comparable to the current state-of-the-art community detection algorithms and superior to all previous label propagation algorithms, with comparable time complexity. In particular, analysis on real-world networks has proven that the algorithm has almost linear complexity, O(m¹·¹⁹), and scales even better than the basic label propagation algorithm (m is the number of edges in the network).
Affiliation(s)
- Lovro Subelj
- Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia.
|
564
|
Fast Community Detection for Dynamic Complex Networks. COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE 2011. [DOI: 10.1007/978-3-642-25501-4_20] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 12/15/2022]
|
565
|
Bin W, Bin Z, Hongqiao T, Wanting W. LiterMiner: An Academic Literature Mining System. 2010 INTERNATIONAL CONFERENCE OF INFORMATION SCIENCE AND MANAGEMENT ENGINEERING 2010. [DOI: 10.1109/isme.2010.170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/01/2023]
|
566
|
Liu X, Murata T. An Efficient Algorithm for Optimizing Bipartite Modularity in Bipartite Networks. JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS 2010. [DOI: 10.20965/jaciii.2010.p0408] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Indexed: 11/09/2022]
Abstract
Modularity evaluates the quality of a division of network nodes into communities, and modularity optimization is the most widely used class of methods for detecting communities in networks. In bipartite networks, there are correspondingly bipartite modularity and bipartite modularity optimization. LPAb, a very fast label propagation algorithm based on bipartite modularity optimization, tends to become stuck in poor local maxima, yielding suboptimal community divisions with low bipartite modularity. We therefore propose LPAb+, a hybrid algorithm combining modified LPAb, or LPAb’, and MSG, a multistep greedy agglomerative algorithm, with the objective of using MSG to drive LPAb out of local maxima. We use four commonly used real-world bipartite networks to demonstrate LPAb+ capability in detecting community divisions with remarkably higher bipartite modularity than LPAb. We show how LPAb+ outperforms other bipartite modularity optimization algorithms, without compromising speed.
|
567
|
Ronhovde P, Nussinov Z. Local resolution-limit-free Potts model for community detection. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2010; 81:046114. [PMID: 20481793 DOI: 10.1103/physreve.81.046114] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Received: 06/19/2009] [Revised: 02/28/2010] [Indexed: 05/29/2023]
Abstract
We report on an exceptionally accurate spin-glass-type Potts model for community detection. With a simple algorithm, we find that our approach is at least as accurate as the best currently available algorithms and robust to the effects of noise. It is also competitive with the best currently available algorithms in terms of speed and size of solvable systems. We find that the computational demand often exhibits superlinear scaling O(L^1.3) where L is the number of edges in the system, and we have applied the algorithm to synthetic systems as large as 40 × 10^6 nodes and over 1 × 10^9 edges. A previous stumbling block encountered by popular community detection methods is the so-called "resolution limit." Being a "local" measure of community structure, our Potts model is free from this resolution-limit effect, and it further remains a local measure on weighted and directed graphs. We also address the mitigation of resolution-limit effects for two other popular Potts models.
Affiliation(s)
- Peter Ronhovde
- Department of Physics, Washington University, St Louis, Missouri 63130, USA
|
568
|
Cafieri S, Hansen P, Liberti L. Edge ratio and community structure in networks. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2010; 81:026105. [PMID: 20365629 DOI: 10.1103/physreve.81.026105] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 08/28/2009] [Revised: 12/12/2009] [Indexed: 05/29/2023]
Abstract
A hierarchical divisive algorithm is proposed for identifying communities in complex networks. To that effect, the definition of community in the weak sense of Radicchi [Proc. Natl. Acad. Sci. U.S.A. 101, 2658 (2004)] is extended into a criterion for a bipartition to be optimal: one seeks to maximize the minimum for both classes of the bipartition of the ratio of inner edges to cut edges. A mathematical program is used within a dichotomous search to do this in an optimal way for each bipartition. This includes an exact solution of the problem of detecting indivisible communities. The resulting hierarchical divisive algorithm is compared with exact modularity maximization on both artificial and real world data sets. For two problems of the former kind optimal solutions are found; for five problems of the latter kind the edge ratio algorithm always appears to be competitive. Moreover, it provides additional information in several cases, notably through the use of the dendrogram summarizing the resolution. Finally, both algorithms are compared on reduced versions of the data sets of Girvan and Newman [Proc. Natl. Acad. Sci. U.S.A. 99, 7821 (2002)] and of Lancichinetti [Phys. Rev. E 78, 046110 (2008)]. Results for these instances appear to be comparable.
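The bipartition criterion described in this abstract (maximize, over both classes, the minimum ratio of inner edges to cut edges) can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `edge_ratio` and `best_bipartition` are hypothetical, each inner edge is counted once (the paper may normalize differently), and the exhaustive search stands in for the mathematical program and dichotomous search that the authors actually use.

```python
import math
from itertools import combinations


def edge_ratio(edges, part_a):
    """Minimum over both classes of (inner edges) / (cut edges) for one bipartition."""
    part_a = set(part_a)
    inner_a = inner_b = cut = 0
    for u, v in edges:
        if u in part_a and v in part_a:
            inner_a += 1
        elif u not in part_a and v not in part_a:
            inner_b += 1
        else:
            cut += 1
    if cut == 0:
        return math.inf  # the two classes are not connected at all
    return min(inner_a, inner_b) / cut


def best_bipartition(edges):
    """Brute-force search over all proper subsets; feasible only for tiny graphs."""
    nodes = sorted({n for edge in edges for n in edge})
    best_ratio, best_part = -1.0, None
    for size in range(1, len(nodes)):
        for part in combinations(nodes, size):
            ratio = edge_ratio(edges, part)
            if ratio > best_ratio:
                best_ratio, best_part = ratio, part
    return best_ratio, best_part


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
ratio, part = best_bipartition(edges)
```

On this toy graph the search separates the two triangles, since each side keeps three inner edges against one cut edge.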
Affiliation(s)
- Sonia Cafieri
- LIX, Ecole Polytechnique, F-91128 Palaiseau, France.
|
569
|
Barber MJ, Clark JW. Detecting network communities by propagating labels under constraints. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2009; 80:026129. [PMID: 19792222 DOI: 10.1103/physreve.80.026129] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.7] [Received: 03/18/2009] [Indexed: 05/28/2023]
Abstract
We investigate the recently proposed label-propagation algorithm (LPA) for identifying network communities. We reformulate the LPA as an equivalent optimization problem, giving an objective function whose maxima correspond to community solutions. By considering properties of the objective function, we identify conceptual and practical drawbacks of the label-propagation approach, most importantly the disparity between increasing the value of the objective function and improving the quality of communities found. To address the drawbacks, we modify the objective function in the optimization problem, producing a variety of algorithms that propagate labels subject to constraints; of particular interest is a variant that maximizes the modularity measure of community quality. Performance properties and implementation details of the proposed algorithms are discussed. Bipartite as well as unipartite networks are considered.
Affiliation(s)
- Michael J Barber
- Foresight & Policy Development Department, Austrian Institute of Technology (AIT) GmbH, 1220 Vienna, Austria.
|
570
|
Schwarz AJ, Gozzi A, Bifone A. Community structure in networks of functional connectivity: Resolving functional organization in the rat brain with pharmacological MRI. Neuroimage 2009; 47:302-11. [DOI: 10.1016/j.neuroimage.2009.03.064] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.1] [Received: 01/24/2009] [Revised: 03/10/2009] [Accepted: 03/22/2009] [Indexed: 12/15/2022]
|
571
|
Ronhovde P, Nussinov Z. Multiresolution community detection for megascale networks by information-based replica correlations. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2009; 80:016109. [PMID: 19658776 DOI: 10.1103/physreve.80.016109] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.8] [Received: 12/22/2008] [Revised: 04/28/2009] [Indexed: 05/28/2023]
Abstract
We use a Potts model community detection algorithm to accurately and quantitatively evaluate the hierarchical or multiresolution structure of a graph. Our multiresolution algorithm calculates correlations among multiple copies ("replicas") of the same graph over a range of resolutions. Significant multiresolution structures are identified by strongly correlated replicas. The average normalized mutual information, the variation in information, and other measures, in principle, give a quantitative estimate of the "best" resolutions and indicate the relative strength of the structures in the graph. Because the method is based on information comparisons, it can, in principle, be used with any community detection model that can examine multiple resolutions. Our approach may be extended to other optimization problems. As a local measure, our Potts model avoids the "resolution limit" that affects other popular models. With this model, our community detection algorithm has an accuracy that ranks among the best of currently available methods. Using it, we can examine graphs over 40 × 10^6 nodes and more than 1 × 10^9 edges. We further report that the multiresolution variant of our algorithm can solve systems of at least 200 000 nodes and 10 × 10^6 edges on a single processor with exceptionally high accuracy. For typical cases, we find a superlinear scaling O(L^1.3) for community detection and O(L^1.3 log N) for the multiresolution algorithm, where L is the number of edges and N is the number of nodes in the system.
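The replica comparison this abstract relies on can be illustrated with a short Python sketch of normalized mutual information (NMI) averaged over all pairs of replica partitions. The normalization used here, 2·I(A;B) / (H(A) + H(B)), is one common convention and is an assumption, as are the function names and the representation of a partition as a list of community labels per node; the paper's exact definitions may differ.

```python
import math
from collections import Counter
from itertools import combinations


def normalized_mutual_information(labels_a, labels_b):
    """NMI = 2 * I(A;B) / (H(A) + H(B)) for two partitions of the same nodes."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    h_a = -sum((c / n) * math.log(c / n) for c in count_a.values())
    h_b = -sum((c / n) * math.log(c / n) for c in count_b.values())
    mutual = sum(
        (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    if h_a + h_b == 0:
        return 1.0  # both partitions put every node in a single community
    return 2 * mutual / (h_a + h_b)


def average_replica_nmi(replicas):
    """Mean NMI over all pairs of replica partitions (hypothetical helper)."""
    pairs = list(combinations(replicas, 2))
    return sum(normalized_mutual_information(a, b) for a, b in pairs) / len(pairs)


# Two replicas that agree up to relabeling, and one that cuts across them.
replicas = [[0, 0, 1, 1], [5, 5, 7, 7], [0, 1, 0, 1]]
agreement = normalized_mutual_information(replicas[0], replicas[1])
disagreement = normalized_mutual_information(replicas[0], replicas[2])
```

In this reading, resolutions at which the replicas give an average NMI close to 1 would be the strongly correlated ones the abstract refers to.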
Affiliation(s)
- Peter Ronhovde
- Department of Physics, Washington University, St. Louis, Missouri 63130, USA
|
572
|
Leung IXY, Hui P, Liò P, Crowcroft J. Towards real-time community detection in large networks. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2009; 79:066107. [PMID: 19658564 DOI: 10.1103/physreve.79.066107] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Received: 09/02/2008] [Revised: 03/10/2009] [Indexed: 05/28/2023]
Abstract
The recent boom of large-scale online social networks (OSNs) both enables and necessitates the use of parallelizable and scalable computational techniques for their analysis. We examine the problem of real-time community detection and a recently proposed linear-time (O(m) on a network with m edges) label propagation, or "epidemic", community detection algorithm. We identify characteristics and drawbacks of the algorithm and extend it by incorporating different heuristics to facilitate reliable and multifunctional real-time community detection. With limited computational resources, we employ the algorithm on OSN data with 1 × 10^6 nodes and about 58 × 10^6 directed edges. Experiments and benchmarks reveal that the extended algorithm is not only faster but its community detection accuracy compares favorably over popular modularity-gain optimization algorithms known to suffer from their resolution limits.
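For orientation, the basic label propagation scheme that this abstract builds on can be sketched in a few lines of Python. This is a minimal illustration and not the extended algorithm of the paper: the function name, the seeded random generator, the sweep limit, and the rule that a node keeps its current label whenever that label is already among the most frequent ones are all assumptions made here.

```python
import random
from collections import Counter, defaultdict


def label_propagation(edges, max_sweeps=100, seed=0):
    """Basic asynchronous label propagation on an undirected edge list."""
    rng = random.Random(seed)
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    labels = {node: node for node in neighbors}  # every node starts in its own community
    nodes = list(neighbors)
    for _ in range(max_sweeps):
        rng.shuffle(nodes)  # visit nodes in a fresh random order each sweep
        changed = False
        for node in nodes:
            counts = Counter(labels[nb] for nb in neighbors[node])
            top = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == top]
            if labels[node] in candidates:
                continue  # current label is already a most frequent one
            labels[node] = rng.choice(candidates)  # break ties at random
            changed = True
        if not changed:
            break  # every node holds a label that is maximal among its neighbors
    return labels


# Two disconnected triangles: labels cannot cross between the components.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
labels = label_propagation(edges)
```

Each sweep touches every edge a constant number of times, which is the source of the O(m) per-sweep cost mentioned in the abstract.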
|
573
|
Orman GK, Labatut V. A Comparison of Community Detection Algorithms on Artificial Networks. DISCOVERY SCIENCE 2009. [DOI: 10.1007/978-3-642-04747-3_20] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.0] [Indexed: 12/14/2022]
|