451
|
Buccheri G, Marmi S, Mantegna RN. Evolution of correlation structure of industrial indices of U.S. equity markets. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2013; 88:012806. [PMID: 23944517 DOI: 10.1103/physreve.88.012806] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/03/2013] [Indexed: 06/02/2023]
Abstract
We investigate the dynamics of correlations present between pairs of industry indices of U.S. stocks traded in U.S. markets by studying correlation-based networks and spectral properties of the correlation matrix. The study is performed by using 49 industry index time series computed by K. French and E. Fama during the time period from July 1969 to December 2011, which spans more than 40 years. We show that the correlation between industry indices presents both a fast and a slow dynamics. The slow dynamics has a time scale longer than 5 years, showing that a different degree of diversification of the investment is possible in different periods of time. Moreover, we also detect a fast dynamics associated with exogenous or endogenous events. The fast time scale we use is a monthly time scale and the evaluation time period is a 3-month time period. By investigating the correlation dynamics monthly, we are able to detect two examples of fast variations in the first and second eigenvalue of the correlation matrix. The first occurs during the dot-com bubble (from March 1999 to April 2001) and the second occurs during the period of highest impact of the subprime crisis (from August 2008 to August 2009).
Collapse
|
452
|
Amiri B, Hossain L, Crawford JW, Wigand RT. Community Detection in Complex Networks: Multi–objective Enhanced Firefly Algorithm. Knowl Based Syst 2013. [DOI: 10.1016/j.knosys.2013.01.004] [Citation(s) in RCA: 131] [Impact Index Per Article: 10.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|
453
|
Hallquist MN, Hwang K, Luna B. The nuisance of nuisance regression: spectral misspecification in a common approach to resting-state fMRI preprocessing reintroduces noise and obscures functional connectivity. Neuroimage 2013; 82:208-25. [PMID: 23747457 DOI: 10.1016/j.neuroimage.2013.05.116] [Citation(s) in RCA: 498] [Impact Index Per Article: 41.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2013] [Revised: 05/03/2013] [Accepted: 05/23/2013] [Indexed: 12/24/2022] Open
Abstract
Recent resting-state functional connectivity fMRI (RS-fcMRI) research has demonstrated that head motion during fMRI acquisition systematically influences connectivity estimates despite bandpass filtering and nuisance regression, which are intended to reduce such nuisance variability. We provide evidence that the effects of head motion and other nuisance signals are poorly controlled when the fMRI time series are bandpass-filtered but the regressors are unfiltered, resulting in the inadvertent reintroduction of nuisance-related variation into frequencies previously suppressed by the bandpass filter, as well as suboptimal correction for noise signals in the frequencies of interest. This is important because many RS-fcMRI studies, including some focusing on motion-related artifacts, have applied this approach. In two cohorts of individuals (n=117 and 22) who completed resting-state fMRI scans, we found that the bandpass-regress approach consistently overestimated functional connectivity across the brain, typically on the order of r=.10-.35, relative to a simultaneous bandpass filtering and nuisance regression approach. Inflated correlations under the bandpass-regress approach were associated with head motion and cardiac artifacts. Furthermore, distance-related differences in the association of head motion and connectivity estimates were much weaker for the simultaneous filtering approach. We recommend that future RS-fcMRI studies ensure that the frequencies of nuisance regressors and fMRI data match prior to nuisance regression, and we advocate a simultaneous bandpass filtering and nuisance regression strategy that better controls nuisance-related variability.
Collapse
|
454
|
Zhang ZY, Wang Y, Ahn YY. Overlapping community detection in complex networks using symmetric binary matrix factorization. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2013; 87:062803. [PMID: 23848725 DOI: 10.1103/physreve.87.062803] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/21/2013] [Indexed: 06/02/2023]
Abstract
Discovering overlapping community structures is a crucial step to understanding the structure and dynamics of many networks. In this paper we develop a symmetric binary matrix factorization model to identify overlapping communities. Our model allows us not only to assign community memberships explicitly to nodes, but also to distinguish outliers from overlapping nodes. In addition, we propose a modified partition density to evaluate the quality of community structures. We use this to determine the most appropriate number of communities. We evaluate our methods using both synthetic benchmarks and real-world networks, demonstrating the effectiveness of our approach.
Collapse
Affiliation(s)
- Zhong-Yuan Zhang
- School of Statistics, Central University of Finance and Economics, Haidian District, Beijing 100081, China.
| | | | | |
Collapse
|
455
|
Vilhena DA, Harris EB, Bergstrom CT, Maliska ME, Ward PD, Sidor CA, Strömberg CAE, Wilson GP. Bivalve network reveals latitudinal selectivity gradient at the end-Cretaceous mass extinction. Sci Rep 2013. [PMCID: PMC3646391 DOI: 10.1038/srep01790] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022] Open
Abstract
Biogeographic patterns of survival help constrain the causal factors responsible for mass extinction. To test whether biogeography influenced end-Cretaceous (K-Pg) extinction patterns, we used a network approach to delimit biogeographic units (BUs) above the species level in a global Maastrichtian database of 329 bivalve genera. Geographic range is thought to buffer taxa from extinction, but the number of BUs a taxon occurred in superseded geographic range as an extinction predictor. Geographically, we found a latitudinal selectivity gradient for geographic range in the K-Pg, such that higher latitude BUs had lower extinction than expected given the geographic ranges of the genera, implying that (i) high latitude BUs were more resistant to extinction, (ii) the intensity of the K-Pg kill mechanism declined with distance from the tropics, or (iii) both. Our results highlight the importance of macroecological structure in constraining causal mechanisms of extinction and estimating extinction risk of taxa.
Collapse
|
456
|
|
457
|
Provincialization of terrestrial faunas following the end-Permian mass extinction. Proc Natl Acad Sci U S A 2013; 110:8129-33. [PMID: 23630295 DOI: 10.1073/pnas.1302323110] [Citation(s) in RCA: 116] [Impact Index Per Article: 9.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
In addition to their devastating effects on global biodiversity, mass extinctions have had a long-term influence on the history of life by eliminating dominant lineages that suppressed ecological change. Here, we test whether the end-Permian mass extinction (252.3 Ma) affected the distribution of tetrapod faunas within the southern hemisphere and apply quantitative methods to analyze four components of biogeographic structure: connectedness, clustering, range size, and endemism. For all four components, we detected increased provincialism between our Permian and Triassic datasets. In southern Pangea, a more homogeneous and broadly distributed fauna in the Late Permian (Wuchiapingian, ∼257 Ma) was replaced by a provincial and biogeographically fragmented fauna by Middle Triassic times (Anisian, ∼242 Ma). Importantly in the Triassic, lower latitude basins in Tanzania and Zambia included dinosaur predecessors and other archosaurs unknown elsewhere. The recognition of heterogeneous tetrapod communities in the Triassic implies that the end-Permian mass extinction afforded ecologically marginalized lineages the ecospace to diversify, and that biotic controls (i.e., evolutionary incumbency) were fundamentally reset. Archosaurs, which began diversifying in the Early Triassic, were likely beneficiaries of this ecological release and remained dominant for much of the later Mesozoic.
Collapse
|
458
|
Peixoto TP. Parsimonious module inference in large networks. PHYSICAL REVIEW LETTERS 2013; 110:148701. [PMID: 25167049 DOI: 10.1103/physrevlett.110.148701] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/19/2012] [Indexed: 06/03/2023]
Abstract
We investigate the detectability of modules in large networks when the number of modules is not known in advance. We employ the minimum description length principle which seeks to minimize the total amount of information required to describe the network, and avoid overfitting. According to this criterion, we obtain general bounds on the detectability of any prescribed block structure, given the number of nodes and edges in the sampled network. We also obtain that the maximum number of detectable blocks scales as sqrt[N], where N is the number of nodes in the network, for a fixed average degree ⟨k⟩. We also show that the simplicity of the minimum description length approach yields an efficient multilevel Monte Carlo inference algorithm with a complexity of O(τNlogN), if the number of blocks is unknown, and O(τN) if it is known, where τ is the mixing time of the Markov chain. We illustrate the application of the method on a large network of actors and films with over 10(6) edges, and a dissortative, bipartite block structure.
Collapse
Affiliation(s)
- Tiago P Peixoto
- Institut für Theoretische Physik, Universität Bremen, Hochschulring 18, D-28359 Bremen, Germany
| |
Collapse
|
459
|
Abstract
It has been suggested that the native state of a protein acts as a kinetic hub that can facilitate transitions between nonnative states. Using recently developed tools to quantify mediation probabilities ("hub scores"), we quantify hub-like behavior in atomic resolution trajectories for the first time. We use a data set of trajectory ensembles for 12 fast-folding proteins previously published by D. E. Shaw Research (Lindorff-Larsen, K.; et al. How Fast-Folding Proteins Fold. Science2011, 334, 517) with an aggregate simulation time of over 8.2 ms. We visualize the free-energy landscape of each molecule using configuration space networks, and show that dynamic quantities can be qualitatively understood from visual inspection of the networks. Modularity optimization is used to provide a parameter-free means of tessellating the network into a group of communities. Using hub scores, we find that the percentage of trajectories that are mediated by the native state is 31% when averaged over all molecules, and reaches a maximum of 52% for the homeodomain and chignolin. Furthermore, for these mediated transitions, we use Markov models to determine whether the native state acts as a facilitator for the transition, or as a trap (i.e., an off-pathway detour). Although instances of facilitation are found in 4 of the 12 molecules, we conclude that the native state acts primarily as a trap, which is consistent with the idea of a funnel-like landscape.
Collapse
Affiliation(s)
- Alex Dickson
- Department of Chemistry, The University of Michigan, Ann Arbor, MI
| | - Charles L. Brooks
- Department of Chemistry, The University of Michigan, Ann Arbor, MI
- Biophysics Program, The University of Michigan, Ann Arbor, MI
| |
Collapse
|
460
|
Wang X, Yao J, Sun Y, Mai V. M-pick, a modularity-based method for OTU picking of 16S rRNA sequences. BMC Bioinformatics 2013; 14:43. [PMID: 23387433 PMCID: PMC3599145 DOI: 10.1186/1471-2105-14-43] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2012] [Accepted: 01/30/2013] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND Binning 16S rRNA sequences into operational taxonomic units (OTUs) is an initial crucial step in analyzing large sequence datasets generated to determine microbial community compositions in various environments including that of the human gut. Various methods have been developed, but most suffer from either inaccuracies or from being unable to handle millions of sequences generated in current studies. Furthermore, existing binning methods usually require a priori decisions regarding binning parameters such as a distance level for defining an OTU. RESULTS We present a novel modularity-based approach (M-pick) to address the aforementioned problems. The new method utilizes ideas from community detection in graphs, where sequences are viewed as vertices on a weighted graph, each pair of sequences is connected by an imaginary edge, and the similarity of a pair of sequences represents the weight of the edge. M-pick first generates a graph based on pairwise sequence distances and then applies a modularity-based community detection technique on the graph to generate OTUs to capture the community structures in sequence data. To compare the performance of M-pick with that of existing methods, specifically CROP and ESPRIT-Tree, sequence data from different hypervariable regions of 16S rRNA were used and binning results were compared. CONCLUSIONS A new modularity-based clustering method for OTU picking of 16S rRNA sequences is developed in this study. The algorithm does not require a predetermined cut-off level, and our simulation studies suggest that it is superior to existing methods that require specified distance levels to define OTUs. The source code is available at http://plaza.ufl.edu/xywang/Mpick.htm.
Collapse
Affiliation(s)
- Xiaoyu Wang
- Department of Epidemiology, College of Public Health and Health Professions and College of Medicine, Emerging Pathogens Institute, University of Florida, Gainesville, FL 32610, USA
| | | | | | | |
Collapse
|
461
|
Surprise maximization reveals the community structure of complex networks. Sci Rep 2013; 3:1060. [PMID: 23320141 PMCID: PMC3544010 DOI: 10.1038/srep01060] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2012] [Accepted: 12/28/2012] [Indexed: 11/09/2022] Open
Abstract
How to determine the community structure of complex networks is an open question. It is critical to establish the best strategies for community detection in networks of unknown structure. Here, using standard synthetic benchmarks, we show that none of the algorithms hitherto developed for community structure characterization perform optimally. Significantly, evaluating the results according to their modularity, the most popular measure of the quality of a partition, systematically provides mistaken solutions. However, a novel quality function, called Surprise, can be used to elucidate which is the optimal division into communities. Consequently, we show that the best strategy to find the community structure of all the networks examined involves choosing among the solutions provided by multiple algorithms the one with the highest Surprise value. We conclude that Surprise maximization precisely reveals the community structure of complex networks.
Collapse
|
462
|
van Laarhoven T, Marchiori E. Graph clustering with local search optimization: the resolution bias of the objective function matters most. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2013; 87:012812. [PMID: 23410393 DOI: 10.1103/physreve.87.012812] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/27/2012] [Revised: 11/23/2012] [Indexed: 06/01/2023]
Abstract
Results of a recent comparative experimental assessment of methods for network community detection applied to benchmark graphs indicate that the two best methods use different objective functions but a similar local search-based optimization (LSO) procedure. This observation motivates the following research question: Given the LSO optimization procedure, how much does the choice of the objective function influence the results and in what way? We address this question empirically in a broad graph clustering context, that is, when graphs are either given as such or are k-nearest-neighbor graphs generated from a given data set. We consider normalized cut, modularity, and infomap, as well as two new objective functions. We show that all these objectives have a resolution bias, that is, they tend to prefer either small or large clusters. When removing this bias, by forcing the objective to generate a given number of clusters, LSO achieves similar performance across the considered objective functions on benchmark networks with built-in community structure. These results indicate that the resolution bias is the most important difference between objective functions in graph clustering with LSO. Spectral clustering is an alternative to LSO, which has been used to optimize the popular normalized cut and modularity objectives. We show experimentally that LSO often achieves superior performance than spectral clustering on various benchmark, real-life, and k-nearest-neighbor graphs. These results, the flexibility of LSO and its efficiency, provide arguments in favor of this optimization method.
Collapse
Affiliation(s)
- Twan van Laarhoven
- Institute for Computing and Information Sciences, Radboud University Nijmegen, The Netherlands.
| | | |
Collapse
|
463
|
|
464
|
Gaume B, Navarro E, Prade H. Clustering bipartite graphs in terms of approximate formal concepts and sub-contexts. INT J COMPUT INT SYS 2013. [DOI: 10.1080/18756891.2013.819179] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022] Open
|
465
|
Yang B, Di J, Liu J, Liu D. Hierarchical community detection with applications to real-world network analysis. DATA KNOWL ENG 2013. [DOI: 10.1016/j.datak.2012.09.002] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
|
466
|
Hu D, Ronhovde P, Nussinov Z. Stability-to-instability transition in the structure of large-scale networks. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2012; 86:066106. [PMID: 23368003 DOI: 10.1103/physreve.86.066106] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/25/2012] [Indexed: 06/01/2023]
Abstract
We examine phase transitions between the "easy," "hard," and "unsolvable" phases when attempting to identify structure in large complex networks ("community detection") in the presence of disorder induced by network "noise" (spurious links that obscure structure), heat bath temperature T, and system size N. The partition of a graph into q optimally disjoint subgraphs or "communities" inherently requires Potts-type variables. In earlier work [Philos. Mag. 92, 406 (2012)], when examining power law and other networks (and general associated Potts models), we illustrated that transitions in the computational complexity of the community detection problem typically correspond to spin-glass-type transitions (and transitions to chaotic dynamics in mechanical analogs) at both high and low temperatures and/or noise. The computationally "hard" phase exhibits spin-glass type behavior including memory effects. The region over which the hard phase extends in the noise and temperature phase diagram decreases as N increases while holding the average number of nodes per community fixed. This suggests that in the thermodynamic limit a direct sharp transition may occur between the easy and unsolvable phases. When present, transitions at low temperature or low noise correspond to entropy driven (or "order by disorder") annealing effects, wherein stability may initially increase as temperature or noise is increased before becoming unsolvable at sufficiently high temperature or noise. Additional transitions between contending viable solutions (such as those at different natural scales) are also possible. Identifying community structure via a dynamical approach where "chaotic-type" transitions were found earlier. The correspondence between the spin-glass-type complexity transitions and transitions into chaos in dynamical analogs might extend to other hard computational problems. In this work, we examine large networks (with a power law distribution in cluster size) that have a large number of communities (q≫1). We infer that large systems at a constant ratio of q to the number of nodes N asymptotically tend towards insolvability in the limit of large N for any positive T. The asymptotic behavior of temperatures below which structure identification might be possible, T_{×}=O[1/lnq], decreases slowly, so for practical system sizes, there remains an accessible, and generally easy, global solvable phase at low temperature. We further employ multivariate Tutte polynomials to show that increasing q emulates increasing T for a general Potts model, leading to a similar stability region at low T. Given the relation between Tutte and Jones polynomials, our results further suggest a link between the above complexity transitions and transitions associated with random knots.
Collapse
Affiliation(s)
- Dandan Hu
- Department of Physics, Washington University in St. Louis, Campus Box 1105, 1 Brookings Drive, St. Louis, Missouri 63130, USA
| | | | | |
Collapse
|
467
|
Unravelling the intrinsic functional organization of the human lateral frontal cortex: a parcellation scheme based on resting state fMRI. J Neurosci 2012; 32:10238-52. [PMID: 22836258 DOI: 10.1523/jneurosci.5852-11.2012] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
Human and nonhuman primates exhibit flexible behavior. Functional, anatomical, and lesion studies indicate that the lateral frontal cortex (LFC) plays a pivotal role in such behavior. LFC consists of distinct subregions exhibiting distinct connectivity patterns that possibly relate to functional specializations. Inference about the border of each subregion in the human brain is performed with the aid of macroscopic landmarks and/or cytoarchitectonic parcellations extrapolated in a stereotaxic system. However, the high interindividual variability, the limited availability of cytoarchitectonic probabilistic maps, and the absence of robust functional localizers render the in vivo delineation and examination of the LFC subregions challenging. In this study, we use resting state fMRI for the in vivo parcellation of the human LFC on a subjectwise and data-driven manner. This approach succeeds in uncovering neuroanatomically realistic subregions, with potential anatomical substrates including BA 46, 44, 45, 9 and related (sub)divisions. Ventral LFC subregions exhibit different functional connectivity (FC), which can account for different contributions in the language domain, while more dorsal adjacent subregions mark a transition to visuospatial/sensorimotor networks. Dorsal LFC subregions participate in known large-scale networks obeying an external/internal information processing dichotomy. Furthermore, we traced "families" of LFC subregions organized along the dorsal-ventral and anterior-posterior axis with distinct functional networks also encompassing specialized cingulate divisions. Similarities with the connectivity of macaque candidate homologs were observed, such as the premotor affiliation of presumed BA 46. The current findings partially support dominant LFC models.
Collapse
|
468
|
Manimaran P, Duraiswamy K. Identifying Overlying Group of People through Clustering. INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY AND WEB ENGINEERING 2012. [DOI: 10.4018/jitwe.2012100104] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Folksonomies like Delicious and LastFm are modeled as multilateral (user-resource-tag) hypergraphs for studying their network properties. Detecting communities of similar nodes from such networks is a challenging problem. Most existing algorithms for community detection in folksonomies assign unique communities to nodes, whereas in reality, users have multiple relevant interests and same resource is often tagged with semantically different tags. Few attempts to perceive overlapping communities work on forecasts of hypergraph, which results in momentous loss of information contained in original tripartite structure. Propose first algorithm to detect overlapping communities in folksonomies using complete hypergraph structure. The authors’ algorithm converts a hypergraph into its parallel line graph, using measures of hyperedge similarity, whereby any community detection algorithm on unipartite graphs can be used to produce intersecting communities in folksonomy. Through extensive experiments on synthetic as well as real folksonomy data, demonstrate that proposed algorithm can detect better community structures as compared to existing state-of-the-art algorithms for folksonomies.
Collapse
Affiliation(s)
- P. Manimaran
- Department of Computer Science and Engineering, K.S. Rangasamy College of Technology, Namakkal-Dt., Tamilnadu, India
| | - K. Duraiswamy
- Department of Computer Science and Engineering, K.S. Rangasamy College of Technology, Namakkal-Dt., Tamilnadu, India
| |
Collapse
|
469
|
Chen C, Fushing H. Multiscale community geometry in a network and its application. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2012; 86:041120. [PMID: 23214542 DOI: 10.1103/physreve.86.041120] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/26/2012] [Revised: 08/17/2012] [Indexed: 06/01/2023]
Abstract
We introduce a between-ness-based distance metric to extract local and global information for each pair of nodes (or "vertices" used interchangeably) located in a binary network. Since this distance then superimposes a weighted graph upon such a binary network, a multiscale clustering mechanism, called data cloud geometry, is applicable to discover hierarchical communities within a binary network. This approach resolves many shortcomings of community finding approaches, which are primarily based on modularity optimization. Using several contrived and real binary networks, our community hierarchies compare favorably with results derived from a recently proposed approach based on time-scale differences of random walks and has already demonstrated significant improvements over module-based approaches, especially on the multiscale and the determination of the number of communities.
Collapse
Affiliation(s)
- Chen Chen
- University of California, Davis, California 95616, USA
| | | |
Collapse
|
470
|
Breve F, Zhao L. Fuzzy community structure detection by particle competition and cooperation. Soft comput 2012. [DOI: 10.1007/s00500-012-0924-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
|
471
|
Abstract
We consider the problem of finding the set of rankings that best represents a given group of orderings on the same collection of elements (preference lists). This problem arises from social choice and voting theory, in which each voter gives a preference on a set of alternatives, and a system outputs a single preference order based on the observed voters' preferences. In this paper, we observe that, if the given set of preference lists is not homogeneous, a unique true underling ranking might not exist. Moreover only the lists that share the highest amount of information should be aggregated, and thus multiple rankings might provide a more feasible solution to the problem. In this light, we propose Network Selection, an algorithm that, given a heterogeneous group of rankings, first discovers the different communities of homogeneous rankings and then combines only the rank orderings belonging to the same community into a single final ordering. Our novel approach is inspired by graph theory; indeed our set of lists can be loosely read as the nodes of a network. As a consequence, only the lists populating the same community in the network would then be aggregated. In order to highlight the strength of our proposal, we show an application both on simulated and on two real datasets, namely a financial and a biological dataset. Experimental results on simulated data show that Network Selection can significantly outperform existing related methods. The other way around, the empirical evidence achieved on real financial data reveals that Network Selection is also able to select the most relevant variables in data mining predictive models, providing a clear superiority in terms of predictive power of the models built. Furthermore, we show the potentiality of our proposal in the bioinformatics field, providing an application to a biological microarray dataset.
Collapse
|
472
|
Schaub MT, Lambiotte R, Barahona M. Encoding dynamics for multiscale community detection: Markov time sweeping for the map equation. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2012; 86:026112. [PMID: 23005830 DOI: 10.1103/physreve.86.026112] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/16/2012] [Indexed: 06/01/2023]
Abstract
The detection of community structure in networks is intimately related to finding a concise description of the network in terms of its modules. This notion has been recently exploited by the map equation formalism [Rosvall and Bergstrom, Proc. Natl. Acad. Sci. USA 105, 1118 (2008)] through an information-theoretic description of the process of coding inter- and intracommunity transitions of a random walker in the network at stationarity. However, a thorough study of the relationship between the full Markov dynamics and the coding mechanism is still lacking. We show here that the original map coding scheme, which is both block-averaged and one-step, neglects the internal structure of the communities and introduces an upper scale, the "field-of-view" limit, in the communities it can detect. As a consequence, map is well tuned to detect clique-like communities but can lead to undesirable overpartitioning when communities are far from clique-like. We show that a signature of this behavior is a large compression gap: The map description length is far from its ideal limit. To address this issue, we propose a simple dynamic approach that introduces time explicitly into the map coding through the analysis of the weighted adjacency matrix of the time-dependent multistep transition matrix of the Markov process. The resulting Markov time sweeping induces a dynamical zooming across scales that can reveal (potentially multiscale) community structure above the field-of-view limit, with the relevant partitions indicated by a small compression gap.
Collapse
Affiliation(s)
- Michael T Schaub
- Department of Mathematics, Imperial College London, London, United Kingdom.
| | | | | |
Collapse
|
473
|
Li HJ, Wang Y, Wu LY, Zhang J, Zhang XS. Potts model based on a Markov process computation solves the community structure problem effectively. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2012; 86:016109. [PMID: 23005493 DOI: 10.1103/physreve.86.016109] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/01/2012] [Indexed: 06/01/2023]
Abstract
The Potts model is a powerful tool to uncover community structure in complex networks. Here, we propose a framework to reveal the optimal number of communities and stability of network structure by quantitatively analyzing the dynamics of the Potts model. Specifically we model the community structure detection Potts procedure by a Markov process, which has a clear mathematical explanation. Then we show that the local uniform behavior of spin values across multiple timescales in the representation of the Markov variables could naturally reveal the network's hierarchical community structure. In addition, critical topological information regarding multivariate spin configuration could also be inferred from the spectral signatures of the Markov process. Finally an algorithm is developed to determine fuzzy communities based on the optimal number of communities and the stability across multiple timescales. The effectiveness and efficiency of our algorithm are theoretically analyzed as well as experimentally validated.
Collapse
Affiliation(s)
- Hui-Jia Li
- Academy of Mathematics and Systems Science, Chinese Academy of Science, Beijing 100190, China
| | | | | | | | | |
Collapse
|
474
|
Abstract
Understanding how individual scientists build a personal portfolio of research is key to understanding outcomes on the level of scientific fields, institutions, and systems. We lack the scientometric and statistical instruments to examine the development over time of the involvement of researchers in different problem areas. In this paper we present a scientometric method to map, measure, and compare the entire corpus of individual scientists. We use this method to analyse the search strategies of 43 condensed matter physicists along their academic lifecycle. We formulate six propositions that summarise our theoretical expectations and are empirically testable: (1) a scientist’s work consists of multiple finite research trails; (2) a scientist will work in several parallel research trails; (3) a scientist’s role in research trail selection changes along the lifecycle; (4) a scientist’s portfolio will converge before it diverges; (5) the rise and fall of research trails is associated with career changes; and (6) the rise and fall of research trails is associated with the potential for reputational gain. Four propositions are confirmed, the fifth is rejected, and the sixth could not be confirmed or rejected. In combination, the results of the four confirmed propositions reveal specific search strategies along the academic lifecycle. In the PhD phase scientists work in one problem area that is often unconnected to the later portfolio. The postdoctoral phase is where scientists diversify their portfolio and their social network, entering various problem areas and abandoning low-yielding ones. A professor has a much more stable portfolio, leading the work of PhDs and postdoctoral researchers. We present an agenda for future research and discuss theoretical and policy implications.
Collapse
Affiliation(s)
- Edwin Horlings
- Science System Assessment Department, Rathenau Institute, The Hague, The Netherlands
| | | |
Collapse
|
475
|
Piccardi C, Tajoli L. Existence and significance of communities in the World Trade Web. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2012; 85:066119. [PMID: 23005174 DOI: 10.1103/physreve.85.066119] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/28/2011] [Revised: 05/22/2012] [Indexed: 06/01/2023]
Abstract
The World Trade Web (WTW), which models the international transactions among countries, is a fundamental tool for studying the economics of trade flows, their evolution over time, and their implications for a number of phenomena, including the propagation of economic shocks among countries. In this respect, the possible existence of communities is a key point, because it would imply that countries are organized in groups of preferential partners. In this paper, we use four approaches to analyze communities in the WTW between 1962 and 2008, based, respectively, on modularity optimization, cluster analysis, stability functions, and persistence probabilities. Overall, the four methods agree in finding no evidence of significant partitions. A few weak communities emerge from the analysis, but they do not represent secluded groups of countries, as intercommunity linkages are also strong, supporting the view of a truly globalized trading system.
Collapse
Affiliation(s)
- Carlo Piccardi
- Department of Electronics and Information, Politecnico di Milano, Piazza Leonardo da Vinci 32, I-20133 Milano, Italy.
| | | |
Collapse
|
476
|
Cerina F, De Leo V, Barthelemy M, Chessa A. Spatial correlations in attribute communities. PLoS One 2012; 7:e37507. [PMID: 22666361 PMCID: PMC3362576 DOI: 10.1371/journal.pone.0037507] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2011] [Accepted: 04/22/2012] [Indexed: 12/02/2022] Open
Abstract
Community detection is an important tool for exploring and classifying the properties of large complex networks and should be of great help for spatial networks. Indeed, in addition to their location, nodes in spatial networks can have attributes such as the language for individuals, or any other socio-economical feature that we would like to identify in communities. We discuss in this paper a crucial aspect which was not considered in previous studies which is the possible existence of correlations between space and attributes. Introducing a simple toy model in which both space and node attributes are considered, we discuss the effect of space-attribute correlations on the results of various community detection methods proposed for spatial networks in this paper and in previous studies. When space is irrelevant, our model is equivalent to the stochastic block model which has been shown to display a detectability-non detectability transition. In the regime where space dominates the link formation process, most methods can fail to recover the communities, an effect which is particularly marked when space-attributes correlations are strong. In this latter case, community detection methods which remove the spatial component of the network can miss a large part of the community structure and can lead to incorrect results.
Collapse
Affiliation(s)
- Federica Cerina
- Department of Physics, University of Cagliari, Cagliari, Italy
- Linkalab, Complex Systems Computational Laboratory, Cagliari, Italy
| | - Vincenzo De Leo
- Linkalab, Complex Systems Computational Laboratory, Cagliari, Italy
- CRS4 Bioinformatica, Pula, Italy
| | - Marc Barthelemy
- Institut de Physique Théorique (IPhT), CEA, CNRS-URA 2306, Gif-sur-Yvette, France
| | - Alessandro Chessa
- Department of Physics, University of Cagliari, Cagliari, Italy
- Linkalab, Complex Systems Computational Laboratory, Cagliari, Italy
- Institute for Complex Systems, CNR UOS Department of Physics, University of Rome Sapienza, Rome, Italy
| |
Collapse
|
477
|
Lee J, Gross SP, Lee J. Modularity optimization by conformational space annealing. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2012; 85:056702. [PMID: 23004898 DOI: 10.1103/physreve.85.056702] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/21/2012] [Indexed: 06/01/2023]
Abstract
We propose a modularity optimization method, Mod-CSA, based on stochastic global optimization algorithm, conformational space annealing (CSA). Our method outperforms simulated annealing in terms of both efficiency and accuracy, finding higher modularity partitions with less computational resources required. The high modularity values found by our method are higher than, or equal to, the largest values previously reported. In addition, the method can be combined with other heuristic methods, and implemented in parallel fashion, allowing it to be applicable to large graphs with more than 10,000 nodes.
Collapse
Affiliation(s)
- Juyong Lee
- School of Computational Sciences, Korea Institute of Advanced Study, Seoul, Korea.
| | | | | |
Collapse
|
478
|
Lambiotte R, Rosvall M. Ranking and clustering of nodes in networks with smart teleportation. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2012; 85:056107. [PMID: 23004821 DOI: 10.1103/physreve.85.056107] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/22/2011] [Indexed: 06/01/2023]
Abstract
Random teleportation is a necessary evil for ranking and clustering directed networks based on random walks. Teleportation enables ergodic solutions, but the solutions must necessarily depend on the exact implementation and parametrization of the teleportation. For example, in the commonly used PageRank algorithm, the teleportation rate must trade off a heavily biased solution with a uniform solution. Here we show that teleportation to links rather than nodes enables a much smoother trade-off and effectively more robust results. We also show that, by not recording the teleportation steps of the random walker, we can further reduce the effect of teleportation with dramatic effects on clustering.
Collapse
Affiliation(s)
- R Lambiotte
- Department of Mathematics and Naxys, University of Namur, Belgium.
| | | |
Collapse
|
479
|
Kirouac DC, Saez-Rodriguez J, Swantek J, Burke JM, Lauffenburger DA, Sorger PK. Creating and analyzing pathway and protein interaction compendia for modelling signal transduction networks. BMC SYSTEMS BIOLOGY 2012; 6:29. [PMID: 22548703 PMCID: PMC3436686 DOI: 10.1186/1752-0509-6-29] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/08/2011] [Accepted: 04/11/2012] [Indexed: 11/11/2022]
Abstract
Background Understanding the information-processing capabilities of signal transduction networks, how those networks are disrupted in disease, and rationally designing therapies to manipulate diseased states require systematic and accurate reconstruction of network topology. Data on networks central to human physiology, such as the inflammatory signalling networks analyzed here, are found in a multiplicity of on-line resources of pathway and interactome databases (Cancer CellMap, GeneGo, KEGG, NCI-Pathway Interactome Database (NCI-PID), PANTHER, Reactome, I2D, and STRING). We sought to determine whether these databases contain overlapping information and whether they can be used to construct high reliability prior knowledge networks for subsequent modeling of experimental data. Results We have assembled an ensemble network from multiple on-line sources representing a significant portion of all machine-readable and reconcilable human knowledge on proteins and protein interactions involved in inflammation. This ensemble network has many features expected of complex signalling networks assembled from high-throughput data: a power law distribution of both node degree and edge annotations, and topological features of a “bow tie” architecture in which diverse pathways converge on a highly conserved set of enzymatic cascades focused around PI3K/AKT, MAPK/ERK, JAK/STAT, NFκB, and apoptotic signaling. Individual pathways exhibit “fuzzy” modularity that is statistically significant but still involving a majority of “cross-talk” interactions. However, we find that the most widely used pathway databases are highly inconsistent with respect to the actual constituents and interactions in this network. Using a set of growth factor signalling networks as examples (epidermal growth factor, transforming growth factor-beta, tumor necrosis factor, and wingless), we find a multiplicity of network topologies in which receptors couple to downstream components through myriad alternate paths. Many of these paths are inconsistent with well-established mechanistic features of signalling networks, such as a requirement for a transmembrane receptor in sensing extracellular ligands. Conclusions Wide inconsistencies among interaction databases, pathway annotations, and the numbers and identities of nodes associated with a given pathway pose a major challenge for deriving causal and mechanistic insight from network graphs. We speculate that these inconsistencies are at least partially attributable to cell, and context-specificity of cellular signal transduction, which is largely unaccounted for in available databases, but the absence of standardized vocabularies is an additional confounding factor. As a result of discrepant annotations, it is very difficult to identify biologically meaningful pathways from interactome networks a priori. However, by incorporating prior knowledge, it is possible to successively build out network complexity with high confidence from a simple linear signal transduction scaffold. Such reduced complexity networks appear suitable for use in mechanistic models while being richer and better justified than the simple linear pathways usually depicted in diagrams of signal transduction.
Collapse
Affiliation(s)
- Daniel C Kirouac
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | | | | | | | | | | |
Collapse
|
480
|
Abstract
Researchers use community-detection algorithms to reveal large-scale organization in biological and social networks, but community detection is useful only if the communities are significant and not a result of noisy data. To assess the statistical significance of the network communities, or the robustness of the detected structure, one approach is to perturb the network structure by removing links and measure how much the communities change. However, perturbing sparse networks is challenging because they are inherently sensitive; they shatter easily if links are removed. Here we propose a simple method to perturb sparse networks and assess the significance of their communities. We generate resampled networks by adding extra links based on local information, then we aggregate the information from multiple resampled networks to find a coarse-grained description of significant clusters. In addition to testing our method on benchmark networks, we use our method on the sparse network of the European Court of Justice (ECJ) case law, to detect significant and insignificant areas of law. We use our significance analysis to draw a map of the ECJ case law network that reveals the relations between the areas of law.
Collapse
|
481
|
Ronhovde P, Chakrabarty S, Hu D, Sahu M, Sahu KK, Kelton KF, Mauro NA, Nussinov Z. Detection of hidden structures for arbitrary scales in complex physical systems. Sci Rep 2012; 2:329. [PMID: 22461970 PMCID: PMC3314987 DOI: 10.1038/srep00329] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2011] [Accepted: 02/29/2012] [Indexed: 11/14/2022] Open
Abstract
Recent decades have experienced the discovery of numerous complex materials. At the root of the complexity underlying many of these materials lies a large number of contending atomic- and largerscale configurations. In order to obtain a more detailed understanding of such systems, we need tools that enable the detection of pertinent structures on all spatial and temporal scales. Towards this end, we suggest a new method that applies to both static and dynamic systems which invokes ideas from network analysis and information theory. Our approach efficiently identifies basic unit cells, topological defects, and candidate natural structures. The method is particularly useful where a clear definition of order is lacking, and the identified features may constitute a natural point of departure for further analysis.
Collapse
|
482
|
Lancichinetti A, Fortunato S. Consensus clustering in complex networks. Sci Rep 2012; 2:336. [PMID: 22468223 PMCID: PMC3313482 DOI: 10.1038/srep00336] [Citation(s) in RCA: 381] [Impact Index Per Article: 29.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2012] [Accepted: 03/09/2012] [Indexed: 01/20/2023] Open
Abstract
The community structure of complex networks reveals both their organization and hidden relationships among their constituents. Most community detection methods currently available are not deterministic, and their results typically depend on the specific random seeds, initial conditions and tie-break rules adopted for their execution. Consensus clustering is used in data analysis to generate stable results out of a set of partitions delivered by stochastic methods. Here we show that consensus clustering can be combined with any existing method in a self-consistent way, enhancing considerably both the stability and the accuracy of the resulting partitions. This framework is also particularly suitable to monitor the evolution of community structure in temporal networks. An application of consensus clustering to a large citation network of physics papers demonstrates its capability to keep track of the birth, death and diversification of topics.
Collapse
|
483
|
Schaub MT, Delvenne JC, Yaliraki SN, Barahona M. Markov dynamics as a zooming lens for multiscale community detection: non clique-like communities and the field-of-view limit. PLoS One 2012; 7:e32210. [PMID: 22384178 PMCID: PMC3288079 DOI: 10.1371/journal.pone.0032210] [Citation(s) in RCA: 106] [Impact Index Per Article: 8.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2011] [Accepted: 01/25/2012] [Indexed: 12/02/2022] Open
Abstract
In recent years, there has been a surge of interest in community detection algorithms for complex networks. A variety of computational heuristics, some with a long history, have been proposed for the identification of communities or, alternatively, of good graph partitions. In most cases, the algorithms maximize a particular objective function, thereby finding the 'right' split into communities. Although a thorough comparison of algorithms is still lacking, there has been an effort to design benchmarks, i.e., random graph models with known community structure against which algorithms can be evaluated. However, popular community detection methods and benchmarks normally assume an implicit notion of community based on clique-like subgraphs, a form of community structure that is not always characteristic of real networks. Specifically, networks that emerge from geometric constraints can have natural non clique-like substructures with large effective diameters, which can be interpreted as long-range communities. In this work, we show that long-range communities escape detection by popular methods, which are blinded by a restricted 'field-of-view' limit, an intrinsic upper scale on the communities they can detect. The field-of-view limit means that long-range communities tend to be overpartitioned. We show how by adopting a dynamical perspective towards community detection [1], [2], in which the evolution of a Markov process on the graph is used as a zooming lens over the structure of the network at all scales, one can detect both clique- or non clique-like communities without imposing an upper scale to the detection. Consequently, the performance of algorithms on inherently low-diameter, clique-like benchmarks may not always be indicative of equally good results in real networks with local, sparser connectivity. We illustrate our ideas with constructive examples and through the analysis of real-world networks from imaging, protein structures and the power grid, where a multiscale structure of non clique-like communities is revealed.
Collapse
Affiliation(s)
- Michael T. Schaub
- Department of Mathematics, Imperial College London, London, United Kingdom
- Department of Chemistry, Imperial College London, London, United Kingdom
| | - Jean-Charles Delvenne
- Department of Applied Mathematics, Université catholique de Louvain, Louvain-la-Neuve, Belgium
| | - Sophia N. Yaliraki
- Department of Chemistry, Imperial College London, London, United Kingdom
| | - Mauricio Barahona
- Department of Mathematics, Imperial College London, London, United Kingdom
| |
Collapse
|
484
|
Rees BS, Gallagher KB. Overlapping community detection using a community optimized graph swarm. SOCIAL NETWORK ANALYSIS AND MINING 2012. [DOI: 10.1007/s13278-012-0050-3] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
|
485
|
MOfinder: a novel algorithm for detecting overlapping modules from protein-protein interaction network. J Biomed Biotechnol 2012; 2012:103702. [PMID: 22500072 PMCID: PMC3303734 DOI: 10.1155/2012/103702] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2011] [Revised: 10/19/2011] [Accepted: 10/21/2011] [Indexed: 11/17/2022] Open
Abstract
Since organism development and many critical cell biology processes are organized in modular patterns, many algorithms have been proposed to detect modules. In this study, a new method, MOfinder, was developed to detect overlapping modules in a protein-protein interaction (PPI) network. We demonstrate that our method is more accurate than other 5 methods. Then, we applied MOfinder to yeast and human PPI network and explored the overlapping information. Using the overlapping modules of human PPI network, we constructed the module-module communication network. Functional annotation showed that the immune-related and cancer-related proteins were always together and present in the same modules, which offer some clues for immune therapy for cancer. Our study around overlapping modules suggests a new perspective on the analysis of PPI network and improves our understanding of disease.
Collapse
|
486
|
Aldecoa R, Marín I. Closed benchmarks for network community structure characterization. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2012; 85:026109. [PMID: 22463281 DOI: 10.1103/physreve.85.026109] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/14/2011] [Revised: 12/23/2011] [Indexed: 05/31/2023]
Abstract
Characterizing the community structure of complex networks is a key challenge in many scientific fields. Very diverse algorithms and methods have been proposed to this end, many working reasonably well in specific situations. However, no consensus has emerged on which of these methods is the best to use in practice. In part, this is due to the fact that testing their performance requires the generation of a comprehensive, standard set of synthetic benchmarks, a goal not yet fully achieved. Here, we present a type of benchmark that we call "closed," in which an initial network of known community structure is progressively converted into a second network whose communities are also known. This approach differs from all previously published ones, in which networks evolve toward randomness. The use of this type of benchmark allows us to monitor the transformation of the community structure of a network. Moreover, we can predict the optimal behavior of the variation of information, a measure of the quality of the partitions obtained, at any moment of the process. This enables us in many cases to determine the best partition among those suggested by different algorithms. Also, since any network can be used as a starting point, extensive studies and comparisons can be performed using a heterogeneous set of structures, including random ones. These properties make our benchmarks a general standard for comparing community detection algorithms.
Collapse
Affiliation(s)
- Rodrigo Aldecoa
- Instituto de Biomedicina de Valencia, Consejo Superior de Investigaciones Científicas, Calle Jaime Roig 11, E-46010 Valencia, Spain.
| | | |
Collapse
|
487
|
Kim EY, Hwang DU, Ko TW. Multiscale ensemble clustering for finding modules in complex networks. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2012; 85:026119. [PMID: 22463291 DOI: 10.1103/physreve.85.026119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/21/2011] [Indexed: 05/31/2023]
Abstract
The identification of modules in complex networks is important for the understanding of systems. Here, we propose an ensemble clustering method incorporating node groupings in various sizes and the sequential removal of weak ties between nodes which are rarely grouped together. This method successfully detects modules in various networks, such as hierarchical random networks and the American college football network, with known modular structures. Some of the results are compared with those obtained by modularity optimization and K-means clustering.
Collapse
Affiliation(s)
- Eun-Youn Kim
- Computational Neuroscience Team, National Institute for Mathematical Sciences, Daejeon 305-811, Republic of Korea
| | | | | |
Collapse
|
488
|
Toivonen R, Kivelä M, Saramäki J, Viinikainen M, Vanhatalo M, Sams M. Networks of emotion concepts. PLoS One 2012; 7:e28883. [PMID: 22276099 PMCID: PMC3262789 DOI: 10.1371/journal.pone.0028883] [Citation(s) in RCA: 92] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2011] [Accepted: 11/16/2011] [Indexed: 11/23/2022] Open
Abstract
The aim of this work was to study the similarity network and hierarchical clustering of Finnish emotion concepts. Native speakers of Finnish evaluated similarity between the 50 most frequently used Finnish words describing emotional experiences. We hypothesized that methods developed within network theory, such as identifying clusters and specific local network structures, can reveal structures that would be difficult to discover using traditional methods such as multidimensional scaling (MDS) and ordinary cluster analysis. The concepts divided into three main clusters, which can be described as negative, positive, and surprise. Negative and positive clusters divided further into meaningful sub-clusters, corresponding to those found in previous studies. Importantly, this method allowed the same concept to be a member in more than one cluster. Our results suggest that studying particular network structures that do not fit into a low-dimensional description can shed additional light on why subjects evaluate certain concepts as similar. To encourage the use of network methods in analyzing similarity data, we provide the analysis software for free use (http://www.becs.tkk.fi/similaritynets/).
Collapse
Affiliation(s)
- Riitta Toivonen
- Department of Biomedical Engineering and Computational Science (BECS), Aalto University School of Science, Espoo, Finland
| | - Mikko Kivelä
- Department of Biomedical Engineering and Computational Science (BECS), Aalto University School of Science, Espoo, Finland
| | - Jari Saramäki
- Department of Biomedical Engineering and Computational Science (BECS), Aalto University School of Science, Espoo, Finland
| | - Mikko Viinikainen
- Department of Biomedical Engineering and Computational Science (BECS), Aalto University School of Science, Espoo, Finland
| | - Maija Vanhatalo
- Department of Biomedical Engineering and Computational Science (BECS), Aalto University School of Science, Espoo, Finland
| | - Mikko Sams
- Department of Biomedical Engineering and Computational Science (BECS), Aalto University School of Science, Espoo, Finland
- * E-mail:
| |
Collapse
|
489
|
Abstract
Real-world complex systems may be mathematically modeled as graphs, revealing properties of the system. Here we study graphs of functional brain organization in healthy adults using resting state functional connectivity MRI. We propose two novel brain-wide graphs, one of 264 putative functional areas, the other a modification of voxelwise networks that eliminates potentially artificial short-distance relationships. These graphs contain many subgraphs in good agreement with known functional brain systems. Other subgraphs lack established functional identities; we suggest possible functional characteristics for these subgraphs. Further, graph measures of the areal network indicate that the default mode subgraph shares network properties with sensory and motor subgraphs: it is internally integrated but isolated from other subgraphs, much like a "processing" system. The modified voxelwise graph also reveals spatial motifs in the patterning of systems across the cortex.
Collapse
|
490
|
Moradi F, Olovsson T, Tsigas P. An Evaluation of Community Detection Algorithms on Large-Scale Email Traffic. EXPERIMENTAL ALGORITHMS 2012. [DOI: 10.1007/978-3-642-30850-5_25] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]
|
491
|
van Laarhoven T, Marchiori E. Robust Community Detection Methods with Resolution Parameter for Complex Detection in Protein Protein Interaction Networks. PATTERN RECOGNITION IN BIOINFORMATICS 2012. [DOI: 10.1007/978-3-642-34123-6_1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]
|
492
|
Lancichinetti A, Fortunato S. Limits of modularity maximization in community detection. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2011; 84:066122. [PMID: 22304170 DOI: 10.1103/physreve.84.066122] [Citation(s) in RCA: 113] [Impact Index Per Article: 8.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/06/2011] [Revised: 10/17/2011] [Indexed: 05/21/2023]
Abstract
Modularity maximization is the most popular technique for the detection of community structure in graphs. The resolution limit of the method is supposedly solvable with the introduction of modified versions of the measure, with tunable resolution parameters. We show that multiresolution modularity suffers from two opposite coexisting problems: the tendency to merge small subgraphs, which dominates when the resolution is low; the tendency to split large subgraphs, which dominates when the resolution is high. In benchmark networks with heterogeneous distributions of cluster sizes, the simultaneous elimination of both biases is not possible and multiresolution modularity is not capable to recover the planted community structure, not even when it is pronounced and easily detectable by other methods, for any value of the resolution parameter. This holds for other multiresolution techniques and it is likely to be a general problem of methods based on global optimization.
Collapse
Affiliation(s)
- Andrea Lancichinetti
- Complex Networks and Systems Lagrange Lab, Institute for Scientific Interchange, I-10133 Torino, Italy
| | | |
Collapse
|
493
|
Decelle A, Krzakala F, Moore C, Zdeborová L. Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2011; 84:066106. [PMID: 22304154 DOI: 10.1103/physreve.84.066106] [Citation(s) in RCA: 162] [Impact Index Per Article: 11.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/14/2011] [Indexed: 05/22/2023]
Abstract
In this paper we extend our previous work on the stochastic block model, a commonly used generative model for social and biological networks, and the problem of inferring functional groups or communities from the topology of the network. We use the cavity method of statistical physics to obtain an asymptotically exact analysis of the phase diagram. We describe in detail properties of the detectability-undetectability phase transition and the easy-hard phase transition for the community detection problem. Our analysis translates naturally into a belief propagation algorithm for inferring the group memberships of the nodes in an optimal way, i.e., that maximizes the overlap with the underlying group memberships, and learning the underlying parameters of the block model. Finally, we apply the algorithm to two examples of real-world networks and discuss its performance.
Collapse
Affiliation(s)
- Aurelien Decelle
- Université Paris-Sud & CNRS, LPTMS, UMR8626, Bât 100, Université Paris-Sud, F-91405 Orsay, France
| | | | | | | |
Collapse
|
494
|
Lou X, Suykens JAK. Finding communities in weighted networks through synchronization. CHAOS (WOODBURY, N.Y.) 2011; 21:043116. [PMID: 22225353 DOI: 10.1063/1.3655371] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/12/2023]
Abstract
Community detection in weighted networks is an important challenge. In this paper, we introduce a local weight ratio scheme for identifying the community structures of weighted networks within the context of the Kuramoto model by taking into account weights of links. The proposed scheme takes full advantage of the information of the link density among vertices and the closeness of relations between each vertex and its neighbors. By means of this scheme, we explore the connection between community structures and dynamic time scales of synchronization. Moreover, we can also unravel the hierarchical structures of weighted networks with a well-defined connectivity pattern by the synchronization process. The performance of the proposed method is evaluated on both computer-generated benchmark graphs and real-world networks.
Collapse
Affiliation(s)
- Xuyang Lou
- Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, China.
| | | |
Collapse
|
495
|
de Haan W, van der Flier WM, Koene T, Smits LL, Scheltens P, Stam CJ. Disrupted modular brain dynamics reflect cognitive dysfunction in Alzheimer's disease. Neuroimage 2011; 59:3085-93. [PMID: 22154957 DOI: 10.1016/j.neuroimage.2011.11.055] [Citation(s) in RCA: 159] [Impact Index Per Article: 11.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2011] [Revised: 11/09/2011] [Accepted: 11/14/2011] [Indexed: 11/19/2022] Open
Abstract
The relation between pathology and cognitive dysfunction in dementia is still poorly understood, although disturbed communication between different brain regions is almost certainly involved. In this study we combine magneto-encephalography (MEG) and network analysis to investigate the role of functional sub-networks (modules) in the brain with regard to cognitive failure in Alzheimer's disease. Whole-head resting-state (MEG) was performed in 18 Alzheimer patients (age 67 ± 9, 6 females, MMSE 23 ± 5) and 18 healthy controls (age 66 ± 9, 11 females, MMSE 29 ± 1). We constructed functional brain networks based on interregional synchronization measurements, and performed graph theoretical analysis with a focus on modular organization. The overall modular strength and the number of modules changed significantly in Alzheimer patients. The parietal cortex was the most highly connected network area, but showed the strongest intramodular losses. Nonetheless, weakening of intermodular connectivity was even more outspoken, and more strongly related to cognitive impairment. The results of this study demonstrate that particularly the loss of communication between different functional brain regions reflects cognitive decline in Alzheimer's disease. These findings imply the relevance of regarding dementia as a functional network disorder.
Collapse
Affiliation(s)
- W de Haan
- Department of Clinical Neurophysiology and MEG, VU University Medical Center, Amsterdam, The Netherlands.
| | | | | | | | | | | |
Collapse
|
496
|
Coscia M, Giannotti F, Pedreschi D. A classification for community discovery methods in complex networks. Stat Anal Data Min 2011. [DOI: 10.1002/sam.10133] [Citation(s) in RCA: 226] [Impact Index Per Article: 16.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
|
497
|
Abstract
The analysis of complex networks permeates all sciences, from biology to sociology. A fundamental, unsolved problem is how to characterize the community structure of a network. Here, using both standard and novel benchmarks, we show that maximization of a simple global parameter, which we call Surprise (S), leads to a very efficient characterization of the community structure of complex synthetic networks. Particularly, S qualitatively outperforms the most commonly used criterion to define communities, Newman and Girvan's modularity (Q). Applying S maximization to real networks often provides natural, well-supported partitions, but also sometimes counterintuitive solutions that expose the limitations of our previous knowledge. These results indicate that it is possible to define an effective global criterion for community structure and open new routes for the understanding of complex networks.
Collapse
|
498
|
Ball B, Karrer B, Newman MEJ. Efficient and principled method for detecting communities in networks. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2011; 84:036103. [PMID: 22060452 DOI: 10.1103/physreve.84.036103] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/27/2011] [Revised: 07/11/2011] [Indexed: 05/31/2023]
Abstract
A fundamental problem in the analysis of network data is the detection of network communities, groups of densely interconnected nodes, which may be overlapping or disjoint. Here we describe a method for finding overlapping communities based on a principled statistical approach using generative network models. We show how the method can be implemented using a fast, closed-form expectation-maximization algorithm that allows us to analyze networks of millions of nodes in reasonable running times. We test the method both on real-world networks and on synthetic benchmarks and find that it gives results competitive with previous methods. We also show that the same approach can be used to extract nonoverlapping community divisions via a relaxation method, and demonstrate that the algorithm is competitively fast and accurate for the nonoverlapping problem.
Collapse
Affiliation(s)
- Brian Ball
- Department of Physics, University of Michigan, Ann Arbor, Michigan 48109, USA
| | | | | |
Collapse
|
499
|
Ronhovde P, Chakrabarty S, Hu D, Sahu M, Sahu KK, Kelton KF, Mauro NA, Nussinov Z. Detecting hidden spatial and spatio-temporal structures in glasses and complex physical systems by multiresolution network clustering. THE EUROPEAN PHYSICAL JOURNAL. E, SOFT MATTER 2011; 34:105. [PMID: 21959545 DOI: 10.1140/epje/i2011-11105-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/02/2011] [Revised: 08/03/2011] [Accepted: 08/05/2011] [Indexed: 05/31/2023]
Abstract
We elaborate on a general method that we recently introduced for characterizing the "natural" structures in complex physical systems via multi-scale network analysis. The method is based on "community detection" wherein interacting particles are partitioned into an "ideal gas" of optimally decoupled groups of particles. Specifically, we construct a set of network representations ("replicas") of the physical system based on interatomic potentials and apply a multiscale clustering ("multiresolution community detection") analysis using information-based correlations among the replicas. Replicas may i) be different representations of an identical static system, ii) embody dynamics by considering replicas to be time separated snapshots of the system (with a tunable time separation), or iii) encode general correlations when different replicas correspond to different representations of the entire history of the system as it evolves in space-time. Inputs for our method are the inter-particle potentials or experimentally measured two (or higher order) particle correlations. We apply our method to computer simulations of a binary Kob-Andersen Lennard-Jones system in a mixture ratio of A(80)B(20) , a ternary model system with components "A", "B", and "C" in ratios of A(88)B(7)C(5) (as in Al(88)Y(7)Fe(5) , and to atomic coordinates in a Zr(80)Pt(20) system as gleaned by reverse Monte Carlo analysis of experimentally determined structure factors. We identify the dominant structures (disjoint or overlapping) and general length scales by analyzing extrema of the information theory measures. We speculate on possible links between i) physical transitions or crossovers and ii) changes in structures found by this method as well as phase transitions associated with the computational complexity of the community detection problem. We also briefly consider continuum approaches and discuss rigidity and the shear penetration depth in amorphous systems; this latter length scale increases as the system becomes progressively rigid.
Collapse
Affiliation(s)
- P Ronhovde
- Department of Physics, Washington University in St. Louis, Campus Box 1105, 1 Brookings Drive, St. Louis, MO 63130, USA
| | | | | | | | | | | | | | | |
Collapse
|
500
|
Huang J, Sun H, Liu Y, Song Q, Weninger T. Towards online multiresolution community detection in large-scale networks. PLoS One 2011; 6:e23829. [PMID: 21887325 PMCID: PMC3161084 DOI: 10.1371/journal.pone.0023829] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2011] [Accepted: 07/25/2011] [Indexed: 11/19/2022] Open
Abstract
The investigation of community structure in networks has aroused great interest in multiple disciplines. One of the challenges is to find local communities from a starting vertex in a network without global information about the entire network. Many existing methods tend to be accurate depending on a priori assumptions of network properties and predefined parameters. In this paper, we introduce a new quality function of local community and present a fast local expansion algorithm for uncovering communities in large-scale networks. The proposed algorithm can detect multiresolution community from a source vertex or communities covering the whole network. Experimental results show that the proposed algorithm is efficient and well-behaved in both real-world and synthetic networks.
Collapse
Affiliation(s)
- Jianbin Huang
- School of Software, Xidian University, Xi'an, China.
| | | | | | | | | |
Collapse
|