1
|
Su X, Xue S, Liu F, Wu J, Yang J, Zhou C, Hu W, Paris C, Nepal S, Jin D, Sheng QZ, Yu PS. A Comprehensive Survey on Community Detection With Deep Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:4682-4702. [PMID: 35263257 DOI: 10.1109/tnnls.2021.3137396] [Citation(s) in RCA: 37] [Impact Index Per Article: 37.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Detecting a community in a network is a matter of discerning the distinct features and connections of a group of members that are different from those in other communities. The ability to do this is of great significance in network analysis. However, beyond the classic spectral clustering and statistical inference methods, there have been significant developments with deep learning techniques for community detection in recent years-particularly when it comes to handling high-dimensional network data. Hence, a comprehensive review of the latest progress in community detection through deep learning is timely. To frame the survey, we have devised a new taxonomy covering different state-of-the-art methods, including deep learning models based on deep neural networks (DNNs), deep nonnegative matrix factorization, and deep sparse filtering. The main category, i.e., DNNs, is further divided into convolutional networks, graph attention networks, generative adversarial networks, and autoencoders. The popular benchmark datasets, evaluation metrics, and open-source implementations to address experimentation settings are also summarized. This is followed by a discussion on the practical applications of community detection in various domains. The survey concludes with suggestions of challenging topics that would make for fruitful future research directions in this fast-growing deep learning field.
Collapse
|
2
|
Yagoub I, Lou Z, Qiu B, Abdul Wahid J, Saad T. Density and node closeness based clustering method for community detection. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2022. [DOI: 10.3233/jifs-220224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Abstract
In a real-world, networked system, the ability to detect communities or clusters has piqued the concern of researchers in a wide range of fields. Many existing methods are simply meant to detect the membership of communities, not the structures of those groups, which is a limitation. We contend that community structures at the local level can also provide valuable insight into their detection. In this study, we developed a simple yet prosperous way of uncovering communities and their cores at the same time while keeping things simple. Essentially, the concept is founded on the theory that the structure of a community may be thought of as a high-density node surrounded by neighbors of minor densities and that community centers are located at a significant distance from one another. We propose a concept termed “community centrality” based on finding motifs to measure the probability of a node becoming the community center in a setting like this and then disseminate multiple, substantial center probabilities all over the network through a node closeness score mechanism. The experimental results show that the proposed method is more efficient than many other already used methods.
Collapse
Affiliation(s)
- Imam Yagoub
- School of Computer and Artificial Intelligence, Zhengzhou University, 450001, China
| | - Zhengzheng Lou
- School of Computer and Artificial Intelligence, Zhengzhou University, 450001, China
| | - Baozhi Qiu
- School of Computer and Artificial Intelligence, Zhengzhou University, 450001, China
| | - Junaid Abdul Wahid
- School of Computer and Artificial Intelligence, Zhengzhou University, 450001, China
| | - Tahir Saad
- Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia
| |
Collapse
|
3
|
|
4
|
Haghbayan SA, Geroliminis N, Akbarzadeh M. Community detection in large scale congested urban road networks. PLoS One 2021; 16:e0260201. [PMID: 34843535 PMCID: PMC8629316 DOI: 10.1371/journal.pone.0260201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2020] [Accepted: 11/04/2021] [Indexed: 11/19/2022] Open
Abstract
Traffic congestion in large urban networks may take different shapes and propagates non-uniformly variations from day to day. Given the fact that congestion on a road segment is spatially correlated to adjacent roads and propagates spatiotemporally with finite speed, it is essential to describe the main pockets of congestion in a city with a small number of clusters. For example, the perimeter control with macroscopic fundamental diagrams is one of the effective traffic management tools. Perimeter control adjusts the inflow to pre-specified regions of a city through signal timing on the border of a region in order to optimize the traffic condition within the region. The precision of macroscopic fundamental diagrams depends on the homogeneity of traffic condition on road segments of the region. Hence, previous studies have defined the boundaries of the region under perimeter control subjected to the regional homogeneity. In this study, a cost-effective method is proposed for the mentioned problem that simultaneously considers homogeneity, contiguity and compactness of clusters and has a shorter computational time. Since it is necessary to control the cost and complexity of perimeter control in terms of the number of traffic signals, sparse parts of the network could be potential candidates for boundaries. Therefore, a community detection method (Infomap) is initially adopted and then those clusters are improved by refining the communities in relation to roads with the highest heterogeneity. The proposed method is applied to Shenzhen, China and San Francisco, USA and the outcomes are compared to previous studies. The results of comparison reveal that the proposed method is as effective as the best previous methods in detecting homogenous communities, but it outperforms them in contiguity. It is worth noting that this is the first method that guarantees the connectedness of clusters, which is a prerequisite of perimeter control.
Collapse
Affiliation(s)
- Seyed Arman Haghbayan
- Department of Transportation Engineering, Isfahan University of Technology, Isfahan, Iran
| | - Nikolas Geroliminis
- Ecole Polytechnique Federale de Lausanne (EPFL), School of Architecture, Civil and Environmental Engineering (ENAC), Urban Transport Systems Laboratory (LUTS), Lausanne, Switzerland
| | - Meisam Akbarzadeh
- Department of Transportation Engineering, Isfahan University of Technology, Isfahan, Iran
| |
Collapse
|
5
|
Analysis of University Students’ Behavior Based on a Fusion K-Means Clustering Algorithm. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10186566] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
With the development of big data technology, creating the ‘Digital Campus’ is a hot issue. For an increasing amount of data, traditional data mining algorithms are not suitable. The clustering algorithm is becoming more and more important in the field of data mining, but the traditional clustering algorithm does not take the clustering efficiency and clustering effect into consideration. In this paper, the algorithm based on K-Means and clustering by fast search and find of density peaks (K-CFSFDP) is proposed, which improves on the distance and density of data points. This method is used to cluster students from four universities. The experiment shows that K-CFSFDP algorithm has better clustering results and running efficiency than the traditional K-Means clustering algorithm, and it performs well in large scale campus data. Additionally, the results of the cluster analysis show that the students of different categories in four universities had different performances in living habits and learning performance, so the university can learn about the students’ behavior of different categories and provide corresponding personalized services, which have certain practical significance.
Collapse
|
6
|
Taheri S, Bouyer A. Community Detection in Social Networks Using Affinity Propagation with Adaptive Similarity Matrix. BIG DATA 2020; 8:189-202. [PMID: 32397731 DOI: 10.1089/big.2019.0143] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Community detection problem is a projection of data clustering where the network's topological properties are only considered for measuring similarities among nodes. Also, finding communities' kernel nodes and expanding a community from kernel will certainly help us to find optimal communities. Among the existing community detection approaches, the affinity propagation (AP)-based method has been showing promising results and does not require any predefined information such as the number of clusters (communities). AP is an exemplar-based clustering method that defines the negative real-valued similarity measure sim(i, k) between data point i and exemplar k as the probability of k being the exemplar of data point i. According to our intuition, the value of sim(i, k) should not be identical to sim(k, i). In this study, a new version of AP using an adaptive similarity matrix, namely affinity propagation with adaptive similarity (APAS) matrix, is proposed, which could efficiently show the leadership probabilities between data points. APAS can adaptively transform the symmetric similarity matrix into an asymmetric one. It outperforms AP method in terms of accuracy. Extensive experiments conducted on artificial and real-world networks demonstrate the effectiveness of our approach.
Collapse
Affiliation(s)
- Sona Taheri
- Department of Computer Engineering, Azarbaijan Shahid Madani University, Tabriz, Iran
| | - Asgarali Bouyer
- Department of Computer Engineering, Azarbaijan Shahid Madani University, Tabriz, Iran
| |
Collapse
|
7
|
Cheng J, Yin X, Li Q, Yang H, Li L, Leng M, Chen X. Voting Simulation based Agglomerative Hierarchical Method for Network Community Detection. Sci Rep 2018; 8:8064. [PMID: 29795231 PMCID: PMC5966462 DOI: 10.1038/s41598-018-26415-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2018] [Accepted: 05/10/2018] [Indexed: 11/26/2022] Open
Abstract
Community detection has been paid much attention in many fields in recent years, and a great deal of community-detection methods have been proposed. But the time consumption of some of them is heavy, limiting them from being applied to large-scale networks. On the contrary, there exist some lower-time-complexity methods. But most of them are non-deterministic, meaning that running the same method many times may yield different results from the same network, which reduces their practical utility greatly in real-world applications. To solve these problems, we propose a community-detection method in this paper, which takes both the quality of the results and the efficiency of the detecting procedure into account. Moreover, it is a deterministic method which can extract definite community structures from networks. The proposed method is inspired by the voting behaviours in election activities in the social society, in which we first simulate the voting procedure on the network. Every vertex votes for the nominated candidates following the proposed voting principles, densely connected groups of vertices can quickly reach a consensus on their candidates. At the end of this procedure, candidates and their own voters form a group of clusters. Then, we take the clusters as initial communities, and agglomerate some of them into larger ones with high efficiency to obtain the resulting community structures. We conducted extensive experiments on some artificial networks and real-world networks, the experimental results show that our proposed method can efficiently extract high-quality community structures from networks, and outperform the comparison algorithms significantly.
Collapse
Affiliation(s)
- Jianjun Cheng
- Lanzhou University, School of Information Science and Engineering, Lanzhou, 730000, China.
- Gansu Resources and Environmental Science Data Engineering Technology Research Center, Lanzhou, 730000, China.
| | - Xinhong Yin
- Lanzhou University, School of Information Science and Engineering, Lanzhou, 730000, China
| | - Qi Li
- Lanzhou University, School of Information Science and Engineering, Lanzhou, 730000, China
| | - Haijuan Yang
- Lanzhou Vocational Technical College, Department of Electronic Information Engineering, Lanzhou, 730070, China
| | - Longjie Li
- Lanzhou University, School of Information Science and Engineering, Lanzhou, 730000, China
| | - Mingwei Leng
- Northwest Minzu University, School of Education Science and Technology, Lanzhou, 730030, China
| | - Xiaoyun Chen
- Lanzhou University, School of Information Science and Engineering, Lanzhou, 730000, China.
| |
Collapse
|
8
|
Micro-blog user community discovery using generalized SimRank edge weighting method. PLoS One 2018; 13:e0196447. [PMID: 29734358 PMCID: PMC5937763 DOI: 10.1371/journal.pone.0196447] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2016] [Accepted: 04/09/2018] [Indexed: 11/19/2022] Open
Abstract
Community discovery is one of the most popular issues in analyzing and understanding a network. Previous research suggests that the discovery can be enhanced by assigning weights to the edges of the network. This paper proposes a novel edge weighting method, which balances both local and global weighting based on the idea of shared neighbor ranging between users and the interpersonal significance of the social network community. We assume that users belonging to the same community have similar relationship network structures. By controlling the measure of "neighborhood", this method can adequately adapt to real-world networks. Therefore, the famous similarity calculation method-SimRank-can be regarded as a special case of our method. According to the practical significance of social networks, we propose a new evaluation method that uses the communication rate to measure its divided demerit to better express users' interaction relations than the ordinary modularity Q. Furthermore, the fast Newman algorithm is extended to weighted networks. In addition, we use four real networks in the largest Chinese micro-blog website Sina. The results of experiments demonstrate that the proposed method easily meets the balancing requirements and is more robust to different kinds of networks. The experimental results also indicate that the proposed algorithm outperforms several conventional weighting methods.
Collapse
|
9
|
Fu YH, Huang CY, Sun CT. A community detection algorithm using network topologies and rule-based hierarchical arc-merging strategies. PLoS One 2017; 12:e0187603. [PMID: 29121100 PMCID: PMC5679540 DOI: 10.1371/journal.pone.0187603] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2017] [Accepted: 10/23/2017] [Indexed: 11/18/2022] Open
Abstract
The authors use four criteria to examine a novel community detection algorithm: (a) effectiveness in terms of producing high values of normalized mutual information (NMI) and modularity, using well-known social networks for testing; (b) examination, meaning the ability to examine mitigating resolution limit problems using NMI values and synthetic networks; (c) correctness, meaning the ability to identify useful community structure results in terms of NMI values and Lancichinetti-Fortunato-Radicchi (LFR) benchmark networks; and (d) scalability, or the ability to produce comparable modularity values with fast execution times when working with large-scale real-world networks. In addition to describing a simple hierarchical arc-merging (HAM) algorithm that uses network topology information, we introduce rule-based arc-merging strategies for identifying community structures. Five well-studied social network datasets and eight sets of LFR benchmark networks were employed to validate the correctness of a ground-truth community, eight large-scale real-world complex networks were used to measure its efficiency, and two synthetic networks were used to determine its susceptibility to two resolution limit problems. Our experimental results indicate that the proposed HAM algorithm exhibited satisfactory performance efficiency, and that HAM-identified and ground-truth communities were comparable in terms of social and LFR benchmark networks, while mitigating resolution limit problems.
Collapse
Affiliation(s)
- Yu-Hsiang Fu
- Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan
| | - Chung-Yuan Huang
- Department of Computer Science and Information Engineering, School of Electrical and Computer Engineering, College of Engineering, Chang Gung University, Taoyuan, Taiwan
- * E-mail:
| | - Chuen-Tsai Sun
- Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan
| |
Collapse
|