201
|
Abstract
One of the main ideas about the Internet is to rethink its services in a user-centric fashion. This translates to human-scale services, with devices that become smarter and make decisions on behalf of their owners. Online Social Networks and, in particular, Online Social Groups (OSGs) such as Facebook Groups will be at the epicentre of this revolution because of their great relevance in current society. Despite the vast number of studies on human behaviour in Online Social Media, the characteristics of OSGs are still unknown. In this paper, we propose a study of the structure of users inside Facebook Groups driven by dynamic community detection. Communities are extracted from the interactions among the members of a group, with the aim of finding groups of users who communicate densely and tracking how these groups evolve over time, in order to discover social properties of OSGs. The analysis covers the activity of 17 Facebook Groups, using 8 community detection algorithms and 2 possible interaction lifespans. Results show that interaction communities in OSGs are very fragmented, but community detection tools are capable of uncovering relevant structures. The study of community quality gives important insights into the community structure, and increasing the interaction lifespan does not necessarily result in more clustered or bigger communities.
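The abstract above does not say which community quality measures were used. As an illustration only, the sketch below computes Newman modularity, one common quality score for a partition of an interaction graph. The function name, the edge-list input format and the toy graph are assumptions made for this example, not details taken from the paper.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to its community label.
    """
    m = 0                          # number of edges
    degree = defaultdict(int)      # degree of each node
    internal = defaultdict(int)    # edges with both endpoints in one community
    for u, v in edges:
        m += 1
        degree[u] += 1
        degree[v] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1

    degree_sum = defaultdict(int)  # total degree per community
    for node, k in degree.items():
        degree_sum[community_of[node]] += k

    # Q = sum over communities of (fraction of internal edges
    #     minus squared fraction of edge endpoints in the community)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph (hypothetical): two triangles joined by a single edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_partition = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(toy_edges, toy_partition))  # 5/14, about 0.357
```

Putting all six nodes in one community gives Q = 0, so a positive score indicates more internal edges than a degree-preserving random graph would produce.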
|
202
|
Verma P, Goyal R. Influence propagation based community detection in complex networks. MACHINE LEARNING WITH APPLICATIONS 2021. [DOI: 10.1016/j.mlwa.2020.100019] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/22/2022] Open
|
203
|
Bouguessa M, Nouri K. BiNeTClus. ACM T INTEL SYST TEC 2021. [DOI: 10.1145/3423067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
Abstract
We investigate the problem of community detection in bipartite networks that are characterized by the presence of two types of nodes such that connections exist only between nodes of different types. While some approaches have been proposed to identify community structures in bipartite networks, there are a number of problems still to solve. In fact, the majority of the proposed approaches suffer from one or even more of the following limitations: (1) difficulty in detecting communities in the presence of many non-discriminating nodes with atypical connections that hide the community structures, (2) loss of relevant topological information due to the transformation of the bipartite network to standard plain graphs, and (3) manually specifying several input parameters, including the number of communities to be identified. To alleviate these problems, we propose BiNeTClus, a parameter-free community detection algorithm in bipartite networks that operates in two phases. The first phase focuses on identifying an initial grouping of nodes through a transactional data model capable of dealing with the situation that involves networks with many atypical connections, that is, sparsely connected nodes and nodes of one type that massively connect to all other nodes of the second type. The second phase aims to refine the clustering results of the first phase via an optimization strategy of the bipartite modularity to identify the final community structures. Our experiments on both synthetic and real networks illustrate the suitability of the proposed approach.
Affiliation(s)
- Khaled Nouri
- University of Quebec at Montreal, Montreal, Canada
|
204
|
Kumari S, Yadav RJ, Namasudra S, Hsu C. Intelligent deception techniques against adversarial attack on the industrial system. INT J INTELL SYST 2021. [DOI: 10.1002/int.22384] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 11/06/2022]
Affiliation(s)
- Suchi Kumari
- Department of Computer Science Engineering, Bennett University, Greater Noida, Uttar Pradesh, India
- Suyel Namasudra
- Department of Computer Science and Engineering, National Institute of Technology Patna, Bihar, India
- Ching‐Hsien Hsu
- School of Mathematics and Big Data, Foshan University, Foshan, China
- Department of Computer Science and Information Engineering, Asia University, Taichung, Taiwan
- Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
|
205
|
Wang F, Han S, Yang J, Yan W, Hu G. Knowledge-Guided "Community Network" Analysis Reveals the Functional Modules and Candidate Targets in Non-Small-Cell Lung Cancer. Cells 2021; 10:cells10020402. [PMID: 33669233 PMCID: PMC7919838 DOI: 10.3390/cells10020402] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/13/2021] [Revised: 02/06/2021] [Accepted: 02/15/2021] [Indexed: 12/24/2022] Open
Abstract
Non-small-cell lung cancer (NSCLC) represents a heterogeneous group of malignancies that are the leading cause of cancer-related death worldwide. Although many NSCLC-related genes and pathways have been identified, there remains an urgent need to mechanistically understand how these genes and pathways drive NSCLC. Here, we propose a knowledge-guided and network-based integration method, called the node and edge Prioritization-based Community Analysis, to identify functional modules and their candidate targets in NSCLC. The protein–protein interaction network was prioritized by performing a random walk with restart algorithm based on NSCLC seed genes and the integrating edge weights, and then a “community network” was constructed by combining Girvan–Newman and Label Propagation algorithms. This systems biology analysis revealed that the CCNB1-mediated network in the largest community provides a modular biomarker, the second community serves as a drug regulatory module, and the two are connected by some contextual signaling motifs. Moreover, integrating structural information into the signaling network suggested novel protein–protein interactions with therapeutic significance, such as interactions between GNG11 and CXCR2, CXCL3, and PPBP. This study provides new mechanistic insights into the landscape of cellular functions in the context of modular networks and will help in developing therapeutic targets for NSCLC.
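The prioritization step mentioned in this abstract relies on a random walk with restart from seed genes over a weighted network. A minimal sketch of that general technique follows, assuming a symmetric weighted adjacency dictionary; the function name, the restart probability of 0.3 and the node names are hypothetical and not taken from the paper.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at seeds.

    adj: dict node -> dict neighbor -> positive edge weight (symmetric);
         every node must appear as a key.
    seeds: non-empty set of seed nodes (e.g. known disease genes).
    restart: probability of jumping back to a seed at each step (assumed value).
    """
    nodes = list(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    strength = {n: sum(adj[n].values()) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            if strength[u] == 0:
                # Isolated node: return its probability mass to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * p0[n]
                continue
            for v, w in adj[u].items():
                # Move along an edge with probability proportional to its weight.
                nxt[v] += (1 - restart) * p[u] * w / strength[u]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy path network with placeholder node names; "g1" is the only seed.
toy = {"g1": {"g2": 1.0}, "g2": {"g1": 1.0, "g3": 1.0},
       "g3": {"g2": 1.0, "g4": 1.0}, "g4": {"g3": 1.0}}
scores = random_walk_with_restart(toy, {"g1"})
print(sorted(scores, key=scores.get, reverse=True))  # nodes ranked by proximity to the seed
```

Nodes close to the seeds through heavy edges receive higher scores, which is what makes the result usable as a node ranking.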
Affiliation(s)
- Fan Wang
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China; (F.W.); (S.H.); (J.Y.)
- Shuqing Han
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China; (F.W.); (S.H.); (J.Y.)
- Ji Yang
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China; (F.W.); (S.H.); (J.Y.)
- Wenying Yan
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China; (F.W.); (S.H.); (J.Y.)
- Correspondence: (W.Y.); (G.H.)
- Guang Hu
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China; (F.W.); (S.H.); (J.Y.)
- State Key Laboratory of Radiation Medicine and Protection, Soochow University, Suzhou 215123, China
- Correspondence: (W.Y.); (G.H.)
|
206
|
Tang M, Pan Q, Qian Y, Tian Y, Al-Nabhan N, Wang X. Parallel label propagation algorithm based on weight and random walk. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2021; 18:1609-1628. [PMID: 33757201 DOI: 10.3934/mbe.2021083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Community detection is a complex and meaningful process, which plays an important role in studying the characteristics of complex networks. In recent years, the discovery and analysis of community structures in complex networks has attracted the attention of many scholars, and many community discovery algorithms have been proposed. Many existing algorithms are only suitable for small-scale data, not for large-scale data, so it is necessary to establish a stable and efficient label propagation algorithm to deal with massive data and complex social networks. In this paper, we propose a novel label propagation algorithm, called WRWPLPA (Parallel Label Propagation Algorithm based on Weight and Random Walk). WRWPLPA introduces a new similarity calculation method combining weights and random walks. It uses weights and similarities to update labels in the process of label propagation, improving the accuracy and stability of community detection. First, a weight is calculated by combining the neighborhood index and the position index, and the weight is used to distinguish the importance of the nodes in the network. Then, a random walk strategy is used to describe the similarity between nodes, and the labels of nodes are updated by combining the weight and similarity. Finally, parallel propagation is introduced to utilize label probability efficiently. Experimental results on artificial and real network datasets show that our algorithm has improved accuracy and stability compared with other label propagation algorithms.
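For orientation, the baseline that WRWPLPA builds on is plain label propagation: every node repeatedly adopts the label that is most frequent among its neighbors. The sketch below shows only that baseline and is not WRWPLPA itself; it omits the weights, the random-walk similarity and the parallel update. The fixed node order and the largest-label tie-break are assumptions made here so that the run is reproducible, whereas standard label propagation uses a random order and random tie-breaking.

```python
from collections import Counter


def label_propagation(adj, max_sweeps=100):
    """Baseline label propagation on an undirected graph.

    adj: dict node -> list of neighbors (symmetric); nodes must be sortable.
    Returns a dict node -> community label.
    """
    labels = {node: node for node in adj}   # every node starts in its own community
    for _ in range(max_sweeps):
        changed = False
        for node in sorted(adj):            # fixed order (assumption, for reproducibility)
            if not adj[node]:
                continue
            counts = Counter(labels[nb] for nb in adj[node])
            best = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == best]
            if labels[node] in candidates:
                continue                    # current label is already a majority label
            labels[node] = max(candidates)  # deterministic tie-break (assumption)
            changed = True
        if not changed:
            break
    return labels


# Toy graph (hypothetical): two triangles joined by the edge 2-3.
toy = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(label_propagation(toy))  # {0: 2, 1: 2, 2: 2, 3: 5, 4: 5, 5: 5}
```

The sensitivity of the outcome to update order and tie-breaking is the kind of instability that weighted variants such as the one described in the abstract aim to reduce.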
Affiliation(s)
- Meili Tang
- Nanjing University of Information Science & Technology, Jiangsu, Nanjing 210044, China
- Qian Pan
- Nanjing University of Information Science & Technology, Jiangsu, Nanjing 210044, China
- Yuan Tian
- Nanjing Institute of Technology, Nanjing 211167, China
- Najla Al-Nabhan
- Department of Computer Science, King Saud University, Riyadh 11362, Saudi Arabia
- Xin Wang
- Huafeng Meteorological Media Group, Beijing 100080, China
|
207
|
Zhao Q, Zhang Y, Shao S, Sun Y, Lin Z. Identification of hub genes and biological pathways in hepatocellular carcinoma by integrated bioinformatics analysis. PeerJ 2021; 9:e10594. [PMID: 33552715 PMCID: PMC7821758 DOI: 10.7717/peerj.10594] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/08/2020] [Accepted: 11/26/2020] [Indexed: 12/18/2022] Open
Abstract
Background: Hepatocellular carcinoma (HCC), the main type of liver cancer in humans, is one of the most prevalent and deadly malignancies in the world. The present study aimed to identify hub genes and key biological pathways by integrated bioinformatics analysis. Methods: A bioinformatics pipeline based on gene co-expression network (GCN) analysis was built to analyze the gene expression profile of HCC. First, differentially expressed genes (DEGs) were identified and a GCN was constructed with Pearson correlation analysis. Then, the gene modules were identified with 3 different community detection algorithms, and the correlation analysis between gene modules and clinical indicators was performed. Moreover, we used the Search Tool for the Retrieval of Interacting Genes (STRING) database to construct a protein–protein interaction (PPI) network of the key gene module, and we identified the hub genes using nine topology analysis algorithms based on this PPI network. Further, we used Oncomine analysis, survival analysis, a GEO data set and the random forest algorithm to verify the important roles of hub genes in HCC. Lastly, we explored the methylation changes of hub genes using another GEO data set (GSE73003). Results: First, among the expression profiles, 4,130 up-regulated genes and 471 down-regulated genes were identified. Next, the multi-level algorithm, which had the highest modularity, divided the GCN into nine gene modules. Also, a key gene module (m1) was identified. The biological processes of GO enrichment of m1 mainly included the processes of mitosis and meiosis and the functions of catalytic and exodeoxyribonuclease activity. Besides, these genes were enriched in the cell cycle and mitotic pathway. Furthermore, we identified 11 hub genes, MCM3, TRMT6, AURKA, CDC20, TOP2A, ECT2, TK1, MCM2, FEN1, NCAPD2 and KPNA2, which played key roles in HCC. The results of multiple verification methods indicated that the 11 hub genes had high diagnostic efficiency in distinguishing tumors from normal tissues. Lastly, the methylation changes of the genes CDC20, TOP2A, TK1 and FEN1 in HCC samples were statistically significant (P-value < 0.05). Conclusion: MCM3, TRMT6, AURKA, CDC20, TOP2A, ECT2, TK1, MCM2, FEN1, NCAPD2 and KPNA2 could be potential biomarkers or therapeutic targets for HCC. Meanwhile, the metabolic pathway, the cell cycle and mitotic pathway might play vital roles in the progression of HCC.
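The network construction step in the Methods (a gene co-expression network built with Pearson correlation) can be sketched as follows. This is a simplified illustration under stated assumptions: the correlation threshold of 0.8, the function names and the toy expression values are all hypothetical, since the abstract does not say how edges were selected.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.8):
    """Connect gene pairs whose absolute correlation reaches the threshold.

    expression: dict gene -> list of expression values, one per sample.
    threshold: assumed cut-off, not a value from the paper.
    Returns a list of (gene_1, gene_2, correlation) tuples.
    """
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Toy expression profiles with placeholder gene names.
toy = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 1, 3, 2]}
print(coexpression_edges(toy))  # only the perfectly correlated pair A-B is kept
```

The resulting edge list is the kind of input a community detection algorithm would then split into gene modules.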
Affiliation(s)
- Qian Zhao
- College of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning, China
- Yan Zhang
- College of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning, China
- Shichun Shao
- College of Environmental Science and Engineering, Dalian Maritime University, Dalian, Liaoning, China
- Yeqing Sun
- College of Environmental Science and Engineering, Dalian Maritime University, Dalian, Liaoning, China
- Zhengkui Lin
- College of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning, China
|
208
|
Zhou J, Jiang Y, Huang B. Source identification of infectious diseases in networks via label ranking. PLoS One 2021; 16:e0245344. [PMID: 33444390 PMCID: PMC7808631 DOI: 10.1371/journal.pone.0245344] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/12/2020] [Accepted: 12/28/2020] [Indexed: 11/18/2022] Open
Abstract
Background: Outbreaks of infectious diseases cause great losses to human society. Source identification in networks has drawn considerable interest as a way to understand and control infectious disease propagation. Unsatisfactory accuracy and high time complexity are major obstacles to the practical application of existing source identification algorithms in real-world situations. Methods: This study attempts to measure the possibility for nodes to become the infection source through label ranking. A unified label ranking framework for source identification with complete observation and with a snapshot is proposed. First, a basic label ranking algorithm is designed for complete observation of the network, considering both infected and uninfected nodes. The inferred infection source, i.e. the node with the highest label ranking, tends to have more infected nodes surrounding it, which makes it likely to be in the center of the infection subgraph and far from the uninfected frontier. A two-stage algorithm for source identification via semi-supervised learning and label ranking is further proposed to address source identification with a snapshot. Results: Extensive experiments are conducted on both synthetic and real-world network datasets. The proposed label ranking algorithms identify the propagation source fairly accurately under different situations, with acceptable computational complexity and without knowing the underlying model of infection propagation. Conclusions: The effectiveness and efficiency of the label ranking algorithms proposed in this study make them of practical value for infection source identification.
Affiliation(s)
- Jianye Zhou
- Department of Automation, Tsinghua University, Beijing, PR China
- Yuewen Jiang
- Clinical College of Chinese Medicine, Hubei University of Chinese Medicine, Wuhan, Hubei, PR China
- * E-mail: (YJ); (BH)
- Biqing Huang
- Department of Automation, Tsinghua University, Beijing, PR China
- * E-mail: (YJ); (BH)
|
209
|
Discrimination of Tomato Maturity Using Hyperspectral Imaging Combined with Graph-Based Semi-supervised Method Considering Class Probability Information. FOOD ANAL METHOD 2021. [DOI: 10.1007/s12161-020-01955-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 10/22/2022]
|
210
|
Caldarelli G, De Nicola R, Petrocchi M, Pratelli M, Saracco F. Flow of online misinformation during the peak of the COVID-19 pandemic in Italy. EPJ DATA SCIENCE 2021; 10:34. [PMID: 34249599 PMCID: PMC8258478 DOI: 10.1140/epjds/s13688-021-00289-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 10/05/2020] [Accepted: 06/14/2021] [Indexed: 05/16/2023]
Abstract
The COVID-19 pandemic has impacted every human activity and, because of the urgency of finding proper responses to such an unprecedented emergency, it generated a widespread societal debate. The online version of this discussion was not exempt from misinformation campaigns but, unlike what had been witnessed in other debates, the flow of false information about COVID-19, intentional or not, put public health at severe risk, possibly reducing the efficacy of government countermeasures. In this manuscript, we study the effective impact of misinformation in the Italian societal debate on Twitter during the pandemic, focusing on the various discursive communities. To extract such communities, we start from verified users, i.e., accounts whose identity is officially certified by Twitter. For each pair of verified users, we count how many unverified users interacted with both of them via tweets or retweets: if this number is statistically significant, i.e., so great that it cannot be explained only by their activity on the online social network, we consider the two verified accounts similar and connect them with a link in a monopartite network of verified users. The discursive communities can then be found by running a community detection algorithm on this network. We observe that, despite being a mostly scientific subject, the COVID-19 discussion shows a clear division into what turn out to be different political groups. We filter the network of retweets from random noise and check the presence of messages displaying URLs. Using the well-known browser extension NewsGuard, we assess the trustworthiness of the most recurrent news sites among those tweeted by the political groups. The impact of low-reputation posts reaches 22.1% in the right and center-right wing community, and its contribution is even stronger in absolute numbers due to the activity of this group: 96% of all non-reputable URLs shared by political groups come from this community. Supplementary information: The online version contains supplementary material available at 10.1140/epjds/s13688-021-00289-4.
Affiliation(s)
- Guido Caldarelli
- Department of Molecular Sciences and Nanosystems, Ca’ Foscari University of Venice, Ed. Alfa, Via Torino 155, 30170 Venezia Mestre, Italy
- European Centre for Living Technology (ECLT), Ca’ Bottacin, 3911 Dorsoduro Calle Crosera, 30123 Venice, Italy
- IMT School For Advanced Studies Lucca, Piazza San Francesco 19, 55100 Lucca, Italy
- Rocco De Nicola
- IMT School For Advanced Studies Lucca, Piazza San Francesco 19, 55100 Lucca, Italy
- CINI – National Laboratory for Cybersecurity, via Ariosto, 25, 00185 Roma, Italy
- Marinella Petrocchi
- IMT School For Advanced Studies Lucca, Piazza San Francesco 19, 55100 Lucca, Italy
- Institute of Informatics and Telematics, National Research Council, via Moruzzi 1, 56124 Pisa, Italy
- Manuel Pratelli
- IMT School For Advanced Studies Lucca, Piazza San Francesco 19, 55100 Lucca, Italy
- Fabio Saracco
- IMT School For Advanced Studies Lucca, Piazza San Francesco 19, 55100 Lucca, Italy
|
211
|
Qiao HH, Deng ZH, Li HJ, Hu J, Song Q, Gao L. Research on historical phase division of terrorism: An analysis method by time series complex network. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.07.125] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 01/11/2023]
|
212
|
Okuda M, Satoh S, Sato Y, Kidawara Y. Community Detection Using Restrained Random-Walk Similarity. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2021; 43:89-103. [PMID: 31265385 DOI: 10.1109/tpami.2019.2926033] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/09/2023]
Abstract
In this paper, we propose a restrained random-walk similarity method for detecting the community structures of graphs. The basic premise of our method is that the starting vertices of finite-length random walks are judged to be in the same community if the walkers pass similar sets of vertices. This idea is based on our consideration that a random walker tends to move in the community including the walker's starting vertex for some time after starting the walk. Therefore, the sets of vertices passed by random walkers starting from vertices in the same community must be similar. The idea is reinforced with two conditions. First, we exclude abnormal random walks. Random walks that depart from each vertex are executed many times, and vertices that are rarely passed by the walkers are excluded from the set of vertices that the walkers may pass. Second, we forcibly restrain random walks to an appropriate length. In our method, a random walk is terminated when the walker repeatedly visits vertices that they have already passed. Experiments on real-world networks demonstrate that our method outperforms previous techniques in terms of accuracy.
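The premise of this abstract (vertices belong together if walkers starting from them pass similar vertex sets) can be illustrated with a rough sketch. The version below is an assumption-laden simplification rather than the authors' method: it uses Jaccard similarity to compare visited sets, ends a walk after a fixed number of revisits, and drops vertices reached in fewer than 10% of the walks. All parameter values and function names are hypothetical.

```python
import random


def visited_profile(adj, start, n_walks=200, max_len=20, max_revisits=3,
                    min_fraction=0.1, rng=None):
    """Set of vertices that walkers starting at `start` typically pass.

    adj: dict node -> list of neighbors.
    A walk ends after `max_revisits` returns to already-seen vertices
    (the "restraint") or after `max_len` steps, whichever comes first.
    Vertices seen in fewer than `min_fraction` of the walks are discarded.
    """
    rng = rng or random.Random(0)
    counts = {}
    for _ in range(n_walks):
        seen = {start}
        current, revisits = start, 0
        for _ in range(max_len):
            if not adj[current]:
                break
            current = rng.choice(adj[current])
            if current in seen:
                revisits += 1
                if revisits >= max_revisits:
                    break
            else:
                seen.add(current)
        for v in seen:
            counts[v] = counts.get(v, 0) + 1
    return {v for v, c in counts.items() if c >= min_fraction * n_walks}


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

Pairs of starting vertices with a high `jaccard(visited_profile(...), visited_profile(...))` value would then be candidates for the same community.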
|
213
|
Aono AH, Pimenta RJG, Garcia ALB, Correr FH, Hosaka GK, Carrasco MM, Cardoso-Silva CB, Mancini MC, Sforça DA, dos Santos LB, Nagai JS, Pinto LR, Landell MGDA, Carneiro MS, Balsalobre TW, Quiles MG, Pereira WA, Margarido GRA, de Souza AP. The Wild Sugarcane and Sorghum Kinomes: Insights Into Expansion, Diversification, and Expression Patterns. FRONTIERS IN PLANT SCIENCE 2021; 12:668623. [PMID: 34305969 PMCID: PMC8294386 DOI: 10.3389/fpls.2021.668623] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 02/16/2021] [Accepted: 03/17/2021] [Indexed: 05/11/2023]
Abstract
The protein kinase (PK) superfamily is one of the largest superfamilies in plants and the core regulator of cellular signaling. Despite this substantial importance, the kinomes of sugarcane and sorghum have not been profiled. Here, we identified and profiled the complete kinomes of the polyploid Saccharum spontaneum (Ssp) and Sorghum bicolor (Sbi), a close diploid relative. The Sbi kinome was composed of 1,210 PKs; for Ssp, we identified 2,919 PKs when disregarding duplications and allelic copies, and these were related to 1,345 representative gene models. The Ssp and Sbi PKs were grouped into 20 groups and 120 subfamilies and exhibited high compositional similarities and evolutionary divergences. By utilizing the collinearity between the species, this study offers insights into Sbi and Ssp speciation, PK differentiation and selection. We assessed the PK subfamily expression profiles via RNA-Seq and identified significant similarities between Sbi and Ssp. Moreover, coexpression networks allowed inference of a core structure of kinase interactions with specific key elements. This study provides the first categorization of the allelic specificity of a kinome and offers a wide reservoir of molecular and genetic information, thereby enhancing the understanding of Sbi and Ssp PK evolutionary history.
Affiliation(s)
- Alexandre Hild Aono
- Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Ricardo José Gonzaga Pimenta
- Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Ana Letycia Basso Garcia
- Department of Genetics, Luiz de Queiroz College of Agriculture (ESALQ), University of São Paulo (USP), Piracicaba, Brazil
- Fernando Henrique Correr
- Department of Genetics, Luiz de Queiroz College of Agriculture (ESALQ), University of São Paulo (USP), Piracicaba, Brazil
- Guilherme Kenichi Hosaka
- Department of Genetics, Luiz de Queiroz College of Agriculture (ESALQ), University of São Paulo (USP), Piracicaba, Brazil
- Marishani Marin Carrasco
- Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Melina Cristina Mancini
- Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Danilo Augusto Sforça
- Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Lucas Borges dos Santos
- Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- James Shiniti Nagai
- Faculty of Medicine, Institute for Computational Genomics, RWTH Aachen University, Aachen, Germany
- Luciana Rossini Pinto
- Advanced Center of Sugarcane Agrobusiness Technological Research, Agronomic Institute of Campinas (IAC), Ribeirão Preto, Brazil
- Monalisa Sampaio Carneiro
- Departamento de Biotecnologia e Produção Vegetal e Animal, Centro de Ciências Agrárias, Universidade Federal de São Carlos (UFSCar), São Carlos, Brazil
- Thiago Willian Balsalobre
- Departamento de Biotecnologia e Produção Vegetal e Animal, Centro de Ciências Agrárias, Universidade Federal de São Carlos (UFSCar), São Carlos, Brazil
- Marcos Gonçalves Quiles
- Instituto de Ciência e Tecnologia (ICT), Universidade Federal de São Paulo (Unifesp), São José dos Campos, Brazil
- Anete Pereira de Souza
- Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Department of Plant Biology, Institute of Biology, University of Campinas (UNICAMP), Campinas, Brazil
- *Correspondence: Anete Pereira de Souza
|
214
|
Overlapping Community Detection Based on Membership Degree Propagation. ENTROPY 2020; 23:e23010015. [PMID: 33374305 PMCID: PMC7824673 DOI: 10.3390/e23010015] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 09/30/2020] [Revised: 12/19/2020] [Accepted: 12/22/2020] [Indexed: 11/17/2022]
Abstract
A community in a complex network refers to a group of nodes that are densely connected internally but with only sparse connections to the outside. Overlapping community structures are ubiquitous in real-world networks, where each node belongs to at least one community. Therefore, overlapping community detection is an important topic in complex network research. This paper proposes an overlapping community detection algorithm based on membership degree propagation that is driven by both global and local information of the node community. In the method, we introduce a concept of membership degree, which not only stores the label information, but also the degrees of the node belonging to the labels. Then the conventional label propagation process could be extended to membership degree propagation, with the results mapped directly to the overlapping community division. Therefore, it obtains the partition result and overlapping node identification simultaneously and greatly reduces the computational time. The proposed algorithm was applied to a synthetic Lancichinetti–Fortunato–Radicchi (LFR) dataset and nine real-world datasets and compared with other up-to-date algorithms. The experimental results show that our proposed algorithm is effective and outperforms the comparison methods on most datasets. Our proposed method significantly improved the accuracy and speed of the overlapping node prediction. It can also substantially alleviate the computational complexity of community structure detection in general.
|
215
|
Sheng J, Liu C, Chen L, Wang B, Zhang J. Research on Community Detection in Complex Networks Based on Internode Attraction. ENTROPY (BASEL, SWITZERLAND) 2020; 22:E1383. [PMID: 33297386 PMCID: PMC7762263 DOI: 10.3390/e22121383] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/24/2020] [Accepted: 12/03/2020] [Indexed: 11/28/2022]
Abstract
With the rapid development of computer technology, research on complex networks has attracted more and more attention. At present, highly active research directions such as cloud computing, big data, the internet of vehicles, and distributed systems are all based on complex networks. Community structure detection is a very important and meaningful research hotspot in complex networks. Dividing the community structure quickly and accurately, and doing so on large-scale networks, is a difficult task. In this paper, we put forward a new community detection approach based on internode attraction, named IACD. The algorithm starts from the important nodes of the complex network, borrows the gravitational relationship between two objects in physics to represent the forces between nodes in the network dataset, and then performs community detection. Experiments on a large number of real-world datasets and synthetic networks show that the IACD algorithm can quickly and accurately divide the community structure, and that it is superior to some classic and recently proposed algorithms.
Affiliation(s)
- Bin Wang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China; (J.S.); (C.L.); (L.C.); (J.Z.)
|
216
|
Zhao Q, Zhang Y, Zhang X, Sun Y, Lin Z. Mining of gene modules and identification of key genes in head and neck squamous cell carcinoma based on gene co-expression network analysis. Medicine (Baltimore) 2020; 99:e22655. [PMID: 33285674 PMCID: PMC7717835 DOI: 10.1097/md.0000000000022655] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/15/2020] [Revised: 09/01/2020] [Accepted: 09/10/2020] [Indexed: 11/26/2022] Open
Abstract
To explore the gene modules and key genes of head and neck squamous cell carcinoma (HNSCC), a bioinformatics algorithm based on gene co-expression network analysis was proposed in this study. First, differentially expressed genes (DEGs) were identified and a gene co-expression network (i-GCN) was constructed with Pearson correlation analysis. Then, the gene modules were identified with 5 different community detection algorithms, and the correlation analysis between gene modules and clinical indicators was performed. Gene Ontology (GO) analysis was used to annotate the biological pathways of the gene modules. Then, the key genes were identified with 2 methods, gene significance (GS) and the PageRank algorithm. Moreover, we used the DisGeNET database to search for diseases related to the key genes. Lastly, the online tool OncoLnc was used to perform survival analysis on the key genes and draw survival curves. There were 2600 up-regulated and 1547 down-regulated genes identified in HNSCC. An i-GCN was constructed with Pearson correlation analysis. Then, the i-GCN was divided into 9 gene modules. The association analysis showed that sex was mainly related to mitosis and meiosis processes; event was mainly related to responses to interferons and viruses and to T cell differentiation processes; T stage was mainly related to muscle development and contraction and to the regulation of protein transport activity; N stage was mainly related to mitosis and meiosis processes; and M stage was mainly related to responses to interferons and immune response processes. Lastly, 34 key genes were identified, such as CDKN2A, HOXA1, CDC7, PPL, EVPL, PXN, PDGFRB, CALD1, and NUSAP1. Among them, HOXA1, PXN, and NUSAP1 were negatively correlated with survival prognosis. HOXA1, PXN, and NUSAP1 might play important roles in the progression of HNSCC and serve as potential biomarkers for future diagnosis.
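One of the two key-gene criteria named above is PageRank. A minimal power-iteration sketch of PageRank on an undirected network is shown below; the damping factor of 0.85 is the conventional default rather than a value reported in the abstract, and the node names are placeholders.

```python
def pagerank(adj, damping=0.85, tol=1e-10, max_iter=500):
    """PageRank scores by power iteration.

    adj: dict node -> list of neighbors (for an undirected network,
         list each edge in both directions); every node must be a key.
    damping: probability of following an edge (assumed default).
    """
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without neighbors is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not adj[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for u in nodes:
            for v in adj[u]:
                new[v] += damping * rank[u] / len(adj[u])
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Toy star network with placeholder names: "hub" is linked to three other nodes.
toy = {"hub": ["g1", "g2", "g3"], "g1": ["hub"], "g2": ["hub"], "g3": ["hub"]}
scores = pagerank(toy)
print(max(scores, key=scores.get))  # hub
```

In a co-expression module, the highest-scoring nodes would be the candidates for key genes under this criterion.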
Affiliation(s)
- Qian Zhao
- College of Information Science and Technology
| | - Yan Zhang
- College of Information Science and Technology
| | - Xue Zhang
- College of Information Science and Technology
| | - Yeqing Sun
- Institute of Environmental System Biology, College of Environmental Science and Engineering, Dalian Maritime University, Dalian, China
| | | |
|
217
|
Correspondence analysis-based network clustering and importance of degenerate solutions unification of spectral clustering and modularity maximization. SOCIAL NETWORK ANALYSIS AND MINING 2020. [DOI: 10.1007/s13278-020-00686-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
|
218
|
Tamarit I, Pereda M, Cuesta JA. Hierarchical clustering of bipartite data sets based on the statistical significance of coincidences. Phys Rev E 2020; 102:042304. [PMID: 33212688 DOI: 10.1103/physreve.102.042304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2020] [Accepted: 09/13/2020] [Indexed: 11/07/2022]
Abstract
When some 'entities' are related by the 'features' they share they are amenable to a bipartite network representation. Plant-pollinator ecological communities, co-authorship of scientific papers, customers and purchases, or answers in a poll, are but a few examples. Analyzing clustering of such entities in the network is a useful tool with applications in many fields, like internet technology, recommender systems, or detection of diseases. The algorithms most widely applied to find clusters in bipartite networks are variants of modularity optimization. Here, we provide a hierarchical clustering algorithm based on a dissimilarity between entities that quantifies the probability that the features shared by two entities are due to mere chance. The algorithm performance is O(n^{2}) when applied to a set of n entities, and its outcome is a dendrogram exhibiting the connections of those entities. Through the introduction of a 'susceptibility' measure we can provide an 'optimal' choice for the clustering as well as quantify its quality. The dendrogram reveals further useful structural information, though, such as the existence of subclusters within clusters or of nodes that do not fit in any cluster. We illustrate the algorithm by applying it first to a set of synthetic networks, and then to a selection of examples. We also illustrate how to transform our algorithm into a valid alternative for one-mode networks as well, and show that it performs at least as well as the standard, modularity-based algorithms, with a higher numerical performance. We provide an implementation of the algorithm in Python freely accessible from GitHub.
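The idea of a dissimilarity that measures how likely a feature overlap is under pure chance can be illustrated with a hypergeometric tail probability. This is a sketch under an assumed null model (the second entity's features are drawn uniformly at random from all features); the abstract does not specify the null model, so the function name `chance_overlap_pvalue` and the formula are illustrative rather than the paper's definition.

```python
from math import comb


def chance_overlap_pvalue(features_a, features_b, n_features):
    """P(overlap >= observed) if B's features were drawn at random.

    features_a, features_b: sets of feature identifiers.
    n_features: total number of distinct features in the data set.
    A small value means the shared features are unlikely to be due to
    chance, so the two entities can be treated as similar.
    """
    a, b = len(features_a), len(features_b)
    k = len(features_a & features_b)
    total = comb(n_features, b)
    tail = sum(
        comb(a, i) * comb(n_features - a, b - i)
        for i in range(k, min(a, b) + 1)
    )
    return tail / total


# Two entities with identical feature sets versus two with disjoint sets,
# out of 10 possible features.
print(chance_overlap_pvalue({0, 1, 2}, {0, 1, 2}, 10))
print(chance_overlap_pvalue({0, 1, 2}, {7, 8, 9}, 10))
```

Computing this value for every pair of entities gives a dissimilarity matrix that a standard agglomerative procedure could turn into a dendrogram; the pairwise step alone already costs a number of evaluations quadratic in the number of entities.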
Affiliation(s)
- Ignacio Tamarit
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Departamento de Matemáticas de la Universidad Carlos III de Madrid, Leganés, Spain.,Unidad Mixta Interdisciplinar de Comportamiento y Complejidad Social (UMICCS), Madrid, Spain
| | - María Pereda
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Departamento de Matemáticas de la Universidad Carlos III de Madrid, Leganés, Spain.,Unidad Mixta Interdisciplinar de Comportamiento y Complejidad Social (UMICCS), Madrid, Spain.,Grupo de Investigación Ingeniería de Organización y Logística (IOL), Escuela Técnica Superior de Ingenieros Industriales, Universidad Politécnica de Madrid, Madrid, Spain
| | - José A Cuesta
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Departamento de Matemáticas de la Universidad Carlos III de Madrid, Leganés, Spain.,Unidad Mixta Interdisciplinar de Comportamiento y Complejidad Social (UMICCS), Madrid, Spain.,Instituto de Biocomputación y Física de Sistemas Complejos (BIFI), Universidad de Zaragoza, Zaragoza, Spain.,UC3M-Santander Big Data Institute (IBiDat), Getafe, Spain
| |
|
219
|
Yang B, Sun Y, Huang S. Measuring visibility of disciplines on Chinese academic web. J Inf Sci 2020. [DOI: 10.1177/0165551520968059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
This study proposes a hierarchy affiliation model (department–school–university) to build a network between web entities, jointly taking into account the domain names, the topological structure of the academic network and the disciplinary characteristics of schools and universities. The study of the Chinese academic web based on the model shows that at the school level, 68 of 95 disciplines (71.6%) are identified from the directed school network and 71 from the undirected school network; at the university level, four out of seven broad disciplines are found. Furthermore, according to the comparative result based on three types of relations (hyperlinks, citations and collaborations) among universities, we would argue with caution that the structure of the academic web is potentially more suitable for tracing the common interests between institutions.
Affiliation(s)
- Bo Yang
- School of Information Management, Nanjing Agricultural University, P.R. China
| | - Ying Sun
- Department of Library and Information Studies, University at Buffalo, USA
| | - Shan Huang
- Computing Center, Nanjing Agricultural University, P.R. China
| |
|
220
|
Allen GR, Schwartz FW, Cole DR, Lanno RP, Prabhu A, Eleish A. Algal blooms in a freshwater reservoir – A network community detection analysis of potential forcing parameters. ECOL INFORM 2020. [DOI: 10.1016/j.ecoinf.2020.101168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
|
221
|
Zhang Y, Liu Y, Li Q, Jin R, Wen C. LILPA: A label importance based label propagation algorithm for community detection with application to core drug discovery. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.06.088] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
|
222
|
Ge J, Sun H, Xue C, He L, Jia X, He H, Chen J. LPX: Overlapping community detection based on X‐means and label propagation algorithm in attributed networks. Comput Intell 2020. [DOI: 10.1111/coin.12420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Affiliation(s)
- Jinhuan Ge
- School of Computer Science and Technology Xi'an Jiaotong University Xi'an China
- First Affiliated Hospital Wenzhou Medical University Wenzhou China
| | - Heli Sun
- School of Computer Science and Technology Xi'an Jiaotong University Xi'an China
- School of Journalism and New Media Xi'an Jiaotong University Xi'an China
| | - Chenhao Xue
- School of Computer Science and Technology Xi'an Jiaotong University Xi'an China
| | - Liang He
- School of Computer Science and Technology Xi'an Jiaotong University Xi'an China
| | - Xiaolin Jia
- School of Computer Science and Technology Xi'an Jiaotong University Xi'an China
| | - Hui He
- School of Computer Science and Technology Xi'an Jiaotong University Xi'an China
| | - Jiyin Chen
- School of Journalism and New Media Xi'an Jiaotong University Xi'an China
| |
|
223
|
Morselli Gysi D, de Miranda Fragoso T, Zebardast F, Bertoli W, Busskamp V, Almaas E, Nowick K. Whole transcriptomic network analysis using Co-expression Differential Network Analysis (CoDiNA). PLoS One 2020; 15:e0240523. [PMID: 33057419 PMCID: PMC7561188 DOI: 10.1371/journal.pone.0240523] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2020] [Accepted: 09/29/2020] [Indexed: 01/05/2023] Open
Abstract
Biological and medical sciences are increasingly acknowledging the significance of gene co-expression networks for investigating complex systems, phenotypes or diseases. Typically, complex phenotypes are investigated under varying conditions. While approaches for comparing nodes and links in two networks exist, almost no methods for the comparison of multiple networks are available and, to the best of our knowledge, no comparative method allows for whole transcriptomic network analysis. However, it is the aim of many studies to compare networks of different conditions, for example, tissues, diseases, treatments, time points, or species. Here we present a method for the systematic comparison of an unlimited number of networks, with an unlimited number of transcripts: Co-expression Differential Network Analysis (CoDiNA). In particular, CoDiNA detects links and nodes that are common, specific or different among the networks. We developed a statistical framework to normalize between these different categories of common or changed network links and nodes, resulting in a comprehensive network analysis method, more sophisticated than simply comparing the presence or absence of network nodes. Applying CoDiNA to a neurogenesis study we identified candidate genes involved in neuronal differentiation. We experimentally validated one candidate, demonstrating that its overexpression resulted in a significant disturbance in the underlying gene regulatory network of neurogenesis. Using clinical studies, we compared whole transcriptome co-expression networks from individuals with or without HIV and active tuberculosis (TB) and detected signature genes specific to HIV. Furthermore, analyzing multiple cancer transcription factor (TF) networks, we identified common and distinct features for particular cancer types. These CoDiNA applications demonstrate the successful detection of genes associated with specific phenotypes. Moreover, CoDiNA can also be used for comparing other types of undirected networks, for example, metabolic, protein-protein interaction, ecological and psychometric networks. CoDiNA is publicly available as an R package in CRAN (https://CRAN.R-project.org/package=CoDiNA).
Affiliation(s)
- Deisy Morselli Gysi
- Department of Computer Science, Leipzig University, Leipzig, Germany
| | | | - Fatemeh Zebardast
- Department of Biology, Chemistry, Pharmacy, Freie Universitaet Berlin, Berlin, Germany
| | - Wesley Bertoli
- Department of Statistics, Federal University of Technology - Paraná, Curitiba, Brazil
| | - Volker Busskamp
- Center for Regenerative Therapies (CRTD), Technical University Dresden, Dresden, Germany
- Dept. of Ophthalmology, Universitäts-Augenklinik Bonn, University of Bonn, Bonn, Germany
| | - Eivind Almaas
- Department of Biotechnology, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- K.G. Jebsen Centre for Genetic Epidemiology, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
| | - Katja Nowick
- Department of Biology, Chemistry, Pharmacy, Freie Universitaet Berlin, Berlin, Germany
| |
|
224
|
Abstract
Community detection is one of the most popular research topics across a variety of complex systems, ranging from biology to sociology. In recent years, there has been an increasing focus on more complicated networks, namely multilayer networks. Communities in a single-layer network are groups of nodes that are more strongly connected among themselves than to the rest of the network, while in multilayer networks a group of well-connected nodes is shared across multiple layers. Most traditional algorithms rarely perform well on a multilayer network without modification. Thus, in this paper, we offer overall comparisons of existing works and analyze several representative algorithms, providing a comprehensive understanding of community detection methods in multilayer networks. The comparison results indicate that improvements in algorithm efficiency and extensions to general multilayer networks are expected in forthcoming studies.
|
225
|
Zarei B, Meybodi MR, Masoumi B. Detecting community structure in signed and unsigned social networks by using weighted label propagation. CHAOS (WOODBURY, N.Y.) 2020; 30:103118. [PMID: 33138454 DOI: 10.1063/1.5144139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/03/2020] [Accepted: 10/01/2020] [Indexed: 06/11/2023]
Abstract
Detecting community structure is one of the most important problems in analyzing complex networks such as technological, informational, biological, and social networks and has great importance in understanding the operation and organization of these networks. One of the significant properties of social networks is the communication intensity between the users, which has not received much attention so far. Most of the proposed methods for detecting community structure in social networks have only considered communications between users. In this paper, using MinHash and label propagation, an algorithm called weighted label propagation algorithm (WLPA) has been proposed to detect community structure in signed and unsigned social networks. WLPA takes into account the intensity of communications in addition to the communications. In WLPA, first, the similarity of all adjacent nodes is estimated by using MinHash. Then, each edge is assigned a weight equal to the estimated similarity of its end nodes. The weights assigned to the edges somehow indicate the intensity of communication between users. Finally, the community structure of the network is determined through the weighted label propagation. Experiments on the benchmark networks indicate that WLPA is efficient and effective for detecting community structure in both signed and unsigned social networks.
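The three steps described in this abstract (estimate the similarity of adjacent nodes with MinHash, use it as the edge weight, then run weighted label propagation) can be sketched as follows. Several details are assumptions made for illustration and are not taken from the paper: similarity is computed on closed neighbourhoods (a node plus its neighbours), hash functions are derived from seeded BLAKE2b digests, nodes are updated in sorted order with deterministic tie-breaking, and only unsigned networks are handled.

```python
import hashlib
from collections import defaultdict


def minhash_signature(items, num_hashes=64):
    """MinHash signature: the minimum hash of the set under each seeded hash."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimated_similarity(sig_a, sig_b):
    """Fraction of matching positions, an estimate of the Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def weighted_label_propagation(adjacency, num_hashes=64, max_rounds=20):
    """adjacency: dict mapping each node to the set of its neighbours."""
    # Closed neighbourhoods (assumption): adjacent nodes always share something.
    sigs = {u: minhash_signature(nbrs | {u}, num_hashes)
            for u, nbrs in adjacency.items()}
    weight = {(u, v): estimated_similarity(sigs[u], sigs[v])
              for u in adjacency for v in adjacency[u]}
    labels = {u: u for u in adjacency}  # every node starts in its own community
    for _ in range(max_rounds):
        changed = False
        for u in sorted(adjacency):
            votes = defaultdict(float)
            for v in adjacency[u]:
                votes[labels[v]] += weight[(u, v)]
            if not votes:
                continue  # isolated node keeps its own label
            # Heaviest label wins; ties go to the smallest label.
            best = max(sorted(votes), key=votes.get)
            if best != labels[u]:
                labels[u] = best
                changed = True
        if not changed:
            break
    return labels


# Two triangles joined by a single bridge edge (2-3).
graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
print(weighted_label_propagation(graph))
```

Because the bridge endpoints share few neighbours, the bridge edge tends to receive a low estimated weight, which is what discourages labels from crossing between the two triangles. The exact weights depend on the hash estimates, so no output is shown.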
Affiliation(s)
- Bagher Zarei
- Faculty of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin 3419915195, Iran
| | - Mohammad Reza Meybodi
- Department of Computer Engineering and Information Technology, Amirkabir University of Technology, Tehran 1591634311, Iran
| | - Behrooz Masoumi
- Faculty of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin 3419915195, Iran
| |
|
226
|
Sun Z, Sun Y, Chang X, Wang Q, Yan X, Pan Z, Li ZP. Community detection based on the Matthew effect. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.106256] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
227
|
Yang Y, Liu H, Guan Z, He X, Liu G. CoHomo: A cluster-attribute correlation aware graph clustering framework. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.06.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/01/2023]
|
228
|
Zhang Y, Liu Y, Jin R, Tao J, Chen L, Wu X. GLLPA: A Graph Layout based Label Propagation Algorithm for community detection. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.106363] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
|
229
|
Smith NR, Zivich PN, Frerichs LM, Moody J, Aiello AE. A Guide for Choosing Community Detection Algorithms in Social Network Studies: The Question Alignment Approach. Am J Prev Med 2020; 59:597-605. [PMID: 32951683 PMCID: PMC7508227 DOI: 10.1016/j.amepre.2020.04.015] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/13/2019] [Revised: 04/17/2020] [Accepted: 04/22/2020] [Indexed: 11/15/2022]
Abstract
INTRODUCTION: Community detection, the process of identifying subgroups of highly connected individuals within a network, is an aspect of social network analysis that is relevant but potentially underutilized in prevention research. Guidance on using community detection methods stresses aligning methods with specific research questions but lacks clear operationalization. The Question Alignment approach was developed to help address this gap and promote the high-quality use of community detection methods. METHODS: A total of 6 community detection methods are discussed: Walktrap, Edge-Betweenness, Infomap, Louvain, Label Propagation, and Spinglass. The Question Alignment approach is described and demonstrated using real-world data collected in 2013. This hypothetical case study was conducted in 2019 and focused on targeting a hand hygiene intervention to high-risk communities to prevent influenza transmission. RESULTS: Community detection using the Walktrap method best fit the hypothetical case study. The communities derived using the Walktrap method were quite different from communities derived through the other 5 methods in both the number of communities and individuals within communities. There was evidence to support that the Question Alignment approach can help researchers produce more useful community detection results. Compared to other methods of selecting high-risk groups, the Walktrap produced the most communities that met the hypothetical intervention requirements. CONCLUSIONS: As prevention research incorporating social networks increases, researchers can use the Question Alignment approach to produce more theoretically meaningful results and potentially more useful results for practice. Future research should focus on assessing whether the Question Alignment approach translates into improved intervention results.
Affiliation(s)
- Natalie R Smith
- Department of Health Policy and Management, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina; Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina.
| | - Paul N Zivich
- Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina; Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - Leah M Frerichs
- Department of Health Policy and Management, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - James Moody
- Department of Sociology, Duke University, Durham, North Carolina; Department of Sociology, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Allison E Aiello
- Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina; Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| |
|
230
|
Singh J, Singh AK. NSLPCD: Topic based tweets clustering using Node significance based label propagation community detection algorithm. ANNALS OF MATHEMATICS AND ARTIFICIAL INTELLIGENCE 2020; 89:371-407. [PMID: 32989349 PMCID: PMC7511268 DOI: 10.1007/s10472-020-09709-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Social networks such as Twitter and Facebook have recently become the most widely used communication platforms for people to propagate information rapidly. Fast diffusion of information creates accuracy and scalability issues for topic detection. Most of the existing approaches can detect the most popular topics on a large scale. However, these approaches are not effective for faster detection. This article proposes a novel topic detection approach, the Node Significance based Label Propagation Community Detection (NSLPCD) algorithm, which detects topics faster without compromising accuracy. The proposed algorithm analyzes the frequency distribution of keywords in the collection of tweets and finds two types of keywords, topic-identifying and topic-describing keywords, which play an important role in topic detection. Based on these defined keywords, the keyword co-occurrence graph is built, and subsequently, the NSLPCD algorithm is applied to get topic clusters in the form of communities. The experimental results using real Twitter data show that the proposed method is effective in quality as well as run-time performance compared to other existing methods.
Affiliation(s)
- Jagrati Singh
- CSED, Motilal Nehru National Institute of Technology Prayagraj, Prayagraj, India
| | - Anil Kumar Singh
- CSED, Motilal Nehru National Institute of Technology Prayagraj, Prayagraj, India
| |
|
231
|
Chu Y, Shan X, Chen T, Jiang M, Wang Y, Wang Q, Salahub DR, Xiong Y, Wei DQ. DTI-MLCD: predicting drug-target interactions using multi-label learning with community detection method. Brief Bioinform 2020; 22:5910189. [PMID: 32964234 DOI: 10.1093/bib/bbaa205] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/23/2020] [Revised: 08/06/2020] [Accepted: 08/10/2020] [Indexed: 12/20/2022] Open
Abstract
Identifying drug-target interactions (DTIs) is an important step for drug discovery and drug repositioning. To reduce the experimental cost, a large number of computational approaches have been proposed for this task. Machine learning-based models, especially binary classification models, have been developed to predict whether a drug-target pair interacts or not. However, there is still much room for improvement in the performance of current methods. Multi-label learning can overcome some difficulties caused by single-label learning in order to improve the predictive performance. The key challenge faced by multi-label learning is the exponential-sized output space, and considering label correlations can help to overcome this challenge. In this paper, we facilitate multi-label classification by introducing community detection methods for DTI prediction, in a method named DTI-MLCD. Moreover, we updated the gold standard data set by adding 15,000 more positive DTI samples in comparison to the data set that has been widely used by most previously published DTI prediction methods since 2008. The proposed DTI-MLCD is applied to both data sets, demonstrating its superiority over other machine learning methods and several existing methods. The data sets and source code of this study are freely available at https://github.com/a96123155/DTI-MLCD.
Affiliation(s)
- Yanyi Chu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Xiaoqi Shan
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Tianhang Chen
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Mingming Jiang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Yanjing Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Qiankun Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | | | - Yi Xiong
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| | - Dong-Qing Wei
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
| |
|
232
|
Singh D, Garg R. Comparative analysis of sequential community detection algorithms based on internal and external quality measure. JOURNAL OF STATISTICS & MANAGEMENT SYSTEMS 2020. [DOI: 10.1080/09720510.2020.1800189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
Affiliation(s)
- Dipika Singh
- Department of Computer Science, Institute of Science, Banaras Hindu University, Varanasi 221005, Uttar Pradesh, India
| | - Rakhi Garg
- Department of Computer Science, Mahila Mahavidyalaya, Banaras Hindu University, Varanasi 221005, Uttar Pradesh, India
| |
|
233
|
Sheridan MA, Shi F, Miller AB, Sahali C, McLaughlin KA. Network structure reveals clusters of associations between childhood adversities and development outcomes. Dev Sci 2020; 23:e12934. [PMID: 31869484 PMCID: PMC7308216 DOI: 10.1111/desc.12934] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2019] [Revised: 07/03/2019] [Accepted: 07/05/2019] [Indexed: 01/18/2023]
Abstract
Exposure to childhood adversity is common and associated with a host of negative developmental outcomes. The most common approach used to examine the consequences of adversity exposure is a cumulative risk model. Recently, we have proposed a novel approach, the dimensional model of adversity and psychopathology (DMAP), where different dimensions of adversity are hypothesized to impact health and well-being through different pathways. We expect deprivation to primarily disrupt cognitive processing, whereas we expect threat to primarily alter emotional reactivity and automatic regulation. Recent hypothesis-driven approaches provide support for these differential associations of deprivation and threat on developmental outcomes. However, it is not clear whether these patterns would emerge using data-driven approaches. Here we use a network analytic approach to identify clusters of related adversity exposures and outcomes in an initial study (Study 1: N = 277 adolescents aged 16-17 years; 55.1% female) and a replication (Study 2: N = 262 children aged 8-16 years; 45.4% female). We statistically compare our observed clusters with our hypothesized DMAP model and a clustering we hypothesize would be the result of a cumulative stress model. In both samples we observed a network structure consistent with the DMAP model and statistically different from the hypothesized cumulative stress model. Future work seeking to identify the pathways through which adversity impacts development should consider multiple dimensions of adversity.
Affiliation(s)
| | - Feng Shi
- University of North Carolina, Chapel Hill
| | | | | | | |
|
234
|
|
235
|
Shi H, Yan KK, Ding L, Qian C, Chi H, Yu J. Network Approaches for Dissecting the Immune System. iScience 2020; 23:101354. [PMID: 32717640 PMCID: PMC7390880 DOI: 10.1016/j.isci.2020.101354] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2020] [Revised: 06/21/2020] [Accepted: 07/08/2020] [Indexed: 02/06/2023] Open
Abstract
The immune system is a complex biological network composed of hierarchically organized genes, proteins, and cellular components that combat external pathogens and monitor the onset of internal disease. To meet and ultimately defeat these challenges, the immune system orchestrates an exquisitely complex interplay of numerous cells, often with highly specialized functions, in a tissue-specific manner. One of the major methodologies of systems immunology is to measure quantitatively the components and interaction levels in the immunologic networks to construct a computational network and predict the response of the components to perturbations. The recent advances in high-throughput sequencing techniques have provided us with a powerful approach to dissecting the complexity of the immune system. Here we summarize the latest progress in integrating omics data and network approaches to construct networks and to infer the underlying signaling and transcriptional landscape, as well as cell-cell communication, in the immune system, with a focus on hematopoiesis, adaptive immunity, and tumor immunology. Understanding the network regulation of immune cells has provided new insights into immune homeostasis and disease, with important therapeutic implications for inflammation, cancer, and other immune-mediated disorders.
Affiliation(s)
- Hao Shi
- Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Immunology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Koon-Kiu Yan
- Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Liang Ding
- Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Chenxi Qian
- Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA; Department of Immunology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Hongbo Chi
- Department of Immunology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Jiyang Yu
- Departments of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA.
| |
|
236
|
Spatiotemporal data analysis with chronological networks. Nat Commun 2020; 11:4036. [PMID: 32788573 PMCID: PMC7424518 DOI: 10.1038/s41467-020-17634-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/30/2019] [Accepted: 07/02/2020] [Indexed: 11/08/2022] Open
Abstract
The number of spatiotemporal data sets has increased rapidly in recent years, which demands robust and fast methods to extract information from this kind of data. Here, we propose a network-based model, called Chronnet, for spatiotemporal data analysis. The network construction process consists of dividing a geometric space into grid cells represented by nodes connected chronologically. Strong links in the network represent consecutive recurrent events between cells. The chronnet construction process is fast, making the model suitable to process large data sets. Using artificial and real data sets, we show how chronnets can capture data properties beyond simple statistics, like frequent patterns, spatial changes, outliers, and spatiotemporal clusters. Therefore, we conclude that chronnets represent a robust tool for the analysis of spatiotemporal data sets. Extracting central information from ever-growing data generated in our lives calls for new data mining methods. Ferreira et al. show a simple model, called chronnets, that can capture frequent patterns, spatial changes, outliers, and spatiotemporal clusters.
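The construction described in this abstract, grid cells as nodes and links between cells that host consecutive events, can be sketched in a few lines. The details are assumptions made for illustration: events are `(time, x, y)` tuples, the grid is square with a hypothetical `cell_size`, links are directed, and self-loops (two consecutive events in the same cell) are kept. The published model may treat these choices differently.

```python
from collections import Counter


def build_chronnet(events, cell_size):
    """Return a Counter mapping (from_cell, to_cell) to a link weight.

    events: iterable of (time, x, y) tuples.
    Each link weight counts how often an event in from_cell was
    immediately followed in time by an event in to_cell.
    """
    ordered = sorted(events, key=lambda event: event[0])
    cells = [(int(x // cell_size), int(y // cell_size)) for _, x, y in ordered]
    links = Counter()
    for previous, current in zip(cells, cells[1:]):
        links[(previous, current)] += 1
    return links


# Hypothetical events: activity alternates between two neighbouring cells,
# followed by a single far-away event.
events = [
    (0, 0.2, 0.1),
    (1, 1.5, 0.3),
    (2, 0.4, 0.9),
    (3, 1.1, 0.2),
    (4, 5.0, 5.0),
]
chronnet = build_chronnet(events, cell_size=1.0)
for (source, target), weight in chronnet.most_common():
    print(source, "->", target, weight)
```

The heaviest link, from cell `(0, 0)` to cell `(1, 0)` with weight 2, reflects the recurring pattern, while the single weight-1 link into cell `(5, 5)` is the kind of weak connection that could flag an outlier. Construction is one sort plus one pass over the events.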
|
237
|
Development of a structure analytic hierarchy approach for the evaluation of the physical protection system effectiveness. NUCLEAR ENGINEERING AND TECHNOLOGY 2020. [DOI: 10.1016/j.net.2020.01.033] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
238
|
Ye F, Chen C, Wen Z, Zheng Z, Chen W, Zhou Y. Homophily Preserving Community Detection. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:2903-2915. [PMID: 31502990 DOI: 10.1109/tnnls.2019.2933850] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
As a fundamental problem in social network analysis, community detection has recently attracted wide attention, accompanied by the output of numerous community detection methods. However, most existing methods are developed by only exploiting link topology, without taking node homophily (i.e., node similarity) into consideration. Thus, much useful information that can be utilized to improve the quality of detected communities is ignored. To overcome this limitation, we propose a new community detection approach based on nonnegative matrix factorization (NMF), namely, homophily preserving NMF (HPNMF), which models not only link topology but also node homophily of networks. As such, HPNMF is able to better reflect the inherent properties of community structure. In order to capture node homophily from scratch, we provide three similarity measurements that naturally reveal the association relationships between nodes. We further present an efficient learning algorithm with convergence guarantee to solve the proposed model. Finally, extensive experiments are conducted, and the results demonstrate that HPNMF has strong ability to outperform the state-of-the-art baseline methods.
|
239
|
Hao Y, Zhang F. An unsupervised detection method for shilling attacks based on deep learning and community detection. Soft comput 2020. [DOI: 10.1007/s00500-020-05162-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
|
240
|
|
241
|
Jin D, Li B, Jiao P, He D, Shan H, Zhang W. Modeling with Node Popularities for Autonomous Overlapping Community Detection. ACM T INTEL SYST TEC 2020. [DOI: 10.1145/3373760] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Abstract
Overlapping community detection has triggered recent research in network analysis. One of the promising techniques for finding overlapping communities is the popular stochastic models, which, unfortunately, have some common drawbacks. They do not support an important observation that highly connected nodes are more likely to reside in the overlapping regions of communities in the network. These methods are in essence not truly unsupervised, since they require a threshold on probabilistic memberships to derive overlapping structures and need the number of communities to be specified a priori. We develop a new method to address these issues for overlapping community detection. We first present a stochastic model to accommodate the relative importance and the expected degree of every node in each community. We then infer every overlapping community by ranking the nodes according to their importance. Second, we determine the number of communities under the Bayesian framework. We evaluate our method and compare it with five state-of-the-art methods. The results demonstrate the superior performance of our method. We also apply this new method to two applications, showing its superb performance on practical problems.
Affiliation(s)
- Di Jin
  - College of Intelligence and Computing, Tianjin University, China
- Bingyi Li
  - College of Intelligence and Computing, Tianjin University, China
- Pengfei Jiao
  - College of Intelligence and Computing, Center of Biosafety Research and Strategy, Tianjin University, China
- Dongxiao He
  - College of Intelligence and Computing, Tianjin University, China
- Hongyu Shan
  - College of Intelligence and Computing, Tianjin University, China
- Weixiong Zhang
  - Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, Missouri
|
242
|
Ding X, Zhang J, Yang J. Node-community membership diversifies community structures: An overlapping community detection algorithm based on local expansion and boundary re-checking. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.105935] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
|
243
|
Taheri S, Bouyer A. Community Detection in Social Networks Using Affinity Propagation with Adaptive Similarity Matrix. BIG DATA 2020; 8:189-202. [PMID: 32397731 DOI: 10.1089/big.2019.0143] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
The community detection problem is a form of data clustering in which only the network's topological properties are considered for measuring similarities among nodes. Also, finding communities' kernel nodes and expanding a community from its kernel will certainly help us to find optimal communities. Among the existing community detection approaches, the affinity propagation (AP)-based method has been showing promising results and does not require any predefined information such as the number of clusters (communities). AP is an exemplar-based clustering method that defines the negative real-valued similarity measure sim(i, k) between data point i and exemplar k as the probability of k being the exemplar of data point i. According to our intuition, the value of sim(i, k) should not be identical to sim(k, i). In this study, a new version of AP using an adaptive similarity matrix, namely affinity propagation with adaptive similarity (APAS) matrix, is proposed, which could efficiently show the leadership probabilities between data points. APAS can adaptively transform the symmetric similarity matrix into an asymmetric one. It outperforms the AP method in terms of accuracy. Extensive experiments conducted on artificial and real-world networks demonstrate the effectiveness of our approach.
Affiliation(s)
- Sona Taheri
  - Department of Computer Engineering, Azarbaijan Shahid Madani University, Tabriz, Iran
- Asgarali Bouyer
  - Department of Computer Engineering, Azarbaijan Shahid Madani University, Tabriz, Iran
|
244
|
Zhu K, Walker D, Muchnik L. Content Growth and Attention Contagion in Information Networks: Addressing Information Poverty on Wikipedia. INFORMATION SYSTEMS RESEARCH 2020. [DOI: 10.1287/isre.2019.0899] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
Abstract
Open collaboration platforms have fundamentally changed the way that knowledge is produced, disseminated, and consumed. Although the community governance and open collaboration model of Wikipedia confers many benefits, its decentralized nature can leave questions of information poverty and skewness to the mercy of the system's natural dynamics. In this paper, we leverage a large-scale natural experiment to gain a causal understanding of how exogenous content contributions to Wikipedia articles affect the attention that they attract and how that attention spills over to other articles in the information network. We find a positive feedback loop: content contribution leads to significant and long-lasting increases of attention and future contribution. Unfortunately, this also suggests that impoverished regions of information networks are likely to remain so in the absence of intervention. However, our analysis reveals a potential solution. Articles in impoverished regions of information networks are particularly positioned to benefit from the phenomenon of attention spillovers. Using a simulation that is calibrated with real-world link traffic of the Wikipedia network, we show that an attention contagion policy, which focuses editorial effort coherently on impoverished regions, can lead to as much as a twofold gain in attention relative to unguided contributions.
Affiliation(s)
- Kai Zhu
  - Desautels Faculty of Management, McGill University, Montreal, Quebec H3A 1G5, Canada
- Dylan Walker
  - Questrom School of Business, Boston University, Boston, Massachusetts 02215
- Lev Muchnik
  - Jerusalem School of Business Administration, The Hebrew University of Jerusalem, 91905 Jerusalem, Israel
|
245
|
A Two-Tier Partition Algorithm for the Optimization of the Large-Scale Simulation of Information Diffusion in Social Networks. Symmetry (Basel) 2020. [DOI: 10.3390/sym12050843] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open
Abstract
As online social networks play a more and more important role in public opinion, the large-scale simulation of social networks has been focused on by many scientists from sociology, communication, informatics, and so on. It is a good way to study real information diffusion in a symmetrical simulation world by agent-based modeling and simulation (ABMS), which is considered an effective solution by scholars from computational sociology. However, on the one hand, classical ABMS tools such as NetLogo cannot support the simulation of more than thousands of agents. On the other hand, big data platforms such as Hadoop and Spark used to study big datasets do not provide optimization for the simulation of large-scale social networks. A two-tier partition algorithm for the optimization of large-scale simulation of social networks is proposed in this paper. First, the simulation kernel of ABMS for information diffusion is implemented based on the Spark platform. Both the data structure and the scheduling mechanism are implemented with Resilient Distributed Datasets (RDDs) to simulate millions of agents. Second, a two-tier partition algorithm is implemented by community detection and graph cut. Community detection is used to find the partition of high interactions in the social network. A graph cut is used to achieve the goal of load balance. Finally, with the support of the dataset recorded from Twitter, a series of experiments are used to test the performance of the two-tier partition algorithm in both the communication cost and load balance.
|
246
|
Li W, Kang Q, Kong H, Liu C, Kang Y. A novel iterated greedy algorithm for detecting communities in complex network. SOCIAL NETWORK ANALYSIS AND MINING 2020. [DOI: 10.1007/s13278-020-00641-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
|
247
|
Luo W, Lu N, Ni L, Zhu W, Ding W. Local community detection by the nearest nodes with greater centrality. Inf Sci (N Y) 2020. [DOI: 10.1016/j.ins.2020.01.001] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]
|
248
|
Jiang H, Liu Z, Liu C, Su Y, Zhang X. Community detection in complex networks with an ambiguous structure using central node based link prediction. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.105626] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]
|
249
|
Zarei B, Meybodi MR. Detecting community structure in complex networks using genetic algorithm based on object migrating automata. Comput Intell 2020. [DOI: 10.1111/coin.12273] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Affiliation(s)
- Bagher Zarei
  - Faculty of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran
- Mohammad Reza Meybodi
  - Department of Computer Engineering and Information Technology, Amirkabir University of Technology, Tehran, Iran
|
250
|
Valejo A, Faleiros T, Oliveira MCFD, Lopes ADA. A coarsening method for bipartite networks via weight-constrained label propagation. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.105678] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|