1
Seiler JPH, Elpelt J, Ghobadi A, Kaschube M, Rumpel S. Perceptual and semantic maps in individual humans share structural features that predict creative abilities. Communications Psychology 2025; 3:30. [PMID: 39994417 PMCID: PMC11850602 DOI: 10.1038/s44271-025-00214-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2024] [Accepted: 02/11/2025] [Indexed: 02/26/2025]
Abstract
Building perceptual and associative links between internal representations is a fundamental neural process, allowing individuals to structure their knowledge about the world and combine it to enable efficient and creative behavior. In this context, the representational similarity between pairs of represented entities is thought to reflect their associative linkage at different levels of sensory processing, ranging from lower-order perceptual levels up to higher-order semantic levels. While recently specific structural features of semantic representational maps were linked with creative abilities of individual humans, it remains unclear if these features are also shared on lower level, perceptual maps. Here, we address this question by presenting 148 human participants with psychophysical scaling tasks, using two sets of independent and qualitatively distinct stimuli, to probe representational map structures in the lower-order auditory and the higher-order semantic domain. We quantify individual representational features with graph-theoretical measures and demonstrate a robust correlation of representational structures in the perceptual auditory and semantic modality. We delineate these shared representational features to predict multiple verbal standard measures of creativity, observing that both, semantic and auditory features, reflect creative abilities. Our findings indicate that the general, modality-overarching representational geometry of an individual is a relevant underpinning of creative thought.
Affiliation(s)
- Johannes P-H Seiler
  - Institute of Physiology, Focus Program Translational Neurosciences, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Jonas Elpelt
  - Frankfurt Institute for Advanced Studies, Frankfurt am Main, Germany
  - Institute of Computer Science, Goethe University Frankfurt, Frankfurt am Main, Germany
- Aida Ghobadi
  - Institute of Physiology, Focus Program Translational Neurosciences, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Matthias Kaschube
  - Frankfurt Institute for Advanced Studies, Frankfurt am Main, Germany
  - Institute of Computer Science, Goethe University Frankfurt, Frankfurt am Main, Germany
- Simon Rumpel
  - Institute of Physiology, Focus Program Translational Neurosciences, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany

2
Arthur R. Detectability constraints on meso-scale structure in complex networks. PLoS One 2025; 20:e0317670. [PMID: 39841660 PMCID: PMC11753644 DOI: 10.1371/journal.pone.0317670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2024] [Accepted: 01/02/2025] [Indexed: 01/24/2025] [Open Access]
Abstract
Community, core-periphery, disassortative and other node partitions allow us to understand the organisation and function of large networks. In this work we study common meso-scale structures using the idea of block modularity. We find that the configuration model imposes strong restrictions on core-periphery and related structures in directed and undirected networks. We derive inequalities expressing when such structures can be detected under the configuration model which are closely related to the resolution limit. Nestedness is closely related to core-periphery and is similarly restricted to only be detectable under certain conditions. We then derive a general equivalence between optimising block modularity and maximum likelihood estimation of the parameters of the degree corrected Stochastic Block Model. This allows us to contrast the two approaches, how they formalise the structure detection problem and understand these constraints in inferential versus descriptive approaches to meso-scale structure detection.
Affiliation(s)
- Rudy Arthur
  - Department of Computer Science, University of Exeter, Exeter, United Kingdom

3
Iskov NB, Olsen AS, Madsen KH, Mørup M. Discovering prominent differences in structural and functional connectomes using a multinomial stochastic block model. Netw Neurosci 2024; 8:1243-1264. [PMID: 39735501 PMCID: PMC11674489 DOI: 10.1162/netn_a_00399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Accepted: 06/13/2024] [Indexed: 12/31/2024] [Open Access]
Abstract
Understanding the differences between functional and structural human brain connectivity has been a focus of an extensive amount of neuroscience research. We employ a novel approach using the multinomial stochastic block model (MSBM) to explicitly extract components that characterize prominent differences across graphs. We analyze structural and functional connectomes derived from high-resolution diffusion-weighted MRI and fMRI scans of 250 Human Connectome Project subjects, analyzed at group connectivity level across 50 subjects. The inferred brain partitions revealed consistent, spatially homogeneous clustering patterns across inferred resolutions demonstrating the MSBM's reliability in identifying brain areas with prominent structure-function differences. Prominent differences in low-resolution brain maps (K = {3, 4} clusters) were attributed to weak functional connectivity in the bilateral anterior temporal lobes, while higher resolution results (K ≥ 25) revealed stronger interhemispheric functional than structural connectivity. Our findings emphasize significant differences in high-resolution functional and structural connectomes, revealing challenges in extracting meaningful connectivity measurements from both modalities, including tracking fibers through the corpus callosum and attenuated functional connectivity in anterior temporal lobe fMRI data, which we attribute to increased noise levels. The MSBM emerges as a valuable tool for understanding differences across graphs, with potential future applications and avenues beyond the current focus on characterizing modality-specific distinctions in connectomics data.
Affiliation(s)
- Nina Braad Iskov
  - Department of Applied Mathematics and Computer Science, Technical University of Denmark, Lyngby, Denmark
- Anders Stevnhoved Olsen
  - Department of Applied Mathematics and Computer Science, Technical University of Denmark, Lyngby, Denmark
- Kristoffer Hougaard Madsen
  - Department of Applied Mathematics and Computer Science, Technical University of Denmark, Lyngby, Denmark
  - Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital - Amager and Hvidovre, Copenhagen, Denmark
- Morten Mørup
  - Department of Applied Mathematics and Computer Science, Technical University of Denmark, Lyngby, Denmark

4
Aref S, Mostajabdaveh M, Chheda H. Bayan algorithm: Detecting communities in networks through exact and approximate optimization of modularity. Phys Rev E 2024; 110:044315. [PMID: 39562863 DOI: 10.1103/physreve.110.044315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2024] [Accepted: 09/24/2024] [Indexed: 11/21/2024]
Abstract
Community detection is a classic network problem with extensive applications in various fields. Its most common method is using modularity maximization heuristics which rarely return an optimal partition or anything similar. Partitions with globally optimal modularity are difficult to compute, and therefore have been underexplored. Using structurally diverse networks, we compare 30 community detection methods including our proposed algorithm that offers optimality and approximation guarantees: the Bayan algorithm. Unlike existing methods, Bayan globally maximizes modularity or approximates it within a factor. Our results show the distinctive accuracy and stability of maximum-modularity partitions in retrieving planted partitions at rates higher than most alternatives for a wide range of parameter settings in two standard benchmarks. Compared to the partitions from 29 other algorithms, maximum-modularity partitions have the best medians for description length, coverage, performance, average conductance, and well clusteredness. These advantages come at the cost of additional computations which Bayan makes possible for small networks (networks that have up to 3000 edges in their largest connected component). Bayan is several times faster than using open-source and commercial solvers for modularity maximization, making it capable of finding optimal partitions for instances that cannot be optimized by any other existing method. Our results point to a few well-performing algorithms, among which Bayan stands out as the most reliable method for small networks. A python implementation of the Bayan algorithm (bayanpy) is publicly available through the package installer for python.
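For readers unfamiliar with the objective discussed in this abstract, the following is a minimal sketch of Newman's modularity score for an undirected, unweighted graph. It only evaluates a given partition; it is not the Bayan algorithm and does not use the bayanpy package. The function name `modularity` and the toy graph are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def modularity(edges, communities):
    """Newman modularity Q of a partition (hypothetical helper for illustration).

    edges: list of (u, v) pairs of an undirected, unweighted graph.
    communities: list of disjoint node sets that together cover all nodes.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    # Map each node to the index of its community.
    label = {node: idx for idx, group in enumerate(communities) for node in group}
    degree_sum = defaultdict(int)  # total degree of the nodes in each community
    internal = defaultdict(int)    # number of edges inside each community
    for u, v in edges:
        degree_sum[label[u]] += 1
        degree_sum[label[v]] += 1
        if label[u] == label[v]:
            internal[label[u]] += 1
    # Q = sum over communities of (internal edge fraction - expected fraction).
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy graph (assumed for illustration): two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q_split = modularity(edges, [{0, 1, 2}, {3, 4, 5}])
q_merged = modularity(edges, [{0, 1, 2, 3, 4, 5}])
```

On this toy graph the two-triangle split scores higher than the single-community partition, which is the kind of comparison a modularity maximizer performs over all partitions.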
5
Menesse G, Houben AM, Soriano J, Torres JJ. Integrated information decomposition unveils major structural traits of in silico and in vitro neuronal networks. Chaos (Woodbury, N.Y.) 2024; 34:053139. [PMID: 38809907 DOI: 10.1063/5.0201454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Accepted: 05/06/2024] [Indexed: 05/31/2024]
Abstract
The properties of complex networked systems arise from the interplay between the dynamics of their elements and the underlying topology. Thus, to understand their behavior, it is crucial to convene as much information as possible about their topological organization. However, in large systems, such as neuronal networks, the reconstruction of such topology is usually carried out from the information encoded in the dynamics on the network, such as spike train time series, and by measuring the transfer entropy between system elements. The topological information recovered by these methods does not necessarily capture the connectivity layout, but rather the causal flow of information between elements. New theoretical frameworks, such as Integrated Information Decomposition (Φ-ID), allow one to explore the modes in which information can flow between parts of a system, opening a rich landscape of interactions between network topology, dynamics, and information. Here, we apply Φ-ID on in silico and in vitro data to decompose the usual transfer entropy measure into different modes of information transfer, namely, synergistic, redundant, or unique. We demonstrate that the unique information transfer is the most relevant measure to uncover structural topological details from network activity data, while redundant information only introduces residual information for this application. Although the retrieved network connectivity is still functional, it captures more details of the underlying structural topology by avoiding to take into account emergent high-order interactions and information redundancy between elements, which are important for the functional behavior, but mask the detection of direct simple interactions between elements constituted by the structural network topology.
Affiliation(s)
- Gustavo Menesse
  - Department of Electromagnetism and Physics of the Matter & Institute Carlos I for Theoretical and Computational Physics, University of Granada, 18071 Granada, Spain
  - Departamento de Física, Facultad de Ciencias Exactas y Naturales, Universidad Nacional de Asunción, 111451 San Lorenzo, Paraguay
- Akke Mats Houben
  - Departament de Física de la Matèria Condensada, Universitat de Barcelona and Universitat de Barcelona Institute of Complex Systems (UBICS), E-08028 Barcelona, Spain
- Jordi Soriano
  - Departament de Física de la Matèria Condensada, Universitat de Barcelona and Universitat de Barcelona Institute of Complex Systems (UBICS), E-08028 Barcelona, Spain
- Joaquín J Torres
  - Department of Electromagnetism and Physics of the Matter & Institute Carlos I for Theoretical and Computational Physics, University of Granada, 18071 Granada, Spain

6
Brooks SJ, Jones VO, Wang H, Deng C, Golding SGH, Lim J, Gao J, Daoutidis P, Stamoulis C. Community detection in the human connectome: Method types, differences and their impact on inference. Hum Brain Mapp 2024; 45:e26669. [PMID: 38553865 PMCID: PMC10980844 DOI: 10.1002/hbm.26669] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/25/2023] [Revised: 03/06/2024] [Accepted: 03/12/2024] [Indexed: 04/02/2024] [Open Access]
Abstract
Community structure is a fundamental topological characteristic of optimally organized brain networks. Currently, there is no clear standard or systematic approach for selecting the most appropriate community detection method. Furthermore, the impact of method choice on the accuracy and robustness of estimated communities (and network modularity), as well as method-dependent relationships between network communities and cognitive and other individual measures, are not well understood. This study analyzed large datasets of real brain networks (estimated from resting-state fMRI from n = 5251 pre/early adolescents in the adolescent brain cognitive development [ABCD] study), and n = 5338 synthetic networks with heterogeneous, data-inspired topologies, with the goal to investigate and compare three classes of community detection methods: (i) modularity maximization-based (Newman and Louvain), (ii) probabilistic (Bayesian inference within the framework of stochastic block modeling (SBM)), and (iii) geometric (based on graph Ricci flow). Extensive comparisons between methods and their individual accuracy (relative to the ground truth in synthetic networks), and reliability (when applied to multiple fMRI runs from the same brains) suggest that the underlying brain network topology plays a critical role in the accuracy, reliability and agreement of community detection methods. Consistent method (dis)similarities, and their correlations with topological properties, were estimated across fMRI runs. Based on synthetic graphs, most methods performed similarly and had comparable high accuracy only in some topological regimes, specifically those corresponding to developed connectomes with at least quasi-optimal community organization. In contrast, in densely and/or weakly connected networks with difficult to detect communities, the methods yielded highly dissimilar results, with Bayesian inference within SBM having significantly higher accuracy compared to all others. Associations between method-specific modularity and demographic, anthropometric, physiological and cognitive parameters showed mostly method invariance but some method dependence as well. Although method sensitivity to different levels of community structure may in part explain method-dependent associations between modularity estimates and parameters of interest, method dependence also highlights potential issues of reliability and reproducibility. These findings suggest that a probabilistic approach, such as Bayesian inference in the framework of SBM, may provide consistently reliable estimates of community structure across network topologies. In addition, to maximize robustness of biological inferences, identified network communities and their cognitive, behavioral and other correlates should be confirmed with multiple reliable detection methods.
Affiliation(s)
- Skylar J. Brooks
  - Boston Children's Hospital, Department of Pediatrics, Boston, Massachusetts, USA
  - University of California Berkeley, Helen Wills Neuroscience Institute, Berkeley, California, USA
- Victoria O. Jones
  - University of Minnesota, Department of Chemical Engineering and Material Science, Minneapolis, Minnesota, USA
- Haotian Wang
  - Rutgers University, Department of Computer Science, Piscataway, New Jersey, USA
- Chengyuan Deng
  - Rutgers University, Department of Computer Science, Piscataway, New Jersey, USA
- Jethro Lim
  - Boston Children's Hospital, Department of Pediatrics, Boston, Massachusetts, USA
- Jie Gao
  - Rutgers University, Department of Computer Science, Piscataway, New Jersey, USA
- Prodromos Daoutidis
  - University of Minnesota, Department of Chemical Engineering and Material Science, Minneapolis, Minnesota, USA
- Catherine Stamoulis
  - Boston Children's Hospital, Department of Pediatrics, Boston, Massachusetts, USA
  - Harvard Medical School, Department of Pediatrics, Boston, Massachusetts, USA

7
Tavis S, Hettich RL. Multi-Omics integration can be used to rescue metabolic information for some of the dark region of the Pseudomonas putida proteome. BMC Genomics 2024; 25:267. [PMID: 38468234 PMCID: PMC10926591 DOI: 10.1186/s12864-024-10082-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Accepted: 02/02/2024] [Indexed: 03/13/2024] [Open Access]
Abstract
In every omics experiment, genes or their products are identified for which even state of the art tools are unable to assign a function. In the biotechnology chassis organism Pseudomonas putida, these proteins of unknown function make up 14% of the proteome. This missing information can bias analyses since these proteins can carry out functions which impact the engineering of organisms. As a consequence of predicting protein function across all organisms, function prediction tools generally fail to use all of the types of data available for any specific organism, including protein and transcript expression information. Additionally, the release of Alphafold predictions for all Uniprot proteins provides a novel opportunity for leveraging structural information. We constructed a bespoke machine learning model to predict the function of recalcitrant proteins of unknown function in Pseudomonas putida based on these sources of data, which annotated 1079 terms to 213 proteins. Among the predicted functions supplied by the model, we found evidence for a significant overrepresentation of nitrogen metabolism and macromolecule processing proteins. These findings were corroborated by manual analyses of selected proteins which identified, among others, a functionally unannotated operon that likely encodes a branch of the shikimate pathway.
Affiliation(s)
- Steven Tavis
  - Genome Science and Technology Graduate Program, University of Tennessee Knoxville, Knoxville, USA
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Robert L Hettich
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA

8
Gaiteri C, Connell DR, Sultan FA, Iatrou A, Ng B, Szymanski BK, Zhang A, Tasaki S. Robust, scalable, and informative clustering for diverse biological networks. Genome Biol 2023; 24:228. [PMID: 37828545 PMCID: PMC10571258 DOI: 10.1186/s13059-023-03062-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Accepted: 09/19/2023] [Indexed: 10/14/2023] [Open Access]
Abstract
Clustering molecular data into informative groups is a primary step in extracting robust conclusions from big data. However, due to foundational issues in how they are defined and detected, such clusters are not always reliable, leading to unstable conclusions. We compare popular clustering algorithms across thousands of synthetic and real biological datasets, including a new consensus clustering algorithm, SpeakEasy2: Champagne. These tests identify trends in performance, show no single method is universally optimal, and allow us to examine factors behind variation in performance. Multiple metrics indicate SpeakEasy2 generally provides robust, scalable, and informative clusters for a range of applications.
Affiliation(s)
- Chris Gaiteri
  - Department of Psychiatry and Behavioral Sciences, SUNY Upstate Medical University, Syracuse, NY, USA
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
  - Department of Neurological Sciences, Rush University Medical Center, Chicago, IL, USA
- David R Connell
  - Rush University Graduate College, Rush University Medical Center, Chicago, IL, USA
- Faraz A Sultan
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
- Artemis Iatrou
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
  - Department of Psychiatry, McLean Hospital, Harvard Medical School, Harvard University, Belmont, MA, USA
- Bernard Ng
  - Department of Psychiatry and Behavioral Sciences, SUNY Upstate Medical University, Syracuse, NY, USA
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
- Boleslaw K Szymanski
  - Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA
  - Network Science and Technology Center, Rensselaer Polytechnic Institute, Troy, NY, USA
  - Academy of Social Sciences, Łódź, Poland
- Ada Zhang
  - Department of Psychiatry and Behavioral Sciences, SUNY Upstate Medical University, Syracuse, NY, USA
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
- Shinya Tasaki
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
  - Department of Neurological Sciences, Rush University Medical Center, Chicago, IL, USA

9
Peixoto TP, Kirkley A. Implicit models, latent compression, intrinsic biases, and cheap lunches in community detection. Phys Rev E 2023; 108:024309. [PMID: 37723811 DOI: 10.1103/physreve.108.024309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Accepted: 08/02/2023] [Indexed: 09/20/2023]
Abstract
The task of community detection, which aims to partition a network into clusters of nodes to summarize its large-scale structure, has spawned the development of many competing algorithms with varying objectives. Some community detection methods are inferential, explicitly deriving the clustering objective through a probabilistic generative model, while other methods are descriptive, dividing a network according to an objective motivated by a particular application, making it challenging to compare these methods on the same scale. Here we present a solution to this problem that associates any community detection objective, inferential or descriptive, with its corresponding implicit network generative model. This allows us to compute the description length of a network and its partition under arbitrary objectives, providing a principled measure to compare the performance of different algorithms without the need for "ground-truth" labels. Our approach also gives access to instances of the community detection problem that are optimal to any given algorithm and in this way reveals intrinsic biases in popular descriptive methods, explaining their tendency to overfit. Using our framework, we compare a number of community detection methods on artificial networks and on a corpus of over 500 structurally diverse empirical networks. We find that more expressive community detection methods exhibit consistently superior compression performance on structured data instances, without having degraded performance on a minority of situations where more specialized algorithms perform optimally. Our results undermine the implications of the "no free lunch" theorem for community detection, both conceptually and in practice, since it is confined to unstructured data instances, unlike relevant community detection problems which are structured by requirement.
Affiliation(s)
- Tiago P Peixoto
  - Department of Network and Data Science, Central European University, 1100 Vienna, Austria
- Alec Kirkley
  - Institute of Data Science, University of Hong Kong, Hong Kong; Department of Urban Planning and Design, University of Hong Kong, Hong Kong; and Urban Systems Institute, University of Hong Kong, Hong Kong

10
Bernenko D, Lee SH, Stenberg P, Lizana L. Mapping the semi-nested community structure of 3D chromosome contact networks. PLoS Comput Biol 2023; 19:e1011185. [PMID: 37432974 PMCID: PMC10361492 DOI: 10.1371/journal.pcbi.1011185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2022] [Revised: 07/21/2023] [Accepted: 05/15/2023] [Indexed: 07/13/2023] [Open Access]
Abstract
Mammalian DNA folds into 3D structures that facilitate and regulate genetic processes such as transcription, DNA repair, and epigenetics. Several insights derive from chromosome capture methods, such as Hi-C, which allow researchers to construct contact maps depicting 3D interactions among all DNA segment pairs. These maps show a complex cross-scale organization spanning megabase-pair compartments to short-ranged DNA loops. To better understand the organizing principles, several groups analyzed Hi-C data assuming a Russian-doll-like nested hierarchy where DNA regions of similar sizes merge into larger and larger structures. Apart from being a simple and appealing description, this model explains, e.g., the omnipresent chequerboard pattern seen in Hi-C maps, known as A/B compartments, and foreshadows the co-localization of some functionally similar DNA regions. However, while successful, this model is incompatible with the two competing mechanisms that seem to shape a significant part of the chromosomes' 3D organization: loop extrusion and phase separation. This paper aims to map out the chromosome's actual folding hierarchy from empirical data. To this end, we take advantage of Hi-C experiments and treat the measured DNA-DNA interactions as a weighted network. From such a network, we extract 3D communities using the generalized Louvain algorithm. This algorithm has a resolution parameter that allows us to scan seamlessly through the community size spectrum, from A/B compartments to topologically associated domains (TADs). By constructing a hierarchical tree connecting these communities, we find that chromosomes are more complex than a perfect hierarchy. Analyzing how communities nest relative to a simple folding model, we found that chromosomes exhibit a significant portion of nested and non-nested community pairs alongside considerable randomness. In addition, by examining nesting and chromatin types, we discovered that nested parts are often associated with active chromatin. These results highlight that cross-scale relationships will be essential components in models aiming to reach a deep understanding of the causal mechanisms of chromosome folding.
Affiliation(s)
- Dolores Bernenko
  - Department of Physics, Integrated Science Lab, Umeå University, Umeå, Sweden
- Sang Hoon Lee
  - Department of Physics and Research Institute of Natural Science, Gyeongsang National University, Jinju, Korea
  - Future Convergence Technology Research Institute, Gyeongsang National University, Jinju, Korea
- Per Stenberg
  - Department of Ecology and Environmental Science, Umeå University, Umeå, Sweden
- Ludvig Lizana
  - Department of Physics, Integrated Science Lab, Umeå University, Umeå, Sweden

11
Tripathi R, Reza A, Mertel A, Su G, Calabrese JM. A network-based approach to identifying correlations between phylogeny, morphological traits and occurrence of fish species in US river basins. PLoS One 2023; 18:e0287482. [PMID: 37352314 PMCID: PMC10289417 DOI: 10.1371/journal.pone.0287482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2023] [Accepted: 06/06/2023] [Indexed: 06/25/2023] [Open Access]
Abstract
The complex network framework has been successfully used to model interactions between entities in Complex Systems in the Biological Sciences such as Proteomics, Genomics, Neuroscience, and Ecology. Networks of organisms at different spatial scales and in different ecosystems have provided insights into community assembly patterns and emergent properties of ecological systems. In the present work, we investigate two questions pertaining to fish species assembly rules in US river basins, a) if morphologically similar fish species also tend to be phylogenetically closer, and b) to what extent are co-occurring species that are phylogenetically close also morphologically similar? For the first question, we construct a network of Hydrologic Unit Code 8 (HUC8) regions as nodes with interaction strengths (edges) governed by the number of common species. For each of the modules of this network, which are found to be geographically separated, there is differential yet significant evidence that phylogenetic distance predicts morphological distance. For the second question, we construct and analyze nearest neighbor directed networks of species based on their morphological distances and phylogenetic distances. Through module detection on these networks and comparing the module-level mean phylogenetic distance and mean morphological distance with the number of basins of common occurrence of species in modules, we find that both phylogeny and morphology of species have significant roles in governing species co-occurrence, i.e. phylogenetically and morphologically distant species tend to co-exist more. In addition, between the two quantities (morphological distance and phylogenetic distance), we find that morphological distance is a stronger determinant of species co-occurrences.
Affiliation(s)
- Richa Tripathi
  - Center for Advanced Systems Understanding (CASUS), Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Görlitz, Germany
- Amit Reza
  - Nikhef, Amsterdam, The Netherlands
  - Institute for Gravitational and Subatomic Physics (GRASP), Utrecht University, CC Utrecht, The Netherlands
- Adam Mertel
  - Center for Advanced Systems Understanding (CASUS), Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Görlitz, Germany
- Guohuan Su
  - Center for Advanced Systems Understanding (CASUS), Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Görlitz, Germany
- Justin M. Calabrese
  - Center for Advanced Systems Understanding (CASUS), Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Görlitz, Germany
  - Dept. of Ecological Modelling, UFZ – Helmholtz Centre for Environmental Research, Leipzig, Germany
  - Dept. of Biology, University of Maryland, College Park, MD, United States of America

12
Schaub MT, Li J, Peel L. Hierarchical community structure in networks. Phys Rev E 2023; 107:054305. [PMID: 37329032 DOI: 10.1103/physreve.107.054305] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/24/2020] [Accepted: 04/24/2023] [Indexed: 06/18/2023]
Abstract
Modular and hierarchical community structures are pervasive in real-world complex systems. A great deal of effort has gone into trying to detect and study these structures. Important theoretical advances in the detection of modular structure have included identifying fundamental limits of detectability by formally defining community structure using probabilistic generative models. Detecting hierarchical community structure introduces additional challenges alongside those inherited from community detection. Here we present a theoretical study on hierarchical community structure in networks, which has thus far not received the same rigorous attention. We address the following questions. (1) How should we define a hierarchy of communities? (2) How do we determine if there is sufficient evidence of a hierarchical structure in a network? (3) How can we detect hierarchical structure efficiently? We approach these questions by introducing a definition of hierarchy based on the concept of stochastic externally equitable partitions and their relation to probabilistic models, such as the popular stochastic block model. We enumerate the challenges involved in detecting hierarchies and, by studying the spectral properties of hierarchical structure, present an efficient and principled method for detecting them.
Affiliation(s)
- Michael T Schaub
  - Department of Computer Science, RWTH Aachen University, 52074 Aachen, Germany
- Jiaze Li
  - Department of Data Analytics and Digitalisation, School of Business and Economics, Maastricht University, 6211 LM Maastricht, The Netherlands
- Leto Peel
  - Department of Data Analytics and Digitalisation, School of Business and Economics, Maastricht University, 6211 LM Maastricht, The Netherlands
|
13
|
Zhang W, Yin M, Jiang M, Dai Q. Partitioned estimation methodology of biological neuronal networks with topology-based module detection. Comput Biol Med 2023; 154:106552. [PMID: 36738704 DOI: 10.1016/j.compbiomed.2023.106552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2022] [Revised: 12/27/2022] [Accepted: 01/11/2023] [Indexed: 02/02/2023]
Abstract
Parameter estimation of neuronal networks is closely related to information processing mechanisms in neural systems. Estimating synaptic parameters for neuronal networks is a time-consuming task. Due to complex interactions between neurons, the computational efficiency and accuracy of estimation methods are relatively low. Meanwhile, inherent topological properties such as core-periphery and modular structures are not fully considered in estimation. In order to improve the efficiency and accuracy of estimation, this study proposes a two-stage PartitionMLE method which introduces detected neuronal modules as topological constraints in estimation. The proposed PartitionMLE method first decomposes the system into multiple non-overlapping neuronal modules by performing topology-based module detection. Dynamic parameters, including intra-modular and inter-modular parameters, are estimated in two stages, using detected hubs to connect non-overlapping neuronal modules. The contributions of the PartitionMLE method are twofold: reducing estimation errors and improving model interpretability. Experiments on neuronal networks consisting of Hodgkin-Huxley (HH) and leaky integrate-and-fire (LIF) neurons validated the effectiveness of the PartitionMLE method in comparison with the single-stage MLE method.
Affiliation(s)
- Wei Zhang
  - Zhejiang Sci-Tech University, Second Street 928, Hangzhou, 310018, China
- Muqi Yin
  - Institute of Cyber-Systems and Control, Zhejiang University, Zheda Road 38, Hangzhou, 310027, China
- Mingfeng Jiang
  - Zhejiang Sci-Tech University, Second Street 928, Hangzhou, 310018, China
- Qi Dai
  - Zhejiang Sci-Tech University, Second Street 928, Hangzhou, 310018, China
|
14
|
Wierzbiński M, Falcó-Roget J, Crimi A. Community detection in brain connectomes with hybrid quantum computing. Sci Rep 2023; 13:3446. [PMID: 36859591 PMCID: PMC9977923 DOI: 10.1038/s41598-023-30579-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Accepted: 02/27/2023] [Indexed: 03/03/2023] Open
Abstract
Recent advancements in network neuroscience are pointing in the direction of considering the brain as a small-world system with an efficient integration-segregation balance that facilitates different cognitive tasks and functions. In this context, community detection is a pivotal issue in computational neuroscience. In this paper we explored community detection within brain connectomes using the power of quantum annealers, and in particular the Leap's Hybrid Solver in D-Wave. By reframing the modularity optimization problem into a Discrete Quadratic Model, we show that quantum annealers achieved higher modularity indices compared to the Louvain Community Detection Algorithm without the need to overcomplicate the mathematical formulation. We also found that the number of communities detected in brain connectomes slightly differed while still being biologically interpretable. These promising preliminary results, together with recent findings, strengthen the claim that quantum optimization methods might be a suitable alternative against classical approaches when dealing with community assignment in networks.
Affiliation(s)
- Marcin Wierzbiński
  - University of Warsaw, Institute of Mathematics, Warsaw, 02-097, Poland
  - Sano Center for Computational Personalised Medicine, Computer Vision Group, Krakow, 30-054, Poland
- Joan Falcó-Roget
  - Sano Center for Computational Personalised Medicine, Computer Vision Group, Krakow, 30-054, Poland
- Alessandro Crimi
  - Sano Center for Computational Personalised Medicine, Computer Vision Group, Krakow, 30-054, Poland
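The Wierzbiński et al. abstract compares solvers by the modularity index they reach. As a rough illustration of what that score measures, the sketch below computes the standard Newman-Girvan modularity of a fixed partition on a toy graph. The graph, the community labels, and the function name `modularity` are assumptions made for this example; this is not the authors' Discrete Quadratic Model formulation or their code.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of [ L_c / m - (d_c / 2m)^2 ], where m is the
    number of edges, L_c the number of edges inside c, and d_c the total degree
    of the nodes in c.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # summed degree of the nodes in community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Hypothetical toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "left", 1: "left", 2: "left", 3: "right", 4: "right", 5: "right"}
one_group = {node: "all" for node in range(6)}

q_split = modularity(edges, two_groups)   # one community per triangle
q_merged = modularity(edges, one_group)   # everything in one community
print(q_split, q_merged)
```

A solver, classical or quantum, searches over assignments such as `two_groups` for the one that maximizes this score; here the two-triangle split scores higher than the single-community assignment.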
|
15
|
Durand‐Bessart C, Cordeiro NJ, Chapman CA, Abernethy K, Forget P, Fontaine C, Bretagnolle F. Trait matching and sampling effort shape the structure of the frugivory network in Afrotropical forests. THE NEW PHYTOLOGIST 2023; 237:1446-1462. [PMID: 36377098 PMCID: PMC10108259 DOI: 10.1111/nph.18619] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/13/2021] [Accepted: 10/13/2022] [Indexed: 06/16/2023]
Abstract
Frugivory in tropical forests is a major ecological process as most tree species rely on frugivores to disperse their seeds. However, the underlying mechanisms driving frugivore-plant networks remain understudied. Here, we evaluate the data available on the Afrotropical frugivory network to identify structural properties, as well as assess knowledge gaps. We assembled a database of frugivory interactions from the literature with > 10 000 links, between 807 tree and 285 frugivore species. We analysed the network structure using a block model that groups species with similar interaction patterns and estimates interaction probabilities among them. We investigated the species traits related to this grouping structure. This frugivory network was simplified into 14 tree and 14 frugivore blocks. The block structure depended on the sampling effort among species: Large mammals were better-studied, while smaller frugivores were the least studied. Species traits related to frugivory were strong predictors of the species composition of blocks and interactions among them. Fruits from larger trees were consumed by most frugivores, and large frugivores had higher probabilities to consume larger fruits. To conclude, this large-scale frugivory network was mainly structured by species traits involved in frugivory, and as expected by the distribution areas of species, while still being limited by sampling incompleteness.
Affiliation(s)
- Clémentine Durand‐Bessart
  - Biogeosciences, UMR 6282, Université Bourgogne Franche Comte‐CNRS, 21000 Dijon, France
  - Centre d'Ecologie et des Sciences de la Conservation, CESCO, UMR 7204, MNHN‐CNRS‐SU, 75005 Paris, France
- Norbert J. Cordeiro
  - Department of Biology (mc WB 816), Roosevelt University, 430 S. Michigan Avenue, Chicago, IL 60605, USA
  - Science & Education, The Field Museum, 1400 S. Lake Shore Drive, Chicago, IL 60605, USA
- Colin A. Chapman
  - Wilson Center, 1300 Pennsylvania Avenue NW, Washington, DC 20004, USA
  - Department of Anthropology, Center for the Advanced Study of Human Paleobiology, The George Washington University, Washington, DC 20037, USA
  - School of Life Sciences, University of KwaZulu‐Natal, Scottsville 3201, Pietermaritzburg, South Africa
  - Shaanxi Key Laboratory for Animal Conservation, Northwest University, 710069 Xi'an, China
- Katharine Abernethy
  - African Forest Ecology Group, School of Natural Sciences, University of Stirling, Stirling FK9 4LA, UK
  - Institut de Recherches en Ecologie Tropicale, CENAREST, Gros Bouquet, 2144 Libreville, Gabon
- Pierre‐Michel Forget
  - Muséum National d'Histoire Naturelle, UMR 7179 MECADEV CNRS‐MNHN, 1 Avenue du Petit Château, 91800 Brunoy, France
- Colin Fontaine
  - Centre d'Ecologie et des Sciences de la Conservation, CESCO, UMR 7204, MNHN‐CNRS‐SU, 75005 Paris, France
|
16
|
Miao R, Li T. Informative core identification in complex networks. J R Stat Soc Series B Stat Methodol 2023. [DOI: 10.1093/jrsssb/qkac009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/30/2023]
Abstract
In a complex network, the core component with interesting structures is usually hidden within noninformative connections. The noises and bias introduced by the noninformative component can obscure the salient structure and limit many network modeling procedures’ effectiveness. This paper introduces a novel core–periphery model for the noninformative periphery structure of networks without imposing a specific form of the core. We propose spectral algorithms for core identification for general downstream network analysis tasks under the model. The algorithms enjoy strong performance guarantees and are scalable for large networks. We evaluate the methods by extensive simulation studies demonstrating advantages over multiple traditional core–periphery methods. The methods are also used to extract the core structure from a citation network, which results in a more interpretable hierarchical community detection.
Affiliation(s)
- Ruizhong Miao
  - Department of Statistics, University of Virginia, Charlottesville, VA, USA
- Tianxi Li
  - Department of Statistics, University of Virginia, Charlottesville, VA, USA
|
17
|
Logan AP, LaCasse PM, Lunday BJ. Social network analysis of Twitter interactions: a directed multilayer network approach. SOCIAL NETWORK ANALYSIS AND MINING 2023; 13:65. [PMID: 37041934 PMCID: PMC10081299 DOI: 10.1007/s13278-023-01063-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2023] [Revised: 03/04/2023] [Accepted: 03/04/2023] [Indexed: 04/13/2023]
Abstract
Effective employment of social media for any social influence outcome requires a detailed understanding of the target audience. Social media provides a rich repository of self-reported information that provides insight regarding the sentiments and implied priorities of an online population. Using Social Network Analysis, this research models user interactions on Twitter as a weighted, directed network. Topic modeling through Latent Dirichlet Allocation identifies the topics of discussion in Tweets, which this study uses to induce a directed multilayer network wherein users (in one layer) are connected to the conversations and topics (in a second layer) in which they have participated, with inter-layer connections representing user participation in conversations. Analysis of the resulting network identifies both influential users and highly connected groups of individuals, informing an understanding of group dynamics and individual connectivity. The results demonstrate that the generation of a topically-focused social network to represent conversations yields more robust findings regarding influential users, particularly when analysts collect Tweets from a variety of discussions through more general search queries. Within the analysis, PageRank performed best among four measures used to rank individual influence within this problem context. In contrast, the results of applying both the Greedy Modular Algorithm and the Leiden Algorithm to identify communities were mixed; each method yielded valuable insights, but neither technique was uniformly superior. The demonstrated four-step process is readily replicable, and an interested user can automate the process with relatively low effort or expense.
Affiliation(s)
- Austin P. Logan
  - Directorate of Plans, Programs, and Requirements, Air Combat Command, 129 Andrews Street, Langley Air Force Base, VA 23665 USA
- Phillip M. LaCasse
  - Department of Operational Sciences, Air Force Institute of Technology, 2950 Hobson Way, Wright-Patterson Air Force Base, OH 45433 USA
- Brian J. Lunday
  - Department of Operational Sciences, Air Force Institute of Technology, 2950 Hobson Way, Wright-Patterson Air Force Base, OH 45433 USA
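The Logan et al. abstract reports that PageRank ranked individual influence best in a directed user network. The following is a minimal power-iteration sketch of PageRank on a made-up three-user mention graph. The edge list, the damping factor of 0.85, and the function name `pagerank` are illustrative assumptions; the abstract does not say which implementation or parameters the authors used.

```python
def pagerank(edges, damping=0.85, tol=1e-12, max_iter=1000):
    """PageRank by power iteration on a directed graph given as (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without out-links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        change = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical mention graph: a and b both mention c, and c mentions a.
mentions = [("a", "c"), ("b", "c"), ("c", "a")]
scores = pagerank(mentions)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy graph the user who is mentioned by the most others ends up with the highest score, and the user nobody mentions keeps only the baseline share.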
|
18
|
Okamoto H, Qiu X. Detecting hierarchical organization of pervasive communities by modular decomposition of Markov chain. Sci Rep 2022; 12:20211. [PMID: 36418410 PMCID: PMC9684584 DOI: 10.1038/s41598-022-24567-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Accepted: 11/17/2022] [Indexed: 11/25/2022] Open
Abstract
Connecting nodes that contingently co-appear, which is a common process of networking in social and biological systems, normally leads to modular structure characterized by the absence of definite boundaries. This study seeks to find and evaluate methods to detect such modules, which will be called 'pervasive' communities. We propose a mathematical formulation to decompose a random walk spreading over the entire network into localized random walks as a proxy for pervasive communities. We applied this formulation to biological and social as well as synthetic networks to demonstrate that it can properly detect communities as pervasively structured objects. We further addressed a question that is fundamental but has been little discussed so far: What is the hierarchical organization of pervasive communities and how can it be extracted? Here we show that hierarchical organization of pervasive communities is unveiled from finer to coarser layers through discrete phase transitions that intermittently occur as the value for a resolution-controlling parameter is quasi-statically increased. To our knowledge, this is the first elucidation of how the pervasiveness and hierarchy, both hallmarks of community structure of real-world networks, are unified.
Affiliation(s)
- Hiroshi Okamoto
  - Department of Bioengineering, The University of Tokyo, Tokyo, 113-8656, Japan
  - DWANGO Co., Ltd., Tokyo, Japan
- Xule Qiu
  - FUJIFILM Business Innovation Corp., Tokyo, Japan
|
19
|
Peel L, Peixoto TP, De Domenico M. Statistical inference links data and theory in network science. Nat Commun 2022; 13:6794. [PMID: 36357376 PMCID: PMC9649740 DOI: 10.1038/s41467-022-34267-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 06/01/2021] [Accepted: 10/18/2022] [Indexed: 11/11/2022] Open
Abstract
The number of network science applications across many different fields has been rapidly increasing. Surprisingly, the development of theory and domain-specific applications often occur in isolation, risking an effective disconnect between theoretical and methodological advances and the way network science is employed in practice. Here we address this risk constructively, discussing good practices to guarantee more successful applications and reproducible results. We endorse designing statistically grounded methodologies to address challenges in network science. This approach allows one to explain observational data in terms of generative models, naturally deal with intrinsic uncertainties, and strengthen the link between theory and applications. Theoretical models and structures recovered from measured data serve for analysis of complex networks. The authors discuss here existing gaps between theoretical methods and real-world applied networks, and potential ways to improve the interplay between theory and applications.
|
20
|
Finite-state parameter space maps for pruning partitions in modularity-based community detection. Sci Rep 2022; 12:15928. [PMID: 36151268 PMCID: PMC9508178 DOI: 10.1038/s41598-022-20142-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Accepted: 09/09/2022] [Indexed: 11/08/2022] Open
Abstract
Partitioning networks into communities of densely connected nodes is an important tool used widely across different applications, with numerous methods and software packages available for community detection. Modularity-based methods require parameters to be selected (or assume defaults) to control the resolution and, in multilayer networks, interlayer coupling. Meanwhile, most useful algorithms are heuristics yielding different near-optimal results upon repeated runs (even at the same parameters). To address these difficulties, we combine recent developments into a simple-to-use framework for pruning a set of partitions to a subset that are self-consistent by an equivalence with the objective function for inference of a degree-corrected planted partition stochastic block model (SBM). Importantly, this combined framework reduces some of the problems associated with the stochasticity that is inherent in the use of heuristics for optimizing modularity. In our examples, the pruning typically highlights only a small number of partitions that are fixed points of the corresponding map on the set of somewhere-optimal partitions in the parameter space. We also derive resolution parameter upper bounds for fitting a constrained SBM of K blocks and demonstrate that these bounds hold in practice, further guiding parameter space regions to consider. With publicly available code ( http://github.com/ragibson/ModularityPruning ), our pruning procedure provides a new baseline for using modularity-based community detection in practice.
|
21
|
Assessment of Discrete BAT-Modified (DBAT-M) Optimization Algorithm for Community Detection in Complex Network. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING 2022. [DOI: 10.1007/s13369-022-07229-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
22
|
Gribel D, Gendreau M, Vidal T. Semi-supervised clustering with inaccurate pairwise annotations. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.05.035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
|
23
|
Zhang Z, Wan J, Zhou M, Lu K, Chen G, Liao H. Information diffusion-aware likelihood maximization optimization for community detection. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.04.009] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/27/2022]
|
24
|
Aggarwal K, Arora A. Detecting Community Structure in Financial Markets Using the Bat Optimization Algorithm. INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY PROJECT MANAGEMENT 2022. [DOI: 10.4018/ijitpm.313421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
Lucid representations of the hidden structure of real-world applications have attracted complex network research communities and triggered a vast number of solutions to complex network problems. In the same direction, this paper first proposes a methodology that acts on a financial dataset and constructs a stock correlation network of four stock indexes based on closing stock prices. The aim of this research is to form effective stock communities based on their complex price pattern dependencies (i.e., simultaneous fluctuations in the stock prices of companies in time series data). The paper proposes a community detection approach for stock correlation networks using the BAT optimization algorithm, aiming to achieve high modularity and better-correlated communities. Theoretical analysis and empirical modularity results show that the BAT algorithm outperforms standard network community detection algorithms (greedy and label propagation).
Affiliation(s)
- Kirti Aggarwal
  - Jaypee Institute of Information Technology, Noida, India
- Anuja Arora
  - Jaypee Institute of Information Technology, Noida, India
|
25
|
Wiratsudakul A, Wongnak P, Thanapongtharm W. Emerging infectious diseases may spread across pig trade networks in Thailand once introduced: a network analysis approach. Trop Anim Health Prod 2022; 54:209. [PMID: 35687155 DOI: 10.1007/s11250-022-03205-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/04/2021] [Accepted: 05/25/2022] [Indexed: 11/30/2022]
Abstract
In Thailand, pork is one of the most consumed meats nationwide. Pig farming is hence an important business in the country. However, 95% of the farms are considered smallholders raising 50 pigs or fewer. With limited budgets and resources, the biosecurity level on these farms is relatively low. Pig movements have previously been identified as a risk factor in the spread of infectious diseases. Therefore, the present study aimed to explicitly analyze the pig movement network structure and assess its vulnerability to the spread of emerging diseases in Thailand. We used official electronic records of nationwide pig movements throughout the year 2021 to construct a directed weighted one-mode network. Degree centrality, degree distribution, connected components, network community, and modularity were measured to explore the network architectures and properties. In this network, 484,483 pig movements were captured. Of these, 379,948 (78.42%) were toward slaughterhouses and hence excluded from further analyses. From the remaining links, we suggest that the pig movement network in Thailand is vulnerable to the spread of emerging infectious diseases. Within the network, we found a strongly connected component (SCC) connecting 1044 subdistricts (38.6% of the nodes), a giant weakly connected component (GWCC) covering 98.2% of the nodes (2654/2704), and inter-regional communities with an overall network modularity of 0.68. A disease may therefore spread rapidly throughout the country. A better understanding of the nationwide pig movement networks is helpful in tailoring control interventions to cope with newly emerged diseases once introduced.
Affiliation(s)
- Anuwat Wiratsudakul
  - Department of Clinical Sciences and Public Health and the Monitoring and Surveillance Center for Zoonotic Diseases in Wildlife and Exotic Animals, Faculty of Veterinary Science, Mahidol University, Nakhon Pathom, Thailand
- Phrutsamon Wongnak
  - Université de Lyon, INRAE, VetAgro Sup, UMR EPIA, 69280, Marcy-l'Etoile, France
  - Université Clermont Auvergne, INRAE, VetAgro Sup, UMR EPIA, 63122, Saint-Genès-Champanelle, France
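The Wiratsudakul et al. abstract summarizes the movement network by its strongly connected component (SCC) and giant weakly connected component (GWCC). As a sketch of how those two quantities differ, the code below finds both on a small made-up directed graph, using Kosaraju's algorithm for the SCCs. The node labels, edges, and function names are assumptions for illustration only and are unrelated to the study's data or software.

```python
from collections import defaultdict


def strongly_connected_components(edges):
    """Kosaraju's algorithm (iterative): sets of mutually reachable nodes."""
    graph, reverse, nodes = defaultdict(list), defaultdict(list), set()
    for u, v in edges:
        graph[u].append(v)
        reverse[v].append(u)
        nodes.update((u, v))

    # Pass 1: record nodes in order of depth-first-search completion.
    order, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:  # all neighbours explored, so this node is finished
                order.append(node)
                stack.pop()

    # Pass 2: search the reversed graph in reverse completion order.
    components, assigned = [], set()
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for nxt in reverse[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    component.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components


def weakly_connected_components(edges):
    """Connected components after ignoring edge direction."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    components, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        seen.add(start)
        component, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    component.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components


# Hypothetical movements: A, B and C trade in a cycle; C and E both ship to D.
movements = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("E", "D")]
sccs = strongly_connected_components(movements)
wccs = weakly_connected_components(movements)
largest_scc = max(sccs, key=len)
print(sorted(largest_scc), [sorted(c) for c in wccs])
```

Only the cycle A, B, C forms an SCC with more than one node, while all five nodes fall into a single weakly connected component. The same distinction explains why the reported GWCC covers far more nodes than the SCC.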
|
26
|
|
27
|
Wang H, Ma C, Chen HS, Lai YC, Zhang HF. Full reconstruction of simplicial complexes from binary contagion and Ising data. Nat Commun 2022; 13:3043. [PMID: 35650211 PMCID: PMC9160016 DOI: 10.1038/s41467-022-30706-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 08/23/2021] [Accepted: 05/13/2022] [Indexed: 11/29/2022] Open
Abstract
Previous efforts on data-based reconstruction focused on complex networks with pairwise or two-body interactions. There is a growing interest in networks with higher-order or many-body interactions, raising the need to reconstruct such networks based on observational data. We develop a general framework combining statistical inference and expectation maximization to fully reconstruct 2-simplicial complexes with two- and three-body interactions based on binary time-series data from two types of discrete-state dynamics. We further articulate a two-step scheme to improve the reconstruction accuracy while significantly reducing the computational load. Through synthetic and real-world 2-simplicial complexes, we validate the framework by showing that all the connections can be faithfully identified and the full topology of the 2-simplicial complexes can be inferred. The effects of noisy data or stochastic disturbance are studied, demonstrating the robustness of the proposed framework. Data-driven recovery of topology is challenging for networks beyond pairwise interactions. The authors propose a framework to reconstruct complex networks with higher-order interactions from time series, focusing on networks with 2-simplexes where social contagion and Ising dynamics generate binary data.
Affiliation(s)
- Huan Wang
  - The Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Mathematical Science, Anhui University, Hefei, 230601, China
- Chuang Ma
  - School of Internet, Anhui University, Hefei, 230601, China
- Han-Shuang Chen
  - School of Physics and Material Science, Anhui University, Hefei, 230601, China
- Ying-Cheng Lai
  - School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ, 85287, USA
- Hai-Feng Zhang
  - The Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Mathematical Science, Anhui University, Hefei, 230601, China
|
28
|
Autoencoder Model Using Edge Enhancement to Detect Communities in Complex Networks. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING 2022. [DOI: 10.1007/s13369-022-06747-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
29
|
PITX1 Is a Regulator of TERT Expression in Prostate Cancer with Prognostic Power. Cancers (Basel) 2022; 14:cancers14051267. [PMID: 35267575 PMCID: PMC8909694 DOI: 10.3390/cancers14051267] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/11/2022] [Revised: 02/23/2022] [Accepted: 02/24/2022] [Indexed: 02/05/2023] Open
Simple Summary
Most prostate cancer is of an indolent form and is curable. However, some prostate cancer belongs to rather aggressive subtypes leading to metastasis and death, and immediate therapy is mandatory. However, for these, the therapeutic options are highly invasive, such as radical prostatectomy, radiation or brachytherapy. Hence, a precise diagnosis of these tumor subtypes is needed, and the thus far applied diagnostic means are insufficient for this. Besides this, for their endless cell divisions, prostate cancer cells need the enzyme telomerase to elongate their telomeres (chromatin endings). In this study, we developed a gene regulatory model based on large data from transcription profiles from prostate cancer and chromatin-immuno-precipitation studies. We identified the developmental regulator PITX1 regulating telomerase. Besides observing experimental evidence of PITX1's functional role in telomerase regulation, we also found PITX1 serving as a prognostic marker, as concluded from an analysis of more than 15,000 prostate cancer samples.
Abstract
The current risk stratification in prostate cancer (PCa) is frequently insufficient to adequately predict disease development and outcome. One hallmark of cancer is telomere maintenance. For telomere maintenance, PCa cells exclusively employ telomerase, making it essential for this cancer entity. However, TERT, the catalytic protein component of the reverse transcriptase telomerase, itself does not suit as a prognostic marker for prostate cancer as it is rather low expressed. We investigated if, instead of TERT, transcription factors regulating TERT may suit as prognostic markers. To identify transcription factors regulating TERT, we developed and applied a new gene regulatory modeling strategy to a comprehensive transcriptome dataset of 445 primary PCa. Six transcription factors were predicted as TERT regulators, and most prominently, the developmental morphogenic factor PITX1. PITX1 expression positively correlated with telomere staining intensity in PCa tumor samples. Functional assays and chromatin immune-precipitation showed that PITX1 activates TERT expression in PCa cells. Clinically, we observed that PITX1 is an excellent prognostic marker, as concluded from an analysis of more than 15,000 PCa samples. PITX1 expression in tumor samples associated with (i) increased Ki67 expression indicating increased tumor growth, (ii) a worse prognosis, and (iii) correlated with telomere length.
|
30
|
Elsisy A, Mandviwalla A, Szymanski BK, Sharkey T. A network generator for covert network structures. Inf Sci (N Y) 2022; 584:387-398. [PMID: 37927357 PMCID: PMC10620467 DOI: 10.1016/j.ins.2021.10.066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2021] [Revised: 09/12/2021] [Accepted: 10/25/2021] [Indexed: 11/26/2022]
Abstract
We focus on organizational structures in covert networks, such as criminal or terrorist networks. Their members engage in illegal activities and attempt to hide their association and interactions with these networks. Hence, data about such networks are incomplete. We introduce a novel method of rewiring covert networks parameterized by the edge connectivity standard deviation. The generated networks are statistically similar to themselves and to the original network. The higher-level organizational structures are modeled as a multi-layer network while the lowest level uses the Stochastic Block Model. Such synthetic networks provide alternative structures for data about the original network. Using them, analysts can find structures that are frequent, therefore stable under perturbations. Another application is to anonymize generated networks and use them for testing new software developed in open research facilities. The results indicate that modeling edge structure and the hierarchy together is essential for generating networks that are statistically similar but not identical to each other or the original network. In experiments, we generate many synthetic networks from two covert networks. Only a few structures of synthetic networks repeat, with the most stable ones shared by 18% of all synthetic networks, making them strong candidates for the ground truth structure.
Affiliation(s)
- Amr Elsisy
  - Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
  - Network Science and Technology Center, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
- Aamir Mandviwalla
  - Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
  - Network Science and Technology Center, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
- Boleslaw K. Szymanski
  - Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
  - Network Science and Technology Center, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
  - Społeczna Akademia Nauk, Łódź, Poland
- Thomas Sharkey
  - Network Science and Technology Center, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
  - Department of Industrial Engineering, Clemson University, Clemson, SC 29631, USA
|
31
|
Bartlett TE. Comodularity and detection of co-communities. Phys Rev E 2021; 104:054309. [PMID: 34942704 DOI: 10.1103/physreve.104.054309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2021] [Accepted: 11/08/2021] [Indexed: 11/07/2022]
Abstract
This paper introduces the notion of comodularity, to cocluster observations of bipartite networks into co-communities. The task of coclustering is to group together nodes of one type with nodes of another type, according to the interactions that are the most similar. The measure of comodularity is introduced to assess the strength of co-communities, as well as to arrange the representation of nodes and clusters for visualization, and to define an objective function for optimization. We demonstrate the usefulness of our proposed methodology on simulated data, and with examples from genomics and consumer-product reviews.
Affiliation(s)
- Thomas E Bartlett
  - Department of Statistical Science, University College London, London WC1E 7HB, United Kingdom
|
32
|
Taborsky P, Vermue L, Korzepa M, Morup M. The Bayesian Cut. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2021; 43:4111-4124. [PMID: 32406825 DOI: 10.1109/tpami.2020.2994396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
An important task in the analysis of graphs is separating nodes into densely connected groups with little interaction between each other. Prominent methods here include flow-based graph cutting procedures as well as statistical network modeling approaches. However, adequately accounting for this, the so-called community structure, in complex networks remains a major challenge. We present a novel generic Bayesian probabilistic model for graph cutting in which we derive an analytical solution to the marginalization of nuisance parameters under constraints enforcing community structure. As a part of the solution, a large-scale approximation for integrals involving multiple incomplete gamma functions is derived. Our multiple cluster solution presents a generic tool for Bayesian inference on Poisson weighted graphs across different domains. Applied to three real-world social networks as well as three image segmentation problems, our approach performs on par with or better than existing spectral graph cutting and community detection methods, while learning the underlying parameter space. The developed procedure provides a principled statistical framework for graph cutting, and the Bayesian Cut source code provided enables easy adoption of the procedure as an alternative to existing graph cutting methods.
|
33
|
Mitrai I, Tang W, Daoutidis P. Stochastic blockmodeling for learning the structure of optimization problems. AIChE J 2021. [DOI: 10.1002/aic.17415] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/07/2023]
Affiliation(s)
- Ilias Mitrai
  - Department of Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, Minnesota, USA
- Wentao Tang
  - Projects and Technology, Shell Global Solutions (U.S.) Inc., Houston, Texas, USA
- Prodromos Daoutidis
  - Department of Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, Minnesota, USA
|
34
|
Tang QY, Kaneko K. Dynamics-Evolution Correspondence in Protein Structures. PHYSICAL REVIEW LETTERS 2021; 127:098103. [PMID: 34506164 DOI: 10.1103/physrevlett.127.098103] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 03/30/2021] [Accepted: 07/28/2021] [Indexed: 06/13/2023]
Abstract
The genotype-phenotype mapping of proteins is a fundamental question in structural biology. In this Letter, with the analysis of a large dataset of proteins from hundreds of protein families, we quantitatively demonstrate the correlations between the noise-induced protein dynamics and mutation-induced variations of native structures, indicating the dynamics-evolution correspondence of proteins. Based on the investigations of the linear responses of native proteins, the origin of such a correspondence is elucidated. It is essential that the noise- and mutation-induced deformations of the proteins are restricted on a common low-dimensional subspace, as confirmed from the data. These results suggest an evolutionary mechanism of the proteins gaining both dynamical flexibility and evolutionary structural variability.
Affiliation(s)
- Qian-Yuan Tang
  - Center for Complex Systems Biology, Universal Biology Institute, University of Tokyo, Komaba 3-8-1, Meguro-ku, Tokyo 153-8902, Japan
  - Lab for Neural Computation and Adaptation, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Kunihiko Kaneko
  - Center for Complex Systems Biology, Universal Biology Institute, University of Tokyo, Komaba 3-8-1, Meguro-ku, Tokyo 153-8902, Japan
|
35
|
Faccin M, Schaub MT, Delvenne JC. State Aggregations in Markov Chains and Block Models of Networks. PHYSICAL REVIEW LETTERS 2021; 127:078301. [PMID: 34459654 DOI: 10.1103/physrevlett.127.078301] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/04/2020] [Revised: 06/17/2021] [Accepted: 07/15/2021] [Indexed: 06/13/2023]
Abstract
We consider state-aggregation schemes for Markov chains from an information-theoretic perspective. Specifically, we consider aggregating the states of a Markov chain such that the mutual information of the aggregated states separated by T time steps is maximized. We show that for T=1 this recovers the maximum-likelihood estimator of the degree-corrected stochastic block model as a particular case, which enables us to explain certain features of the likelihood landscape of this generative network model from a dynamical lens. We further highlight how we can uncover coherent, long-range dynamical modules for which considering a timescale T≫1 is essential. We demonstrate our results using synthetic flows and real-world ocean currents, where we are able to recover the fundamental features of the surface currents of the oceans.
Affiliation(s)
- Mauro Faccin
  - ICTEAM, Université catholique de Louvain, 1348 Louvain-la-Neuve, Belgium
- Michael T Schaub
  - Department of Engineering Science, University of Oxford, Oxford OX1 2JD, United Kingdom
  - Department of Computer Science, RWTH Aachen University, 52074 Aachen, Germany
- Jean-Charles Delvenne
  - ICTEAM, Université catholique de Louvain, 1348 Louvain-la-Neuve, Belgium
  - CORE, Université catholique de Louvain, 1348 Louvain-la-Neuve, Belgium
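The objective described in the Faccin, Schaub and Delvenne abstract (mutual information between aggregated states of a Markov chain) can be illustrated with a minimal sketch for the T=1 case only. The function name `aggregated_mutual_information`, its argument layout, and the choice to pass the stationary distribution in by hand are assumptions made for illustration; this is not the authors' code.

```python
import math


def aggregated_mutual_information(P, pi, labels):
    """Mutual information (in nats) between aggregated states one step apart.

    P      -- row-stochastic transition matrix as a list of lists
    pi     -- stationary distribution of P (assumed to be supplied by the caller)
    labels -- labels[i] is the group that state i is aggregated into
    """
    groups = sorted(set(labels))
    index = {g: a for a, g in enumerate(groups)}
    k = len(groups)

    # Joint distribution of (group at time t, group at time t + 1).
    joint = [[0.0] * k for _ in range(k)]
    for i, row in enumerate(P):
        for j, p_ij in enumerate(row):
            joint[index[labels[i]]][index[labels[j]]] += pi[i] * p_ij

    row_marg = [sum(r) for r in joint]
    col_marg = [sum(joint[a][b] for a in range(k)) for b in range(k)]

    mi = 0.0
    for a in range(k):
        for b in range(k):
            if joint[a][b] > 0.0:
                mi += joint[a][b] * math.log(joint[a][b] / (row_marg[a] * col_marg[b]))
    return mi


# Hypothetical two-state chain that tends to stay in its current state.
P = [[0.9, 0.1], [0.1, 0.9]]
pi = [0.5, 0.5]
print(aggregated_mutual_information(P, pi, [0, 1]))  # each state kept separate
print(aggregated_mutual_information(P, pi, [0, 0]))  # everything merged into one group
```

Merging all states into a single group carries no information about the next step, so its value is zero, while keeping the two persistent states apart gives a positive value. A longer timescale T could be handled by passing the T-th power of `P` instead.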
|
36
|
Liu X, Ding N, Liu C, Zhang Y, Tang T. Novel social network community discovery method combined local distance with node rank optimization function. APPL INTELL 2021. [DOI: 10.1007/s10489-020-02040-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/02/2023]
|
37
|
Chodrow PS, Veldt N, Benson AR. Generative hypergraph clustering: From blockmodels to modularity. SCIENCE ADVANCES 2021; 7:eabh1303. [PMID: 34233880 PMCID: PMC11559555 DOI: 10.1126/sciadv.abh1303] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 02/17/2021] [Accepted: 05/24/2021] [Indexed: 06/13/2023]
Abstract
Hypergraphs are a natural modeling paradigm for networked systems with multiway interactions. A standard task in network analysis is the identification of closely related or densely interconnected nodes. We propose a probabilistic generative model of clustered hypergraphs with heterogeneous node degrees and edge sizes. Approximate maximum likelihood inference in this model leads to a clustering objective that generalizes the popular modularity objective for graphs. From this, we derive an inference algorithm that generalizes the Louvain graph community detection method, and a faster, specialized variant in which edges are expected to lie fully within clusters. Using synthetic and empirical data, we demonstrate that the specialized method is highly scalable and can detect clusters where graph-based methods fail. We also use our model to find interpretable higher-order structure in school contact networks, U.S. congressional bill cosponsorship and committees, product categories in copurchasing behavior, and hotel locations from web browsing sessions.
Affiliation(s)
- Philip S Chodrow
  - Department of Mathematics, University of California, Los Angeles, 520 Portola Plaza, Los Angeles, CA 90095, USA
- Nate Veldt
  - Center for Applied Mathematics, Cornell University, 657 Frank H.T. Rhodes Hall, Ithaca, NY 14853, USA
- Austin R Benson
  - Department of Computer Science, Cornell University, 413B Gates Hall, Ithaca, NY 14853, USA
|
38
|
Gu W, Tandon A, Ahn YY, Radicchi F. Principled approach to the selection of the embedding dimension of networks. Nat Commun 2021; 12:3772. [PMID: 34145234 PMCID: PMC8213704 DOI: 10.1038/s41467-021-23795-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/02/2020] [Accepted: 05/18/2021] [Indexed: 11/08/2022]
Abstract
Network embedding is a general-purpose machine learning technique that encodes network structure in vector spaces with tunable dimension. Choosing an appropriate embedding dimension (small enough to be efficient and large enough to be effective) is challenging but necessary to generate embeddings applicable to a multitude of tasks. Existing strategies for the selection of the embedding dimension rely on performance maximization in downstream tasks. Here, we propose a principled method such that all structural information of a network is parsimoniously encoded. The method is validated on various embedding algorithms and a large corpus of real-world networks. The embedding dimension selected by our method in real-world networks suggests that efficient encoding in low-dimensional spaces is usually possible.
Affiliation(s)
- Weiwei Gu
  - UrbanNet Lab, College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, P. R. China
- Aditya Tandon
  - Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, USA
- Yong-Yeol Ahn
  - Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, USA
  - Network Science Institute, Indiana University, Bloomington (IUNI), IN, USA
  - Connection Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Filippo Radicchi
  - Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, USA
|
39
|
Abstract
Network-based procedures for topic detection in huge text collections offer an intuitive alternative to probabilistic topic models. We present in detail a method that is especially designed with the requirements of domain experts in mind. Like similar methods, it employs community detection in term co-occurrence graphs, but it is enhanced by including a resolution parameter that can be used for changing the targeted topic granularity. We also establish a term ranking and use semantic word-embedding for presenting term communities in a way that facilitates their interpretation. We demonstrate the application of our method with a widely used corpus of general news articles and show the results of detailed social-sciences expert evaluations of detected topics at various resolutions. A comparison with topics detected by Latent Dirichlet Allocation is also included. Finally, we discuss factors that influence topic interpretation.
|
40
|
Calderer G, Kuijjer ML. Community Detection in Large-Scale Bipartite Biological Networks. Front Genet 2021; 12:649440. [PMID: 33968132 PMCID: PMC8099108 DOI: 10.3389/fgene.2021.649440] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/04/2021] [Accepted: 03/18/2021] [Indexed: 11/13/2022]
Abstract
Networks are useful tools to represent and analyze interactions on a large or genome-wide scale and have therefore been widely used in biology. Many biological networks, such as those that represent regulatory interactions, drug-gene, or gene-disease associations, are of a bipartite nature, meaning they consist of two different types of nodes, with connections only forming between the different node sets. Analysis of such networks requires methodologies that are specifically designed to handle their bipartite nature. Community structure detection is a method used to identify clusters of nodes in a network. This approach is especially helpful in large-scale biological network analysis, as it can find structure in networks that often resemble a "hairball" of interactions in visualizations. Often, the communities identified in biological networks are enriched for specific biological processes and thus allow one to assign drugs, regulatory molecules, or diseases to such processes. In addition, comparison of community structures between different biological conditions can help to identify how network rewiring may lead to tissue development or disease, for example. In this mini review, we give a theoretical basis of different methods that can be applied to detect communities in bipartite biological networks. We introduce and discuss different scores that can be used to assess the quality of these community structures. We then apply a wide range of methods to a drug-gene interaction network to highlight the strengths and weaknesses of these methods in their application to large-scale, bipartite biological networks.
Affiliation(s)
- Genís Calderer
  - Centre for Molecular Medicine Norway, University of Oslo, Oslo, Norway
- Marieke L Kuijjer
  - Centre for Molecular Medicine Norway, University of Oslo, Oslo, Norway
  - Department of Pathology, Leiden University Medical Center, Leiden, Netherlands
|
41
|
Ma R, Barnett I. The asymptotic distribution of modularity in weighted signed networks. Biometrika 2021; 108:1-16. [PMID: 34305154 PMCID: PMC8300091 DOI: 10.1093/biomet/asaa059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Modularity is a popular metric for quantifying the degree of community structure within a network. The distribution of the largest eigenvalue of a network's edge weight or adjacency matrix is well studied and is frequently used as a substitute for modularity when performing statistical inference. However, we show that the largest eigenvalue and modularity are asymptotically uncorrelated, which suggests the need for inference directly on modularity itself when the network size is large. To this end, we derive the asymptotic distributions of modularity in the case where the network's edge weight matrix belongs to the Gaussian orthogonal ensemble, and study the statistical power of the corresponding test for community structure under some alternative models. We empirically explore universality extensions of the limiting distribution and demonstrate the accuracy of these asymptotic distributions through Type I error simulations. We also compare the empirical powers of the modularity based tests with some existing methods. Our method is then used to test for the presence of community structure in two real data applications.
Affiliation(s)
- Rong Ma
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, U.S.A
- Ian Barnett
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, U.S.A
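For readers unfamiliar with the modularity metric that the Ma and Barnett abstract refers to, the following minimal sketch computes the standard Newman-Girvan form, Q = (1/2m) Σ_ij (A_ij − k_i k_j / 2m) δ(c_i, c_j), for an undirected graph with nonnegative weights. This is an assumption made for illustration: the paper treats weighted signed networks, whose definition may differ, and the function name `modularity` and the example graph are hypothetical.

```python
def modularity(adjacency, labels):
    """Newman-Girvan modularity of a partition of an undirected graph.

    adjacency -- symmetric matrix (list of lists) of nonnegative edge weights
    labels    -- labels[i] is the community assigned to node i
    """
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    two_m = sum(degree)  # twice the total edge weight
    if two_m == 0:
        raise ValueError("graph has no edges, so modularity is undefined")

    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # Observed weight minus the weight expected under the degree-preserving null model.
                q += adjacency[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


# Hypothetical example: two triangles (nodes 0-2 and 3-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for u, v in edges:
    A[u][v] = A[v][u] = 1

print(modularity(A, [0, 0, 0, 1, 1, 1]))  # the two triangles as communities
print(modularity(A, [0, 0, 0, 0, 0, 0]))  # a single community
```

Splitting the graph into its two triangles gives Q = 5/14, while placing every node in one community gives zero, which is why larger values are read as stronger community structure.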
|
42
|
Portes LL, Small M. Navigating differential structures in complex networks. Phys Rev E 2021; 102:062301. [PMID: 33466036 DOI: 10.1103/physreve.102.062301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2020] [Accepted: 11/20/2020] [Indexed: 11/07/2022]
Abstract
Structural changes in a network representation of a system, due to different experimental conditions, different connectivity across layers, or to its time evolution, can provide insight on its organization, function, and on how it responds to external perturbations. The deeper understanding of how gene networks cope with diseases and treatments is maybe the most incisive demonstration of the gains obtained through this differential network analysis point of view, which led to an explosion of new numeric techniques in the last decade. However, where to focus one's attention, or how to navigate through the differential structures in the context of large networks, can be overwhelming even for a few experimental conditions. In this paper, we propose a theory and a methodological implementation for the characterization of shared "structural roles" of nodes simultaneously within and between networks. Inspired by recent methodological advances in chaotic phase synchronization analysis, we show how the information about the shared structures of a set of networks can be split and organized in an automatic fashion, in scenarios with very different (i) community sizes, (ii) total number of communities, and (iii) even for a large number of 100 networks compared using numerical benchmarks generated by a stochastic block model. Then, we investigate how the network size, number of networks, and mean size of communities influence the method performance in a series of Monte Carlo experiments. To illustrate its potential use in a more challenging scenario with real-world data, we show evidence that the method can still split and organize the structural information of a set of four gene coexpression networks obtained from two cell types × two treatments (interferon-β stimulated or control). 
Aside from its potential use as an automatic feature extraction and preprocessing tool, we discuss that another strength of the method is its "story-telling"-like characterization of the information encoded in a set of networks, which can be used to pinpoint unexpected shared structure, leading to further investigations and providing new insights. Finally, the method is flexible enough to address different research-field-specific questions, as it does not restrict which scientifically meaningful characteristic (or relevant feature) of a node is used.
Affiliation(s)
- Leonardo L Portes
  - Complex Systems Group, Department of Mathematics and Statistics, University of Western Australia, Nedlands, Perth, WA 6009, Australia
- Michael Small
  - Complex Systems Group, Department of Mathematics and Statistics, University of Western Australia, Nedlands, Perth, WA 6009, Australia
  - Mineral Resources, CSIRO, Kensington, Perth, WA 6151, Australia
|
43
|
Zhao F, Ye M, Huang SL. Exact Recovery of Stochastic Block Model by Ising Model. ENTROPY 2021; 23:e23010065. [PMID: 33401691 PMCID: PMC7823472 DOI: 10.3390/e23010065] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/29/2020] [Revised: 12/20/2020] [Accepted: 12/30/2020] [Indexed: 11/16/2022]
Abstract
In this paper, we study the phase transition property of an Ising model defined on a special random graph, the stochastic block model (SBM). Based on the Ising model, we propose a stochastic estimator to achieve the exact recovery for the SBM. The stochastic algorithm can be transformed into an optimization problem, which includes the special cases of maximum likelihood and maximum modularity. Additionally, we give an unbiased convergent estimator for the model parameters of the SBM, which can be computed in constant time. Finally, we use Metropolis sampling to realize the stochastic estimator and verify the phase transition phenomenon through experiments.
Affiliation(s)
- Feng Zhao
  - Department of Electronics, Tsinghua University, Beijing 100084, China
- Min Ye
  - Tsinghua Berkeley Shenzhen Institute, Berkeley, CA 94704, USA
- Shao-Lun Huang
  - Tsinghua Berkeley Shenzhen Institute, Berkeley, CA 94704, USA
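The stochastic block model named in the Zhao, Ye and Huang abstract can be illustrated with a minimal sampler: each node belongs to a block, and each pair of nodes is linked independently with a probability that depends only on the two blocks. The function name `sample_sbm`, its parameters, and the example probabilities are assumptions made for illustration; the sketch covers graph generation only, not the paper's Ising-based recovery procedure.

```python
import random


def sample_sbm(block_sizes, edge_probs, seed=None):
    """Sample an undirected stochastic block model graph without self-loops.

    block_sizes -- number of nodes in each block, e.g. [50, 50]
    edge_probs  -- edge_probs[a][b] is the link probability between blocks a and b
    seed        -- optional seed for reproducible draws
    """
    rng = random.Random(seed)
    # Node labels are laid out block by block: [0, 0, ..., 1, 1, ...].
    labels = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    n = len(labels)

    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # One independent Bernoulli draw per unordered pair of nodes.
            if rng.random() < edge_probs[labels[i]][labels[j]]:
                adjacency[i][j] = adjacency[j][i] = 1
    return adjacency, labels


# Hypothetical two-block example with dense blocks and sparse links between them.
adjacency, labels = sample_sbm([5, 5], [[0.8, 0.1], [0.1, 0.8]], seed=0)
print(labels)
print(sum(map(sum, adjacency)) // 2, "edges")
```

Community recovery then amounts to inferring `labels` from `adjacency` alone, which becomes harder as the within-block and between-block probabilities approach each other.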
|
44
|
Correspondence analysis-based network clustering and importance of degenerate solutions unification of spectral clustering and modularity maximization. SOCIAL NETWORK ANALYSIS AND MINING 2020. [DOI: 10.1007/s13278-020-00686-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
45
|
Uncovering New Drug Properties in Target-Based Drug-Drug Similarity Networks. Pharmaceutics 2020; 12:pharmaceutics12090879. [PMID: 32947845 PMCID: PMC7557376 DOI: 10.3390/pharmaceutics12090879] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/10/2020] [Revised: 09/09/2020] [Accepted: 09/10/2020] [Indexed: 01/19/2023]
Abstract
Despite recent advances in bioinformatics, systems biology, and machine learning, the accurate prediction of drug properties remains an open problem. Indeed, because the biological environment is a complex system, the traditional approach, based on knowledge about the chemical structures, cannot fully explain the nature of interactions between drugs and biological targets. Consequently, in this paper, we propose an unsupervised machine learning approach that uses the information we know about drug–target interactions to infer drug properties. To this end, we define drug similarity based on drug–target interactions and build a weighted Drug–Drug Similarity Network according to the drug–drug similarity relationships. Using an energy-model network layout, we generate drug communities associated with specific, dominant drug properties. DrugBank confirms the properties of 59.52% of the drugs in these communities, and 26.98% are existing drug repositioning hints we reconstruct with our DDSN approach. The remaining 13.49% of the drugs seem not to match the dominant pharmacologic property; thus, we consider them potential drug repurposing hints. The resources required to test all these repurposing hints are considerable. Therefore, we introduce a mechanism of prioritization based on the betweenness/degree node centrality. Using betweenness/degree as an indicator of drug repurposing potential, we select Azelaic acid and Meprobamate as a possible antineoplastic and antifungal, respectively. Finally, we use a test procedure based on molecular docking to analyze Azelaic acid and Meprobamate’s repurposing.
|
46
|
Smiljanić J, Edler D, Rosvall M. Mapping flows on sparse networks with missing links. Phys Rev E 2020; 102:012302. [PMID: 32794952 DOI: 10.1103/physreve.102.012302] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/12/2019] [Accepted: 06/09/2020] [Indexed: 11/07/2022]
Abstract
Unreliable network data can cause community-detection methods to overfit and highlight spurious structures with misleading information about the organization and function of complex systems. Here we show how to detect significant flow-based communities in sparse networks with missing links using the map equation. Since the map equation builds on Shannon entropy estimation, it assumes complete data such that analyzing undersampled networks can lead to overfitting. To overcome this problem, we incorporate a Bayesian approach with assumptions about network uncertainties into the map equation framework. Results in both synthetic and real-world networks show that the Bayesian estimate of the map equation provides a principled approach to revealing significant structures in undersampled networks.
Affiliation(s)
- Jelena Smiljanić
  - Integrated Science Lab, Department of Physics, Umeå University, SE-901 87 Umeå, Sweden
  - Scientific Computing Laboratory, Center for the Study of Complex Systems, Institute of Physics Belgrade, University of Belgrade, Pregrevica 118, 11080 Belgrade, Serbia
- Daniel Edler
  - Integrated Science Lab, Department of Physics, Umeå University, SE-901 87 Umeå, Sweden
  - Gothenburg Global Biodiversity Centre, Box 461, SE-405 30 Gothenburg, Sweden
  - Department of Biological and Environmental Sciences, University of Gothenburg, Carl Skottsbergs gata 22B, Gothenburg 41319, Sweden
- Martin Rosvall
  - Integrated Science Lab, Department of Physics, Umeå University, SE-901 87 Umeå, Sweden
|
47
|
Polovnikov K, Gorsky A, Nechaev S, Razin SV, Ulianov SV. Non-backtracking walks reveal compartments in sparse chromatin interaction networks. Sci Rep 2020; 10:11398. [PMID: 32647272 PMCID: PMC7347895 DOI: 10.1038/s41598-020-68182-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/01/2020] [Accepted: 06/19/2020] [Indexed: 12/31/2022]
Abstract
Chromatin communities stabilized by protein machinery play an essential role in gene regulation and refine global polymeric folding of the chromatin fiber. However, treatment of these communities in the framework of the classical network theory (stochastic block model, SBM) does not take into account the intrinsic linear connectivity of the chromatin loci. Here we propose the polymer block model, paving the way for community detection in polymer networks. On the basis of this new model we modify the non-backtracking flow operator and suggest the first protocol for annotation of compartmental domains in sparse single-cell Hi-C matrices. In particular, we prove that our approach corresponds to the maximum entropy principle. The benchmark analysis demonstrates that the spectrum of the polymer non-backtracking operator resolves the true compartmental structure up to the theoretical detectability threshold, while all commonly used operators fail above it. We test various operators on real data and conclude that the sizes of the non-backtracking single-cell domains are closest to the sizes of compartments from the population data. Moreover, the found domains clearly segregate in the gene density and correlate with the population compartmental mask, corroborating the biological significance of our annotation of the chromatin compartmental domains in single-cell Hi-C matrices.
Affiliation(s)
- K Polovnikov
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
  - Skolkovo Institute of Science and Technology, Skolkovo, Russia, 143026
- A Gorsky
  - Moscow Institute for Physics and Technology, Dolgoprudnyi, Russia
  - Institute for Information Transmission Problems of RAS, Moscow, Russia
- S Nechaev
  - Interdisciplinary Scientific Center Poncelet (UMI 2615 CNRS), Moscow, Russia, 119002
  - Lebedev Physical Institute RAS, Moscow, Russia, 119991
- S V Razin
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, Russia
- S V Ulianov
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, Russia
|
48
|
Lu X, Cross B, Szymanski BK. Asymptotic resolution bounds of generalized modularity and multi-scale community detection. Inf Sci (N Y) 2020. [DOI: 10.1016/j.ins.2020.03.082] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/24/2022]
|
49
|
Pascual‐García A, Bell T. functionInk: An efficient method to detect functional groups in multidimensional networks reveals the hidden structure of ecological communities. Methods Ecol Evol 2020. [DOI: 10.1111/2041-210x.13377] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/28/2022]
Affiliation(s)
- Thomas Bell
  - Department of Life Sciences, Imperial College London, Ascot, UK
|
50
|
Krakauer D, Bertschinger N, Olbrich E, Flack JC, Ay N. The information theory of individuality. Theory Biosci 2020; 139:209-223. [PMID: 32212028 PMCID: PMC7244620 DOI: 10.1007/s12064-020-00313-7] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 01/06/2020] [Accepted: 03/05/2020] [Indexed: 12/02/2022]
Abstract
Despite the near universal assumption of individuality in biology, there is little agreement about what individuals are and few rigorous quantitative methods for their identification. Here, we propose that individuals are aggregates that preserve a measure of temporal integrity, i.e., "propagate" information from their past into their futures. We formalize this idea using information theory and graphical models. This mathematical formulation yields three principled and distinct forms of individuality (an organismal, a colonial, and a driven form), each of which varies in the degree of environmental dependence and inherited information. This approach can be thought of as a Gestalt approach to evolution where selection makes figure-ground (agent-environment) distinctions using suitable information-theoretic lenses. A benefit of the approach is that it expands the scope of allowable individuals to include adaptive aggregations in systems that are multi-scale, highly distributed, and do not necessarily have physical boundaries such as cell walls or clonal somatic tissue. Such individuals might be visible to selection but hard to detect by observers without suitable measurement principles. The information theory of individuality allows for the identification of individuals at all levels of organization from molecular to cultural, provides a basis for testing assumptions about the natural scales of a system, and argues for the importance of uncertainty reduction through coarse-graining in adaptive systems.
Affiliation(s)
- Nils Bertschinger
  - Frankfurt Institute for Advanced Studies, Frankfurt am Main, Germany
- Eckehard Olbrich
  - Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
- Nihat Ay
  - Santa Fe Institute, Santa Fe, USA
  - Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
|