1
Valle F, Caselle M, Osella M. Exploring the latent space of transcriptomic data with topic modeling. NAR Genom Bioinform 2025; 7:lqaf049. [PMID: 40264683 PMCID: PMC12012681 DOI: 10.1093/nargab/lqaf049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2024] [Revised: 04/03/2025] [Accepted: 04/11/2025] [Indexed: 04/24/2025] Open
Abstract
The availability of high-dimensional transcriptomic datasets is increasing at a tremendous pace, together with the need for suitable computational tools. Clustering and dimensionality reduction methods are popular go-to methods to identify basic structures in these datasets. At the same time, different topic modeling techniques have been developed to organize the deluge of available data of natural language using their latent topical structure. This paper leverages the statistical analogies between text and transcriptomic datasets to compare different topic modeling methods when applied to gene expression data. Specifically, we test their accuracy in the specific task of discovering and reconstructing the tissue structure of the human transcriptome and distinguishing healthy from cancerous tissues. We examine the properties of the latent space recovered by different methods, highlight their differences, and their pros and cons across different tasks. We focus in particular on how different statistical priors can affect the results and their interpretability. Finally, we show that the latent topic space can be a useful low-dimensional embedding space, where a basic neural network classifier can annotate transcriptomic profiles with high accuracy.
Affiliation(s)
- Filippo Valle
- Physics Department, University of Turin and INFN, Via Pietro Giuria 1, 12125 Torino, Italy
- Michele Caselle
- Physics Department, University of Turin and INFN, Via Pietro Giuria 1, 12125 Torino, Italy
- Matteo Osella
- Physics Department, University of Turin and INFN, Via Pietro Giuria 1, 12125 Torino, Italy
2
Hussain MT, Halappanavar M, Chatterjee S, Radicchi F, Fortunato S, Azad A. Parallel median consensus clustering in complex networks. Sci Rep 2025; 15:3788. [PMID: 39885235 PMCID: PMC11782583 DOI: 10.1038/s41598-025-87479-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2024] [Accepted: 01/20/2025] [Indexed: 02/01/2025] Open
Abstract
We develop an algorithm that finds the consensus among many different clustering solutions of a graph. We formulate the problem as a median set partitioning problem and propose a greedy optimization technique. Unlike other approaches that find median set partitions, our algorithm takes graph structure into account and finds a comparable quality solution much faster than the other approaches. For graphs with known communities, our consensus partition captures the actual community structure more accurately than alternative approaches. To make it applicable to large graphs, we remove sequential dependencies from our algorithm and design a parallel algorithm. Our parallel algorithm achieves 35x speedup when utilizing 64 processing cores for large real-world graphs representing mass cytometry data from single-cell experiments.
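As a rough illustration of the median set partitioning objective mentioned in this abstract, the sketch below scores each input clustering by its total distance to all inputs and returns the best one. It is only a naive baseline, not the authors' greedy or parallel algorithm, and it ignores graph structure. The choice of the Mirkin (pair-disagreement) distance and the function names `mirkin_distance` and `best_input_partition` are assumptions made for this example.

```python
from itertools import combinations

def mirkin_distance(p, q):
    """Count node pairs grouped together in one partition but apart in the other.

    p and q are lists of community labels, one label per node.
    """
    return sum(
        (p[u] == p[v]) != (q[u] == q[v])
        for u, v in combinations(range(len(p)), 2)
    )

def best_input_partition(partitions):
    """Return the input partition with the smallest total distance to all inputs.

    This restricts the median search to the inputs themselves (an assumption
    made to keep the sketch short); a true median partition need not be one
    of the inputs.
    """
    return min(
        partitions,
        key=lambda c: sum(mirkin_distance(c, p) for p in partitions),
    )

# Three clusterings of four nodes; two of them agree.
inputs = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
consensus = best_input_partition(inputs)
```

The distance depends only on which nodes share a label, so relabeling communities does not change the result.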
Affiliation(s)
- Md Taufique Hussain
- Department of Intelligent Systems Engineering, Indiana University, Bloomington, IN, USA
- Mahantesh Halappanavar
- Data Sciences and Machine Intelligence Group, Pacific Northwest National Laboratory, Richland, WA, USA
- Samrat Chatterjee
- Data Sciences and Machine Intelligence Group, Pacific Northwest National Laboratory, Richland, WA, USA
- Filippo Radicchi
- Center for Complex Networks and Systems Research (CNetS), Indiana University, Bloomington, IN, USA
- Santo Fortunato
- Center for Complex Networks and Systems Research (CNetS), Indiana University, Bloomington, IN, USA
- Ariful Azad
- Department of Intelligent Systems Engineering, Indiana University, Bloomington, IN, USA
- Department of Computer Science & Engineering, Texas A&M University, College Station, TX, USA
|
3
Mangold L, Roth C. Quantifying metadata relevance to network block structure using description length. Communications Physics 2024; 7:331. [PMID: 39398491 PMCID: PMC11469959 DOI: 10.1038/s42005-024-01819-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Accepted: 09/30/2024] [Indexed: 10/15/2024]
Abstract
Network analysis is often enriched by including an examination of node metadata. In the context of understanding the mesoscale of networks it is often assumed that node groups based on metadata and node groups based on connectivity patterns are intrinsically linked. This assumption is increasingly being challenged, whereby metadata might be entirely unrelated to structure or, similarly, multiple sets of metadata might be relevant to the structure of a network in different ways. We propose the metablox tool to quantify the relationship between a network's node metadata and its mesoscale structure, measuring the strength of the relationship and the type of structural arrangement exhibited by the metadata. We show on a number of synthetic and empirical networks that our tool distinguishes relevant metadata and allows for this in a comparative setting, demonstrating that it can be used as part of systematic meta analyses for the comparison of networks from different domains.
Affiliation(s)
- Lena Mangold
- Centre d’Analyse et de Mathématique Sociales (CNRS/EHESS), 54 Bd Raspail, 75006 Paris, France
- Computational Social Science Team, Centre Marc Bloch (CNRS/MEAE), Friedrichstr. 191, 10117 Berlin, Germany
- Camille Roth
- Centre d’Analyse et de Mathématique Sociales (CNRS/EHESS), 54 Bd Raspail, 75006 Paris, France
- Computational Social Science Team, Centre Marc Bloch (CNRS/MEAE), Friedrichstr. 191, 10117 Berlin, Germany
4
Aref S, Mostajabdaveh M, Chheda H. Bayan algorithm: Detecting communities in networks through exact and approximate optimization of modularity. Phys Rev E 2024; 110:044315. [PMID: 39562863 DOI: 10.1103/physreve.110.044315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2024] [Accepted: 09/24/2024] [Indexed: 11/21/2024]
Abstract
Community detection is a classic network problem with extensive applications in various fields. Its most common method is using modularity maximization heuristics which rarely return an optimal partition or anything similar. Partitions with globally optimal modularity are difficult to compute, and therefore have been underexplored. Using structurally diverse networks, we compare 30 community detection methods including our proposed algorithm that offers optimality and approximation guarantees: the Bayan algorithm. Unlike existing methods, Bayan globally maximizes modularity or approximates it within a factor. Our results show the distinctive accuracy and stability of maximum-modularity partitions in retrieving planted partitions at rates higher than most alternatives for a wide range of parameter settings in two standard benchmarks. Compared to the partitions from 29 other algorithms, maximum-modularity partitions have the best medians for description length, coverage, performance, average conductance, and well clusteredness. These advantages come at the cost of additional computations which Bayan makes possible for small networks (networks that have up to 3000 edges in their largest connected component). Bayan is several times faster than using open-source and commercial solvers for modularity maximization, making it capable of finding optimal partitions for instances that cannot be optimized by any other existing method. Our results point to a few well-performing algorithms, among which Bayan stands out as the most reliable method for small networks. A python implementation of the Bayan algorithm (bayanpy) is publicly available through the package installer for python.
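For readers unfamiliar with the objective this abstract refers to, the following is a minimal sketch of how Newman's modularity Q can be evaluated for a given partition of an undirected, unweighted graph without self-loops. It only scores a partition; it is not the Bayan algorithm and does not use the bayanpy package. The function name `modularity` and the toy graph are illustrative assumptions.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity Q of a partition.

    edges: list of (u, v) pairs of an undirected, unweighted graph (no self-loops).
    community: dict mapping each node to its community label.
    """
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)  # number of edges inside each community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1

    degree_sum = defaultdict(int)  # total degree of each community
    for node, k in degree.items():
        degree_sum[community[node]] += k

    # Q = sum over communities of (fraction of internal edges)
    #     minus (expected fraction under the configuration null model).
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )

# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q_split = modularity(edges, split)
q_single = modularity(edges, {n: "all" for n in range(6)})
```

Splitting the toy graph into its two triangles gives a higher Q than placing all nodes in one community, which always scores zero.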
5
Betzel R, Puxeddu MG, Seguin C. Hierarchical communities in the larval Drosophila connectome: Links to cellular annotations and network topology. Proc Natl Acad Sci U S A 2024; 121:e2320177121. [PMID: 39269775 PMCID: PMC11420166 DOI: 10.1073/pnas.2320177121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Accepted: 05/28/2024] [Indexed: 09/15/2024] Open
Abstract
One of the longstanding aims of network neuroscience is to link a connectome's topological properties (i.e., features defined from connectivity alone) with an organism's neurobiology. One approach for doing so is to compare connectome properties with annotational maps. This type of analysis is popular at the meso-/macroscale, but is less common at the nano-scale, owing to a paucity of neuron-level connectome data. However, recent methodological advances have made possible the reconstruction of whole-brain connectomes at single-neuron resolution for a select set of organisms. These include the fruit fly, Drosophila melanogaster, and its developing larvae. In addition to fine-scale descriptions of connectivity, these datasets are accompanied by rich annotations. Here, we use a variant of the stochastic blockmodel to detect multilevel communities in the larval Drosophila connectome. We find that communities partition neurons based on function and cell type and that most interact assortatively, reflecting the principle of functional segregation. However, a small number of communities interact nonassortatively, forming a "rich-club" of interneurons that receive sensory/ascending inputs and deliver outputs along descending pathways. Next, we investigate the role of community structure in shaping communication patterns. We find that polysynaptic signaling follows specific trajectories across modular hierarchies, with interneurons playing a key role in mediating communication routes between modules and hierarchical scales. Our work suggests a relationship between system-level architecture and the biological function and classification of individual neurons. We envision our study as an important step toward bridging the gap between complex systems and neurobiological lines of investigation in brain sciences.
Affiliation(s)
- Richard Betzel
- Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47401
- Cognitive Science Program, Indiana University, Bloomington, IN 47401
- Program in Neuroscience, Indiana University, Bloomington, IN 47401
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455
- Maria Grazia Puxeddu
- Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47401
- Caio Seguin
- Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47401
6
Ruffle JK, Gray RJ, Mohinta S, Pombo G, Kaul C, Hyare H, Rees G, Nachev P. Computational limits to the legibility of the imaged human brain. Neuroimage 2024; 291:120600. [PMID: 38569979 DOI: 10.1016/j.neuroimage.2024.120600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Revised: 03/08/2024] [Accepted: 03/31/2024] [Indexed: 04/05/2024] Open
Abstract
Our knowledge of the organisation of the human brain at the population-level is yet to translate into power to predict functional differences at the individual-level, limiting clinical applications and casting doubt on the generalisability of inferred mechanisms. It remains unknown whether the difficulty arises from the absence of individuating biological patterns within the brain, or from limited power to access them with the models and compute at our disposal. Here we comprehensively investigate the resolvability of such patterns with data and compute at unprecedented scale. Across 23 810 unique participants from UK Biobank, we systematically evaluate the predictability of 25 individual biological characteristics, from all available combinations of structural and functional neuroimaging data. Over 4526 GPU-hours of computation, we train, optimize, and evaluate out-of-sample 700 individual predictive models, including fully-connected feed-forward neural networks of demographic, psychological, serological, chronic disease, and functional connectivity characteristics, and both uni- and multi-modal 3D convolutional neural network models of macro- and micro-structural brain imaging. We find a marked discrepancy between the high predictability of sex (balanced accuracy 99.7%), age (mean absolute error 2.048 years, R² 0.859), and weight (mean absolute error 2.609 kg, R² 0.625), for which we set new state-of-the-art performance, and the surprisingly low predictability of other characteristics. Neither structural nor functional imaging predicted an individual's psychology better than the coincidence of common chronic disease (p < 0.05). Serology predicted chronic disease (p < 0.05) and was best predicted by it (p < 0.001), followed by structural neuroimaging (p < 0.05). Our findings suggest either more informative imaging or more powerful models will be needed to decipher individual level characteristics from the human brain. We make our models and code openly available.
Affiliation(s)
- James K Ruffle
- Queen Square Institute of Neurology, University College London, London, United Kingdom
- Robert J Gray
- Queen Square Institute of Neurology, University College London, London, United Kingdom
- Samia Mohinta
- Queen Square Institute of Neurology, University College London, London, United Kingdom
- Guilherme Pombo
- Queen Square Institute of Neurology, University College London, London, United Kingdom
- Chaitanya Kaul
- School of Computing Science, University of Glasgow, Glasgow, United Kingdom
- Harpreet Hyare
- Queen Square Institute of Neurology, University College London, London, United Kingdom
- Geraint Rees
- Queen Square Institute of Neurology, University College London, London, United Kingdom
- Parashkev Nachev
- Queen Square Institute of Neurology, University College London, London, United Kingdom
7
Guan J, Chen B, Huang X. Community Detection via Autoencoder-Like Nonnegative Tensor Decomposition. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:4179-4191. [PMID: 36170387 DOI: 10.1109/tnnls.2022.3201906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Community detection aims at partitioning a network into several densely connected subgraphs. Recently, nonnegative matrix factorization (NMF) has been widely adopted in many successful community detection applications. However, most existing NMF-based community detection algorithms neglect the multihop network topology and the extreme sparsity of adjacency matrices. To resolve these issues, we propose the novel concept of an adjacency tensor, which extends the adjacency matrix to multihop cases. Then, we develop a novel tensor Tucker decomposition-based community detection method, autoencoder-like nonnegative tensor decomposition (ANTD), leveraging the constructed adjacency tensor. Distinct from simply applying tensor decomposition on the constructed adjacency tensor, which only works as a decoder, ANTD also introduces an encoder component to constitute an autoencoder-like architecture, which can further enhance the quality of the detected communities. We also develop an efficient alternating updating algorithm with a convergence guarantee to optimize ANTD, and theoretically analyze the algorithm's complexity. Moreover, we study a graph regularized variant of ANTD. Extensive experiments on real-world benchmark networks, comparing against 27 state-of-the-art methods, validate the effectiveness, efficiency, and robustness of our proposed methods.
8
Ochi M, Kawamoto T. Finding community structure using the ordered random graph model. Phys Rev E 2023; 108:014303. [PMID: 37583142 DOI: 10.1103/physreve.108.014303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Accepted: 06/07/2023] [Indexed: 08/17/2023]
Abstract
Visualization of the adjacency matrix enables us to capture macroscopic features of a network when the matrix elements are aligned properly. Community structure, a network consisting of several densely connected components, is a particularly important feature and the structure can be identified through the adjacency matrix when it is close to a block-diagonal form. However, classical ordering algorithms for matrices fail to align matrix elements such that the community structure is visible. In this study, we propose an ordering algorithm based on the maximum-likelihood estimate of the ordered random graph model. We show that the proposed method allows us to more clearly identify community structures than the existing ordering algorithms.
Affiliation(s)
- Masaki Ochi
- Department of Physics, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8574, Japan
- Tatsuro Kawamoto
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology, 2-3-26 Aomi, Koto-ku, Tokyo 135-0064, Japan
9
Frąszczak D. Detecting rumor outbreaks in online social networks. Social Network Analysis and Mining 2023; 13:91. [PMID: 37274600 PMCID: PMC10233536 DOI: 10.1007/s13278-023-01092-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2022] [Revised: 03/29/2023] [Accepted: 05/03/2023] [Indexed: 06/06/2023]
Abstract
Social media platforms are used by billions of people worldwide to exchange information. Each day people share updates and opinions on a wide variety of topics. Politicians also use these platforms to share their postulates and programs, shops to advertise their products, etc. Social media are so popular nowadays largely because they offer quick, accessible, and always-available Internet communication. These conditions make it easy to spread information from one user to another, both within close neighborhoods and across the whole social network hosted on a given platform. Unfortunately, social media have recently been increasingly used for malicious purposes, e.g., rumor propagation. In most cases, the process starts from multiple nodes (users). There are numerous papers about detecting the real source when there is only one initiator, but there is a lack of solutions dedicated to problems with multiple sources. Most solutions that address multiple sources need the exact number of origins to detect them correctly, which is impossible to obtain in real-life usage. This paper analyzes methods for detecting rumor outbreaks in online social networks that can be used as an initial guess for the number of real propagation initiators.
10
Vegué M, Thibeault V, Desrosiers P, Allard A. Dimension reduction of dynamics on modular and heterogeneous directed networks. PNAS Nexus 2023; 2:pgad150. [PMID: 37215634 PMCID: PMC10198746 DOI: 10.1093/pnasnexus/pgad150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Revised: 02/17/2023] [Accepted: 04/12/2023] [Indexed: 05/24/2023]
Abstract
Dimension reduction is a common strategy to study nonlinear dynamical systems composed of a large number of variables. The goal is to find a smaller version of the system whose time evolution is easier to predict while preserving some of the key dynamical features of the original system. Finding such a reduced representation for complex systems is, however, a difficult task. We address this problem for dynamics on weighted directed networks, with special emphasis on modular and heterogeneous networks. We propose a two-step dimension-reduction method that takes into account the properties of the adjacency matrix. First, units are partitioned into groups of similar connectivity profiles. Each group is associated with an observable that is a weighted average of the nodes' activities within the group. Second, we derive a set of equations that must be fulfilled for these observables to properly represent the original system's behavior, together with a method for approximately solving them. The result is a reduced adjacency matrix and an approximate system of ODEs for the observables' evolution. We show that the reduced system can be used to predict some characteristic features of the complete dynamics for different types of connectivity structures, both synthetic and derived from real data, including neuronal, ecological, and social networks. Our formalism opens a way to a systematic comparison of the effect of various structural properties on the overall network dynamics. It can thus help to identify the main structural driving forces guiding the evolution of dynamical processes on networks.
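The first step described in this abstract, where each group is summarized by a weighted average of its nodes' activities, can be sketched as follows. The weights here are arbitrary placeholders; in the paper they follow from conditions the abstract only alludes to, so this block illustrates the averaging step alone. The function name `group_observables` and the sample values are assumptions made for this example.

```python
def group_observables(activities, groups, weights):
    """Weighted average of node activities within each group.

    activities: activity value of each node.
    groups: group label of each node (same order as activities).
    weights: nonnegative weight of each node (placeholder values here).
    """
    totals, norms = {}, {}
    for x, g, w in zip(activities, groups, weights):
        totals[g] = totals.get(g, 0.0) + w * x
        norms[g] = norms.get(g, 0.0) + w
    # Normalize so that each observable is a proper weighted average.
    return {g: totals[g] / norms[g] for g in totals}

# Three nodes in two groups; node weights are illustrative only.
observables = group_observables(
    activities=[1.0, 3.0, 10.0],
    groups=["a", "a", "b"],
    weights=[1.0, 3.0, 2.0],
)
```

With these inputs the three-node system is reduced to two observables, one per group.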
Affiliation(s)
- Marina Vegué
- Département de physique, de génie physique et d'optique, Université Laval, 2325 rue de l'Université, G1V 0A6 Québec, Canada
- Centre interdisciplinaire en modélisation mathématique, Université Laval, 2325 rue de l'Université, G1V 0A6 Québec, Canada
- Vincent Thibeault
- Département de physique, de génie physique et d'optique, Université Laval, 2325 rue de l'Université, G1V 0A6 Québec, Canada
- Centre interdisciplinaire en modélisation mathématique, Université Laval, 2325 rue de l'Université, G1V 0A6 Québec, Canada
- Patrick Desrosiers
- Département de physique, de génie physique et d'optique, Université Laval, 2325 rue de l'Université, G1V 0A6 Québec, Canada
- Centre interdisciplinaire en modélisation mathématique, Université Laval, 2325 rue de l'Université, G1V 0A6 Québec, Canada
- CERVO Brain Research Center, 2301 avenue d'Estimauville, G1E 1T2 Québec, Canada
- Antoine Allard
- Département de physique, de génie physique et d'optique, Université Laval, 2325 rue de l'Université, G1V 0A6 Québec, Canada
- Centre interdisciplinaire en modélisation mathématique, Université Laval, 2325 rue de l'Université, G1V 0A6 Québec, Canada
11
Pedigo BD, Powell M, Bridgeford EW, Winding M, Priebe CE, Vogelstein JT. Generative network modeling reveals quantitative definitions of bilateral symmetry exhibited by a whole insect brain connectome. eLife 2023; 12:e83739. [PMID: 36976249 PMCID: PMC10115445 DOI: 10.7554/elife.83739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Accepted: 03/27/2023] [Indexed: 03/29/2023] Open
Abstract
Comparing connectomes can help explain how neural connectivity is related to genetics, disease, development, learning, and behavior. However, making statistical inferences about the significance and nature of differences between two networks is an open problem, and such analysis has not been extensively applied to nanoscale connectomes. Here, we investigate this problem via a case study on the bilateral symmetry of a larval Drosophila brain connectome. We translate notions of 'bilateral symmetry' to generative models of the network structure of the left and right hemispheres, allowing us to test and refine our understanding of symmetry. We find significant differences in connection probabilities both across the entire left and right networks and between specific cell types. By rescaling connection probabilities or removing certain edges based on weight, we also present adjusted definitions of bilateral symmetry exhibited by this connectome. This work shows how statistical inferences from networks can inform the study of connectomes, facilitating future comparisons of neural structures.
Affiliation(s)
- Benjamin D Pedigo
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, United States
- Mike Powell
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, United States
- Eric W Bridgeford
- Department of Biostatistics, Johns Hopkins University, Baltimore, United States
- Michael Winding
- Department of Zoology, University of Cambridge, Cambridge, United Kingdom
- Neurobiology Division, MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, United States
- Carey E Priebe
- Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, United States
- Joshua T Vogelstein
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, United States
12
Hohmann M, Devriendt K, Coscia M. Quantifying ideological polarization on a network using generalized Euclidean distance. Science Advances 2023; 9:eabq2044. [PMID: 36857460 PMCID: PMC9977176 DOI: 10.1126/sciadv.abq2044] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/24/2022] [Accepted: 01/31/2023] [Indexed: 06/18/2023]
Abstract
An intensely debated topic is whether political polarization on social media is on the rise. We can investigate this question only if we can quantify polarization, by taking into account how extreme the opinions of the people are, how much they organize into echo chambers, and how these echo chambers organize in the network. Current polarization estimates are insensitive to at least one of these factors: They cannot conclusively clarify the opening question. Here, we propose a measure of ideological polarization that can capture the factors we listed. The measure is based on the generalized Euclidean distance, which estimates the distance between two vectors on a network, e.g., representing people's opinion. This measure can fill the methodological gap left by the state of the art and leads to useful insights when applied to real-world debates happening on social media and to data from the U.S. Congress.
Affiliation(s)
- Marilena Hohmann
- Copenhagen Center for Social Data Science, University of Copenhagen, Øster Farimagsgade 5, Copenhagen, Denmark
- Karel Devriendt
- Mathematical Institute, University of Oxford, Woodstock Road, Oxford, UK
- Alan Turing Institute, Euston Road 96, London, UK
- Michele Coscia
- CS Department, IT University of Copenhagen, Rued Langgaards Vej 7, Copenhagen, Denmark
13
Plana F, Pérez J, Abeliuk A. Modularity of food-sharing networks minimises the risk for individual and group starvation in hunter-gatherer societies. PLoS One 2023; 18:e0272733. [PMID: 37163503 PMCID: PMC10171659 DOI: 10.1371/journal.pone.0272733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2022] [Accepted: 10/08/2022] [Indexed: 05/12/2023] Open
Abstract
It has been argued that hunter-gatherers' food-sharing may have provided the basis for a whole range of social interactions, and hence its study may provide important insight into the evolutionary origin of human sociality. Motivated by this observation, we propose a simple network optimization model inspired by a food-sharing dynamic that can recover some empirical patterns found in social networks. We focus on two of the main food-sharing drivers discussed by the anthropological literature: the reduction of individual starvation risk and the care for the group welfare or egalitarian access to food shares, and show that networks optimizing both criteria may exhibit a community structure of highly-cohesive groups around special agents that we call hunters, those who inject food into the system. These communities appear under conditions of uncertainty and scarcity in the food supply, which suggests their adaptive value in this context. We have additionally obtained that optimal welfare networks resemble social networks found in lab experiments that promote more egalitarian income distribution, and also distinct distributions of reciprocity among hunters and non-hunters, which may be consistent with some empirical reports on how sharing is distributed in waves, first among hunters, and then hunters with their families. These model results are consistent with the view that social networks functionally adaptive for optimal resource use, may have created the environment in which prosocial behaviors evolved. Finally, our model also relies on an original formulation of starvation risk, and it may contribute to a formal framework to proceed in this discussion regarding the principles guiding food-sharing networks.
Affiliation(s)
- Francisco Plana
- Department of Computer Science, Universidad de Chile, Santiago, Chile
- Jorge Pérez
- Millennium Institute Foundational Research on Data, Santiago, Chile
- Andrés Abeliuk
- Department of Computer Science, Universidad de Chile, Santiago, Chile
- National Center for Artificial Intelligence (CENIA), Santiago, Chile
14
Cipolotti L, Ruffle JK, Mole J, Xu T, Hyare H, Shallice T, Chan E, Nachev P. Graph lesion-deficit mapping of fluid intelligence. Brain 2022; 146:167-181. [PMID: 36574957 PMCID: PMC9825598 DOI: 10.1093/brain/awac304] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 02/02/2022] [Revised: 05/27/2022] [Accepted: 08/11/2022] [Indexed: 12/29/2022] Open
Abstract
Fluid intelligence is arguably the defining feature of human cognition. Yet the nature of its relationship with the brain remains a contentious topic. Influential proposals drawing primarily on functional imaging data have implicated 'multiple demand' frontoparietal and more widely distributed cortical networks, but extant lesion-deficit studies with greater causal power are almost all small, methodologically constrained, and inconclusive. The task demands large samples of patients, comprehensive investigation of performance, fine-grained anatomical mapping, and robust lesion-deficit inference, yet to be brought to bear on it. We assessed 165 healthy controls and 227 frontal or non-frontal patients with unilateral brain lesions on the best-established test of fluid intelligence, Raven's Advanced Progressive Matrices, employing an array of lesion-deficit inferential models responsive to the potentially distributed nature of fluid intelligence. Non-parametric Bayesian stochastic block models were used to reveal the community structure of lesion deficit networks, disentangling functional from confounding pathological distributed effects. Impaired performance was confined to patients with frontal lesions [F(2,387) = 18.491; P < 0.001; frontal worse than non-frontal and healthy participants P < 0.01, P < 0.001], more marked on the right than left [F(4,385) = 12.237; P < 0.001; right worse than left and healthy participants P < 0.01, P < 0.001]. Patients with non-frontal lesions were indistinguishable from controls and showed no modulation by laterality. Neither the presence nor the extent of multiple demand network involvement affected performance. Both conventional network-based statistics and non-parametric Bayesian stochastic block modelling heavily implicated the right frontal lobe.
Crucially, this localization was confirmed on explicitly disentangling functional from pathology-driven effects within a layered stochastic block model, prominently highlighting a right frontal network involving middle and inferior frontal gyrus, pre- and post-central gyri, with a weak contribution from right superior parietal lobule. Similar results were obtained with standard lesion-deficit analyses. Our study represents the first large-scale investigation of the distributed neural substrates of fluid intelligence in the focally injured brain. Combining novel graph-based lesion-deficit mapping with detailed investigation of cognitive performance in a large sample of patients provides crucial information about the neural basis of intelligence. Our findings indicate that a set of predominantly right frontal regions, rather than a more widely distributed network, is critical to the high-level functions involved in fluid intelligence. Further they suggest that Raven's Advanced Progressive Matrices is a useful clinical index of fluid intelligence and a sensitive marker of right frontal lobe dysfunction.
Affiliation(s)
- Lisa Cipolotti
- Correspondence to: Prof. Lisa Cipolotti, Department of Neuropsychology, National Hospital for Neurology and Neurosurgery, Queen Square, London WC1N 3BG, UK
| | - James K Ruffle
- Institute of Neurology, University College London, London WC1N 3BG, UK; Department of Radiology, University College London Hospitals NHS Foundation Trust, London NW1 2PG, UK
| | - Joe Mole
- Department of Neuropsychology, National Hospital for Neurology and Neurosurgery, London WC1N 3BG, UK; Institute of Neurology, University College London, London WC1N 3BG, UK
| | - Tianbo Xu
- Institute of Neurology, University College London, London WC1N 3BG, UK
| | - Harpreet Hyare
- Institute of Neurology, University College London, London WC1N 3BG, UK; Department of Radiology, University College London Hospitals NHS Foundation Trust, London NW1 2PG, UK
| | - Tim Shallice
- Institute of Cognitive Neuroscience, University College London, London WC1N 3AZ, UK; Cognitive Neuropsychology and Neuroimaging Lab, International School for Advanced Studies (SISSA-ISAS), 34136 Trieste, Italy
| | - Edgar Chan
- Department of Neuropsychology, National Hospital for Neurology and Neurosurgery, London WC1N 3BG, UK; Institute of Neurology, University College London, London WC1N 3BG, UK
| | - Parashkev Nachev
- Institute of Neurology, University College London, London WC1N 3BG, UK
| |
|
15
|
Ramaciotti Morales P, Cointet JP, Muñoz Zolotoochin G, Fernández Peralta A, Iñiguez G, Pournaki A. Inferring attitudinal spaces in social networks. Social Network Analysis and Mining 2022. [DOI: 10.1007/s13278-022-01013-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/29/2022]
|
16
|
"Stealing fire or stacking knowledge" by machine intelligence to model link prediction in complex networks. iScience 2022; 26:105697. [PMID: 36570772 PMCID: PMC9771718 DOI: 10.1016/j.isci.2022.105697] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/15/2022] [Revised: 09/15/2022] [Accepted: 11/22/2022] [Indexed: 12/02/2022] Open
Abstract
Current methodologies to model connectivity in complex networks either rely on network scientists' intelligence to discover reliable physical rules or use artificial intelligence (AI) that stacks hundreds of inaccurate human-made rules to make a new one that optimally summarizes them together. Here, we provide an accurate and reproducible scientific analysis showing that, contrary to the current belief, stacking more good link prediction rules does not necessarily improve the link prediction performance to nearly optimal as suggested by recent studies. Finally, under the light of our novel results, we discuss the pros and cons of each current state-of-the-art link prediction strategy, concluding that none of the current solutions are what the future might hold for us. Future solutions might require the design and development of next generation "creative" AI that are able to generate and understand complex physical rules for us.
|
17
|
Barbarino G, Noferini V, Van Dooren P. Role extraction for digraphs via neighborhood pattern similarity. Phys Rev E 2022; 106:054301. [PMID: 36559511 DOI: 10.1103/physreve.106.054301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Accepted: 10/06/2022] [Indexed: 12/24/2022]
Abstract
We analyze the recovery of different roles in a network modeled by a directed graph, based on the so-called Neighborhood Pattern Similarity approach. Our analysis uses results from random matrix theory to show that, when assuming that the graph is generated as a particular stochastic block model with Bernoulli probability distributions for the different blocks, then the recovery is asymptotically correct when the graph has a sufficiently large dimension. Under these assumptions there is a sufficient gap between the dominant and dominated eigenvalues of the similarity matrix, which guarantees the asymptotic correct identification of the number of different roles. We also comment on the connections with the literature on stochastic block models, including the case of probabilities of order log(n)/n where n is the graph size. We provide numerical experiments to assess the effectiveness of the method when applied to practical networks of finite size.
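As a reading aid for the generative assumption named in this abstract, here is a minimal Python sketch of sampling a directed stochastic block model with Bernoulli blocks. The function name `sample_directed_sbm`, the block assignment, and the probability matrix are illustrative assumptions, not values from the paper, and the sketch does not implement the Neighborhood Pattern Similarity method itself.

```python
import random


def sample_directed_sbm(block_of, probs, seed=0):
    """Sample a directed graph as an adjacency matrix.

    block_of[i] is the block (role) of node i; the edge i -> j is drawn
    independently as Bernoulli(probs[block_of[i]][block_of[j]]).
    Self-loops are excluded (an assumption of this sketch).
    """
    rng = random.Random(seed)
    n = len(block_of)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and rng.random() < probs[block_of[i]][block_of[j]]:
                adj[i][j] = 1
    return adj


# Hypothetical example: two roles, where role 0 mostly points to role 1.
block_of = [0, 0, 0, 1, 1, 1]
probs = [[0.1, 0.8],
         [0.05, 0.3]]
adj = sample_directed_sbm(block_of, probs, seed=42)
```

Because the probability matrix need not be symmetric, nodes in the same block share a pattern of in- and out-connections rather than dense internal links, which is the sense of "role" used in the abstract.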
Affiliation(s)
- Giovanni Barbarino
- Aalto University, Department of Mathematics and Systems Analysis, P.O. Box 11100, FI-00076 Aalto, Finland
| | - Vanni Noferini
- Aalto University, Department of Mathematics and Systems Analysis, P.O. Box 11100, FI-00076 Aalto, Finland
| | - Paul Van Dooren
- Université catholique de Louvain, Department of Mathematical Engineering, Av. Lemaitre 4, B-1348 Louvain-la-Neuve, Belgium
| |
|
18
|
Network-based integration of multi-omics data for clinical outcome prediction in neuroblastoma. Sci Rep 2022; 12:15425. [PMID: 36104347 PMCID: PMC9475034 DOI: 10.1038/s41598-022-19019-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 01/28/2022] [Accepted: 08/23/2022] [Indexed: 11/08/2022] Open
Abstract
Multi-omics data are increasingly being gathered for investigations of complex diseases such as cancer. However, high dimensionality, small sample size, and heterogeneity of different omics types pose huge challenges to integrated analysis. In this paper, we evaluate two network-based approaches for integration of multi-omics data in an application of clinical outcome prediction of neuroblastoma. We derive Patient Similarity Networks (PSN) as the first step for individual omics data by computing distances among patients from omics features. The fusion of different omics can be investigated in two ways: the network-level fusion is achieved using the Similarity Network Fusion algorithm for fusing the PSNs derived for individual omics types; and the feature-level fusion is achieved by fusing the network features obtained from individual PSNs. We demonstrate our methods on two high-risk neuroblastoma datasets from the SEQC project and the TARGET project. We propose Deep Neural Network and Machine Learning methods with Recursive Feature Elimination as the predictor of survival status of neuroblastoma patients. Our results indicate that network-level fusion outperformed feature-level fusion for integration of different omics data, whereas feature-level fusion is more suitable for incorporating different feature types derived from the same omics type. We conclude that the network-based methods are capable of handling heterogeneity and high dimensionality well in the integration of multi-omics.
|
19
|
Kovács L, Bóta A, Hajdu L, Krész M. Brands, networks, communities: How brand names are wired in the mind. PLoS One 2022; 17:e0273192. [PMID: 36006965 PMCID: PMC9409517 DOI: 10.1371/journal.pone.0273192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2021] [Accepted: 08/03/2022] [Indexed: 11/18/2022] Open
Abstract
Brands can be defined as psychological constructs residing in our minds. By analyzing brand associations, we can study the mental constructs around them. In this paper, we study brands as parts of an associative network based on a word association database. We explore the communities (closely-knit groups in the mind) around brand names in this structure using two community detection algorithms in the Hungarian word association database ConnectYourMind. We identify brand names inside the communities of a word association network and explain why these brand names are part of the community. Several detected communities contain brand names from the same product category, and the words in these categories were connected either to brands in the category or to words describing the product category. Based on our findings, we describe the mental position of brand names. We show that brand knowledge, product knowledge and real word knowledge interact with each other. We also show how the meaning of a product category arises and how this meaning is related to brand meaning. Our results suggest that words sharing the same community with brand names can be used in brand communication and brand positioning.
Affiliation(s)
- László Kovács
- Savaria Department of Business Administration, Faculty of Social Sciences, Eötvös Loránd University, Szombathely, Hungary
| | - András Bóta
- Department of Computer Science, Electrical and Space Engineering, Embedded Intelligent Systems Lab, Luleå University of Technology, Luleå, Sweden
| | - László Hajdu
- Innorenew CoE, Izola, Slovenia
- Faculty of Mathematics, Natural Sciences and Information Technologies, University of Primorska, Koper, Slovenia
- Gyula Juhász Faculty of Education, University of Szeged, Szeged, Hungary
| | - Miklós Krész
- Innorenew CoE, Izola, Slovenia
- Andrej Marušič Institute, University of Primorska, Koper, Slovenia
- Gyula Juhász Faculty of Education, University of Szeged, Szeged, Hungary
| |
|
20
|
Peixoto TP. Ordered community detection in directed networks. Phys Rev E 2022; 106:024305. [PMID: 36109944 DOI: 10.1103/physreve.106.024305] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/13/2022] [Accepted: 08/02/2022] [Indexed: 06/15/2023]
Abstract
We develop a method to infer community structure in directed networks where the groups are ordered in a latent one-dimensional hierarchy that determines the preferred edge direction. Our nonparametric Bayesian approach is based on a modification of the stochastic block model (SBM), which can take advantage of rank alignment and coherence to produce parsimonious descriptions of networks that combine ordered hierarchies with arbitrary mixing patterns between groups. Since our model also includes directed degree correction, we can use it to distinguish nonlocal hierarchical structure from local in- and out-degree imbalance, thus removing a source of conflation present in most ranking methods. We also demonstrate how we can reliably compare with the results obtained with the unordered SBM variant to determine whether a hierarchical ordering is statistically warranted in the first place. We illustrate the application of our method on a wide variety of empirical networks across several domains.
Affiliation(s)
- Tiago P Peixoto
- Department of Network and Data Science, Central European University, 1100 Vienna, Austria
| |
|
21
|
Vaca-Ramírez F, Peixoto TP. Systematic assessment of the quality of fit of the stochastic block model for empirical networks. Phys Rev E 2022; 105:054311. [PMID: 35706168 DOI: 10.1103/physreve.105.054311] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/06/2022] [Accepted: 04/19/2022] [Indexed: 06/15/2023]
Abstract
We perform a systematic analysis of the quality of fit of the stochastic block model (SBM) for 275 empirical networks spanning a wide range of domains and orders of size magnitude. We employ posterior predictive model checking as a criterion to assess the quality of fit, which involves comparing networks generated by the inferred model with the empirical network, according to a set of network descriptors. We observe that the SBM is capable of providing an accurate description for the majority of networks considered, but falls short of saturating all modeling requirements. In particular, networks possessing a large diameter and slow-mixing random walks tend to be badly described by the SBM. However, contrary to what is often assumed, networks with a high abundance of triangles can be well described by the SBM in many cases. We demonstrate that simple network descriptors can be used to evaluate whether or not the SBM can provide a sufficiently accurate representation, potentially pointing to possible model extensions that can systematically improve the expressiveness of this class of models.
Affiliation(s)
- Felipe Vaca-Ramírez
- Department of Network and Data Science, Central European University, 1100 Vienna, Austria
| | - Tiago P Peixoto
- Department of Network and Data Science, Central European University, 1100 Vienna, Austria
| |
|
22
|
Autoencoder Model Using Edge Enhancement to Detect Communities in Complex Networks. Arabian Journal for Science and Engineering 2022. [DOI: 10.1007/s13369-022-06747-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
23
|
Valle F, Osella M, Caselle M. Multiomics Topic Modeling for Breast Cancer Classification. Cancers (Basel) 2022; 14:1150. [PMID: 35267458 PMCID: PMC8909787 DOI: 10.3390/cancers14051150] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/11/2022] [Accepted: 02/18/2022] [Indexed: 12/04/2022] Open
Abstract
The integration of transcriptional data with other layers of information, such as the post-transcriptional regulation mediated by microRNAs, can be crucial to identify the driver genes and the subtypes of complex and heterogeneous diseases such as cancer. This paper presents an approach based on topic modeling to accomplish this integration task. More specifically, we show how an algorithm based on a hierarchical version of stochastic block modeling can be naturally extended to integrate any combination of 'omics data. We test this approach on breast cancer samples from the TCGA database, integrating data on messenger RNA, microRNAs, and copy number variations. We show that the inclusion of the microRNA layer significantly improves the accuracy of subtype classification. Moreover, some of the hidden structures or "topics" that the algorithm extracts actually correspond to genes and microRNAs involved in breast cancer development and are associated to the survival probability.
Affiliation(s)
- Filippo Valle
- Physics Department, University of Turin and INFN, via P. Giuria 1, 10125 Turin, Italy
| | | | | |
|
24
|
Suriyalaksh M, Raimondi C, Mains A, Segonds-Pichon A, Mukhtar S, Murdoch S, Aldunate R, Krueger F, Guimerà R, Andrews S, Sales-Pardo M, Casanueva O. Gene regulatory network inference in long-lived C. elegans reveals modular properties that are predictive of novel aging genes. iScience 2022; 25:103663. [PMID: 35036864 PMCID: PMC8753122 DOI: 10.1016/j.isci.2021.103663] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/23/2021] [Revised: 09/09/2021] [Accepted: 12/15/2021] [Indexed: 11/24/2022] Open
Abstract
We design a “wisdom-of-the-crowds” GRN inference pipeline and couple it to complex network analysis to understand the organizational principles governing gene regulation in long-lived glp-1/Notch Caenorhabditis elegans. The GRN has three layers (input, core, and output) and is topologically equivalent to bow-tie/hourglass structures prevalent among metabolic networks. To assess the functional importance of structural layers, we screened 80% of regulators and discovered 50 new aging genes, 86% with human orthologues. Genes essential for longevity, including ones involved in insulin-like signaling (ILS), are at the core, indicating that GRN's structure is predictive of functionality. We used in vivo reporters and a novel functional network covering 5,497 genetic interactions to make mechanistic predictions. We used genetic epistasis to test some of these predictions, uncovering a novel transcriptional regulator, sup-37, that works alongside DAF-16/FOXO. We present a framework with predictive power that can accelerate discovery in C. elegans and potentially humans.
- Gene-regulatory inference provides global network of long-lived animals
- The large-scale topology of the network has an hourglass structure
- Membership to the core of the hourglass is a good predictor of functionality
- Discovered 50 novel aging genes, including sup-37, a DAF-16 dependent gene
Affiliation(s)
| | | | - Abraham Mains
- Babraham Institute, Babraham, Cambridge CB22 3AT, UK
| | | | | | | | - Rebeca Aldunate
- Escuela de Biotecnología, Facultad de Ciencias, Universidad Santo Tomas, Santiago, Chile
| | - Felix Krueger
- Babraham Institute, Babraham, Cambridge CB22 3AT, UK
| | - Roger Guimerà
- ICREA, Barcelona 08010, Catalonia, Spain; Department of Chemical Engineering, Universitat Rovira i Virgili, Tarragona 43007, Catalonia, Spain
| | - Simon Andrews
- Babraham Institute, Babraham, Cambridge CB22 3AT, UK
| | - Marta Sales-Pardo
- Department of Chemical Engineering, Universitat Rovira i Virgili, Tarragona 43007, Catalonia, Spain
| | | |
|
25
|
Abstract
Complex systems, abstractly represented as networks, are ubiquitous in everyday life. Analyzing and understanding these systems requires, among others, tools for community detection. As no single best community detection algorithm can exist, robustness across a wide variety of problem settings is desirable. In this work, we present Synwalk, a random walk-based community detection method. Synwalk builds upon a solid theoretical basis and detects communities by synthesizing the random walk induced by the given network from a class of candidate random walks. We thoroughly validate the effectiveness of our approach on synthetic and empirical networks, respectively, and compare Synwalk’s performance with the performance of Infomap and Walktrap (also random walk-based), Louvain (based on modularity maximization) and stochastic block model inference. Our results indicate that Synwalk performs robustly on networks with varying mixing parameters and degree distributions. We outperform Infomap on networks with high mixing parameter, and Infomap and Walktrap on networks with many small communities and low average degree. Our work has a potential to inspire further development of community detection via synthesis of random walks and we provide concrete ideas for future research.
|
26
|
Abstract
We propose two new algorithms for clustering graphs and networks. The first, called the K-algorithm, is derived directly from the k-means algorithm. It applies similar iterative local optimization but without the need to calculate the means. It inherits the properties of k-means clustering in terms of both good local optimization capability and the tendency to get stuck at a local optimum. The second algorithm, called the M-algorithm, gradually improves on the results of the K-algorithm to find new and potentially better local optima. It repeatedly merges and splits random clusters and tunes the results with the K-algorithm. Both algorithms are general in the sense that they can be used with different cost functions. We consider the conductance cost function and also introduce two new cost functions, called inverse internal weight and mean internal weight. According to our experiments, the M-algorithm outperforms eight other state-of-the-art methods. We also perform a case study by analyzing clustering results of a disease co-occurrence network, which demonstrate the usefulness of the algorithms in an important real-life application.
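To make the conductance cost mentioned in this abstract concrete, the following is a small Python sketch of the textbook conductance of a single cluster: the weight of edges leaving the cluster divided by the smaller of the two side volumes. This is the standard definition, assumed here for illustration; the paper's exact cost function (for example, how it is aggregated over clusters) may differ, and the function name and toy graph are hypothetical.

```python
def conductance(edges, cluster):
    """Conductance of `cluster` in an undirected weighted graph.

    edges: iterable of (u, v, weight) tuples, each undirected edge listed once.
    Returns cut(S, V \\ S) / min(vol(S), vol(V \\ S)), or 0.0 if either side
    has zero volume.
    """
    cluster = set(cluster)
    cut = 0.0
    vol_in = 0.0   # total weighted degree of nodes inside the cluster
    vol_out = 0.0  # total weighted degree of nodes outside the cluster
    for u, v, w in edges:
        for node in (u, v):
            if node in cluster:
                vol_in += w
            else:
                vol_out += w
        if (u in cluster) != (v in cluster):
            cut += w
    denom = min(vol_in, vol_out)
    return cut / denom if denom > 0 else 0.0


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
         (2, 3, 1.0)]
score = conductance(edges, {0, 1, 2})  # one cut edge, volume 7 on each side
```

Lower values indicate a better-separated cluster, so a local search of the kind the abstract describes would move nodes between clusters whenever the move reduces the chosen cost.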
|
27
|
Morelli L, Giansanti V, Cittaro D. Nested Stochastic Block Models applied to the analysis of single cell data. BMC Bioinformatics 2021; 22:576. [PMID: 34847879 PMCID: PMC8630903 DOI: 10.1186/s12859-021-04489-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/12/2021] [Accepted: 11/19/2021] [Indexed: 12/30/2022] Open
Abstract
Single cell profiling has proven to be a powerful tool in molecular biology for understanding the complex behaviours of heterogeneous systems. The definition of the properties of single cells is the primary endpoint of such analysis; cells are typically clustered to underpin the common determinants that can be used to describe functional properties of the cell mixture under investigation. Several approaches have been proposed to identify cell clusters; while this is a matter of active research, one popular approach is based on community detection in neighbourhood graphs by optimisation of modularity. In this paper we propose an alternative and principled solution to this problem, based on Stochastic Block Models. We show that such an approach is not only suitable for identification of cell groups, but also provides a solid framework to perform other relevant tasks in single cell analysis, such as label transfer. To encourage the use of Stochastic Block Models, we developed a Python library, schist, that is compatible with the popular scanpy framework.
Affiliation(s)
- Leonardo Morelli
- Center for Omics Sciences, IRCCS San Raffaele Institute, Milan, Italy
- Università Vita-Salute San Raffaele, Milan, Italy
| | - Valentina Giansanti
- Center for Omics Sciences, IRCCS San Raffaele Institute, Milan, Italy
- Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milan, Italy
| | - Davide Cittaro
- Center for Omics Sciences, IRCCS San Raffaele Institute, Milan, Italy.
| |
|
28
|
Dolan H, Rastelli R. A Model-Based Approach to Assess Epidemic Risk. Statistics in Biosciences 2021; 14:452-484. [PMID: 34804245 PMCID: PMC8591322 DOI: 10.1007/s12561-021-09329-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2020] [Revised: 10/06/2021] [Accepted: 10/20/2021] [Indexed: 11/16/2022]
Abstract
We study how international flights can facilitate the spread of an epidemic to a worldwide scale. We combine an infrastructure network of flight connections with a population density dataset to derive the mobility network, and then we define an epidemic framework to model the spread of the disease. Our approach combines a compartmental SEIRS model with a graph diffusion model to capture the clusteredness of the distribution of the population. The resulting model is characterised by the dynamics of a metapopulation SEIRS, with amplification or reduction of the infection rate which is determined also by the mobility of individuals. We use simulations to characterise and study a variety of realistic scenarios that resemble the recent spread of COVID-19. Crucially, we define a formal framework that can be used to design epidemic mitigation strategies: we propose an optimisation approach based on genetic algorithms that can be used to identify an optimal airport closure strategy, and that can be employed to aid decision making for the mitigation of the epidemic, in a timely manner.
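The compartmental part of the model described in this abstract can be illustrated with a minimal single-population SEIRS sketch in Python, using a forward-Euler update. It is only a sketch under stated assumptions: the rate names (`beta`, `sigma`, `gamma`, `omega`), their values, and the time step are hypothetical, and the mobility network and metapopulation coupling from the paper are not modelled.

```python
def seirs_step(state, beta, sigma, gamma, omega, dt):
    """Advance (S, E, I, R) by one forward-Euler step of length dt.

    beta: transmission rate, sigma: rate E -> I, gamma: rate I -> R,
    omega: rate of immunity loss R -> S.
    """
    s, e, i, r = state
    n = s + e + i + r
    new_exposed = beta * s * i / n * dt
    new_infectious = sigma * e * dt
    new_recovered = gamma * i * dt
    new_susceptible = omega * r * dt
    return (
        s - new_exposed + new_susceptible,
        e + new_exposed - new_infectious,
        i + new_infectious - new_recovered,
        r + new_recovered - new_susceptible,
    )


def simulate(state, steps, **params):
    """Apply seirs_step repeatedly and return the final state."""
    for _ in range(steps):
        state = seirs_step(state, **params)
    return state


# Hypothetical parameters: 1000 people, 10 initially infectious.
final = simulate((990.0, 0.0, 10.0, 0.0), steps=1000,
                 beta=0.5, sigma=0.2, gamma=0.1, omega=0.01, dt=0.1)
```

Every flow leaves one compartment and enters another, so the total population is conserved; in a metapopulation setting one such state would be kept per location, with the infection term modified by the movement of individuals between locations.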
Affiliation(s)
- Hugo Dolan
- University College Dublin, Dublin, Ireland
| | | |
|
29
|
Wang J, Li K. Community structure exploration considering latent link patterns in complex networks. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.06.032] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
|
30
|
Mitrai I, Tang W, Daoutidis P. Stochastic blockmodeling for learning the structure of optimization problems. AIChE J 2021. [DOI: 10.1002/aic.17415] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/07/2023]
Affiliation(s)
- Ilias Mitrai
- Department of Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, Minnesota, USA
| | - Wentao Tang
- Projects and Technology, Shell Global Solutions (U.S.) Inc., Houston, Texas, USA
| | - Prodromos Daoutidis
- Department of Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, Minnesota, USA
| |
|
31
|
De Nicola G, Sischka B, Kauermann G. Mixture models and networks: The stochastic blockmodel. Stat Model 2021. [DOI: 10.1177/1471082x211033169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Mixture models are probabilistic models aimed at uncovering and representing latent subgroups within a population. In the realm of network data analysis, the latent subgroups of nodes are typically identified by their connectivity behaviour, with nodes behaving similarly belonging to the same community. In this context, mixture modelling is pursued through stochastic blockmodelling. We consider stochastic blockmodels and some of their variants and extensions from a mixture modelling perspective. We also explore some of the main classes of estimation methods available and propose an alternative approach based on the reformulation of the blockmodel as a graphon. In addition to the discussion of inferential properties and estimating procedures, we focus on the application of the models to several real-world network datasets, showcasing the advantages and pitfalls of different approaches.
Affiliation(s)
- Giacomo De Nicola
- Department of Statistics, Faculty of Mathematics, Informatics and Statistics, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Benjamin Sischka
- Department of Statistics, Faculty of Mathematics, Informatics and Statistics, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Göran Kauermann
- Department of Statistics, Faculty of Mathematics, Informatics and Statistics, Ludwig-Maximilians-Universität München, Munich, Germany
| |
|
32
|
The autonomic brain: Multi-dimensional generative hierarchical modelling of the autonomic connectome. Cortex 2021; 143:164-179. [PMID: 34438298 PMCID: PMC8500219 DOI: 10.1016/j.cortex.2021.06.012] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 11/21/2020] [Revised: 05/11/2021] [Accepted: 06/20/2021] [Indexed: 01/08/2023]
Abstract
The autonomic nervous system governs the body's multifaceted internal adaptation to diverse changes in the external environment, a role more complex than is accessible to the methods (and data scales) hitherto used to illuminate its operation. Here we apply generative graphical modelling to large-scale multimodal neuroimaging data encompassing normal and abnormal states to derive a comprehensive hierarchical representation of the autonomic brain. We demonstrate that whereas conventional structural and functional maps identify regions jointly modulated by parasympathetic and sympathetic systems, only graphical analysis discriminates between them, revealing the cardinal roles of the autonomic system to be mediated by high-level distributed interactions. We provide a novel representation of the autonomic system (a multidimensional, generative network) that renders its richness tractable within future models of its function in health and disease.
|
33
|
Gallagher RJ, Young JG, Welles BF. A clarified typology of core-periphery structure in networks. Science Advances 2021; 7(12):eabc9800. [PMID: 33731343 PMCID: PMC7968838 DOI: 10.1126/sciadv.abc9800] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 05/26/2020] [Accepted: 01/29/2021] [Indexed: 06/12/2023]
Abstract
Core-periphery structure, the arrangement of a network into a dense core and sparse periphery, is a versatile descriptor of various social, biological, and technological networks. In practice, different core-periphery algorithms are often applied interchangeably despite the fact that they can yield inconsistent descriptions of core-periphery structure. For example, two of the most widely used algorithms, the k-cores decomposition and the classic two-block model of Borgatti and Everett, extract fundamentally different structures: The latter partitions a network into a binary hub-and-spoke layout, while the former divides it into a layered hierarchy. We introduce a core-periphery typology to clarify these differences, along with Bayesian stochastic block modeling techniques to classify networks in accordance with this typology. Empirically, we find a rich diversity of core-periphery structure among networks. Through a detailed case study, we demonstrate the importance of acknowledging this diversity and situating networks within the core-periphery typology when conducting domain-specific analyses.
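The layered hierarchy produced by the k-cores decomposition mentioned in this abstract can be sketched in a few lines of Python. This is a simple quadratic-time peeling procedure written for clarity, not the authors' code; the function name `core_numbers` and the toy graph are illustrative assumptions.

```python
def core_numbers(adj):
    """Return the core number of every node of an undirected graph.

    adj maps each node to the set of its neighbours. Nodes are peeled in
    order of smallest remaining degree; a node's core number is the largest
    minimum degree seen up to the moment it is removed.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)  # node with smallest remaining degree
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


# Toy graph: a triangle (0, 1, 2) with a pendant node 3 attached to node 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
cores = core_numbers(adj)
```

In the toy graph the triangle forms the 2-core and the pendant node sits in the outer 1-shell, which is the layered picture the abstract contrasts with the binary hub-and-spoke split of the two-block model.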
Affiliation(s)
- Ryan J Gallagher
- Network Science Institute, Northeastern University, Boston, MA 02115, USA.
| | - Jean-Gabriel Young
- Center for the Study of Complex Systems, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Computer Science, University of Vermont, Burlington, VT 05405, USA
| | - Brooke Foucault Welles
- Network Science Institute, Northeastern University, Boston, MA 02115, USA
- Department of Communication Studies, Northeastern University, Boston, MA 02115, USA
| |
|
34
|
Fonseca Dos Reis E, Viney M, Masuda N. Network analysis of the immune state of mice. Sci Rep 2021; 11:4306. [PMID: 33619299 PMCID: PMC7900184 DOI: 10.1038/s41598-021-83139-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/11/2020] [Accepted: 01/21/2021] [Indexed: 11/09/2022] Open
Abstract
The mammalian immune system protects individuals from infection and disease. It is a complex system of interacting cells and molecules, which has been studied extensively to investigate its detailed function, principally using laboratory mice. Despite the complexity of the immune system, it is often analysed using a restricted set of immunological parameters. Here we have sought to generate a system-wide view of the murine immune response, which we have done by undertaking a network analysis of 120 immune measures. To date, there have been only limited network analyses of the immune system. Our network analysis identified a relatively low number of communities of immune measure nodes. Some of these communities recapitulate the well-known T helper 1 vs. T helper 2 cytokine polarisation (where ordination analyses failed to do so), which validates the utility of our approach. Other communities we detected show apparently novel juxtapositions of immune nodes. We suggest that the structure of these other communities might represent functional immunological units, which may require further empirical investigation. These results show the utility of network analysis in understanding the functioning of the mammalian immune system.
Affiliation(s)
| | - Mark Viney
- Department of Evolution, Ecology and Behaviour, University of Liverpool, Liverpool, L69 7ZB, UK
| | - Naoki Masuda
- Department of Mathematics, State University of New York at Buffalo, Buffalo, 14260, USA; Computational and Data-Enabled Science and Engineering Program, State University of New York at Buffalo, Buffalo, 14260, USA; Faculty of Science and Engineering, Waseda University, Tokyo, 169-8555, Japan
|
35
|
Abstract
Hostile influence operations (IOs) that weaponize digital communications and social media pose a rising threat to open democracies. This paper presents a system framework to automate detection of disinformation narratives, networks, and influential actors. The framework integrates natural language processing, machine learning, graph analytics, and network causal inference to quantify the impact of individual actors in spreading the IO narrative. We present a classifier that detects reported IO accounts with 96% precision, 79% recall, and 96% AUPRC, demonstrated on real social media data collected for the 2017 French presidential election and known IO accounts disclosed by Twitter. Our system also discovers salient network communities and high-impact accounts that are independently corroborated by US Congressional reports and investigative journalism. The weaponization of digital communications and social media to conduct disinformation campaigns at immense scale, speed, and reach presents new challenges to identify and counter hostile influence operations (IOs). This paper presents an end-to-end framework to automate detection of disinformation narratives, networks, and influential actors. The framework integrates natural language processing, machine learning, graph analytics, and a network causal inference approach to quantify the impact of individual actors in spreading IO narratives. We demonstrate its capability on real-world hostile IO campaigns with Twitter datasets collected during the 2017 French presidential elections and known IO accounts disclosed by Twitter over a broad range of IO campaigns (May 2007 to February 2020), over 50,000 accounts, 17 countries, and different account types including both trolls and bots. 
Our system detects IO accounts with 96% precision, 79% recall, and 96% area under the precision-recall (P-R) curve; maps out salient network communities; and discovers high-impact accounts that escape the lens of traditional impact statistics based on activity counts and network centrality. Results are corroborated with independent sources of known IO accounts from US Congressional reports, investigative journalism, and IO datasets provided by Twitter.
|
36
|
Valle F, Osella M, Caselle M. A Topic Modeling Analysis of TCGA Breast and Lung Cancer Transcriptomic Data. Cancers (Basel) 2020; 12:E3799. [PMID: 33339347 PMCID: PMC7766023 DOI: 10.3390/cancers12123799] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/19/2020] [Revised: 12/07/2020] [Accepted: 12/11/2020] [Indexed: 01/18/2023] Open
Abstract
Topic modeling is a widely used technique to extract relevant information from large arrays of data. The problem of finding a topic structure in a dataset was recently recognized to be analogous to the community detection problem in network theory. Leveraging this analogy, a new class of topic modeling strategies has been introduced to overcome some of the limitations of classical methods. This paper applies these recent ideas to TCGA transcriptomic data on breast and lung cancer. The established cancer subtype organization is well reconstructed in the inferred latent topic structure. Moreover, we identify specific topics that are enriched in genes known to play a role in the corresponding disease and are strongly related to the survival probability of patients. Finally, we show that a simple neural network classifier operating in the low-dimensional topic space is able to predict with high accuracy the cancer subtype of a test expression sample.
Affiliation(s)
- Filippo Valle
- Physics Department, University of Turin and INFN, via P. Giuria 1, 10125 Turin, Italy; (M.O.); (M.C.)
|
37
|
Alves LGA, Sigaki HYD, Perc M, Ribeiro HV. Collective dynamics of stock market efficiency. Sci Rep 2020; 10:21992. [PMID: 33319788 PMCID: PMC7738547 DOI: 10.1038/s41598-020-78707-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 09/04/2020] [Accepted: 11/30/2020] [Indexed: 11/09/2022] Open
Abstract
Summarized by the efficient market hypothesis, the idea that stock prices fully reflect all available information is always confronted with the behavior of real-world markets. While there is plenty of evidence indicating and quantifying the efficiency of stock markets, most studies assume this efficiency to be constant over time so that its dynamical and collective aspects remain poorly understood. Here we define the time-varying efficiency of stock markets by calculating the permutation entropy within sliding time-windows of log-returns of stock market indices. We show that major world stock markets can be hierarchically classified into several groups that display similar long-term efficiency profiles. However, we also show that efficiency ranks and clusters of markets with similar trends are only stable for a few months at a time. We thus propose a network representation of stock markets that aggregates their short-term efficiency patterns into a global and coherent picture. We find this financial network to be strongly entangled while also having a modular structure that consists of two distinct groups of stock markets. Our results suggest that stock market efficiency is a collective phenomenon that can drive its operation at a high level of informational efficiency, but also places the entire system under risk of failure.
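The time-varying efficiency measure described in this abstract (permutation entropy within sliding windows of log-returns) can be illustrated with a minimal sketch. The function names, the embedding order of 3, and the window length of 50 are assumptions chosen for illustration; the abstract does not state the parameters the authors used, and this is not their code.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3):
    """Normalized permutation entropy: 0 = fully ordered, 1 = maximally random."""
    n_patterns = len(series) - order + 1
    if n_patterns < 1:
        raise ValueError("series is shorter than the embedding order")
    # The ordinal pattern of each length-`order` window is the argsort of its values.
    patterns = Counter(
        tuple(sorted(range(order), key=lambda k: series[i + k]))
        for i in range(n_patterns)
    )
    probs = [count / n_patterns for count in patterns.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    # Divide by log(order!) so the result lies in [0, 1].
    return entropy / math.log(math.factorial(order))


def log_returns(prices):
    """Log-returns of a sequence of strictly positive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def sliding_efficiency(prices, window=50, order=3):
    """Permutation entropy of the log-returns inside each sliding window."""
    returns = log_returns(prices)
    return [
        permutation_entropy(returns[start:start + window], order)
        for start in range(len(returns) - window + 1)
    ]
```

Values close to 1 would correspond to return sequences that look random at the chosen order, which is the sense in which the abstract speaks of informational efficiency.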
Affiliation(s)
- Luiz G A Alves
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, 60208, USA
| | - Higor Y D Sigaki
- Departamento de Física, Universidade Estadual de Maringá, Maringá, PR, 87020-900, Brazil
| | - Matjaž Perc
- Faculty of Natural Sciences and Mathematics, University of Maribor, Koroška cesta 160, 2000, Maribor, Slovenia; Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan; Complexity Science Hub Vienna, Josefstädterstraße 39, 1080, Vienna, Austria
| | - Haroldo V Ribeiro
- Departamento de Física, Universidade Estadual de Maringá, Maringá, PR, 87020-900, Brazil
|
38
|
Schawe H, Hartmann AK. Large deviations of connected components in the stochastic block model. Phys Rev E 2020; 102:052108. [PMID: 33327148 DOI: 10.1103/physreve.102.052108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2020] [Accepted: 10/19/2020] [Indexed: 06/12/2023]
Abstract
We study the stochastic block model, which is often used to model community structures and study community-detection algorithms. We consider the case of two blocks in regard to its largest connected component and largest biconnected component, respectively. We are especially interested in the distributions of their sizes including the tails down to probabilities smaller than $10^{-800}$. For this purpose we use sophisticated Markov chain Monte Carlo simulations to sample graphs from the stochastic block model ensemble. We use these data to study the large-deviation rate function and conjecture that the large-deviation principle holds. Further we compare the distribution to the well-known Erdős-Rényi ensemble, where we notice subtle differences at and above the percolation threshold.
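As a rough illustration of the objects studied here, the sketch below draws one graph from a two-block stochastic block model and measures its largest connected component. It uses plain direct sampling, which only reaches typical graphs; it is not the large-deviation Markov chain Monte Carlo scheme the abstract refers to. The function names and the parameters `p_in` and `p_out` (within-block and between-block edge probabilities) are assumptions made for this example.

```python
import random


def sample_two_block_sbm(n, p_in, p_out, seed=None):
    """Sample an undirected two-block SBM graph as an adjacency dict."""
    rng = random.Random(seed)
    # The first half of the nodes forms block 0, the rest block 1.
    block = [0 if v < n // 2 else 1 for v in range(n)]
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if block[u] == block[v] else p_out
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def largest_component_size(adj):
    """Size of the largest connected component, via iterative depth-first search."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best
```

Repeating the sampling for many seeds gives a histogram of component sizes near its peak; probing the far tails mentioned in the abstract requires biased sampling techniques that are beyond this sketch.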
Affiliation(s)
- Hendrik Schawe
- Laboratoire de Physique Théorique et Modélisation, UMR-8089 CNRS, CY Cergy Paris Université, 95000 Cergy, France
- Institut für Physik, Universität Oldenburg, 26111 Oldenburg, Germany
|
39
|
Yen TC, Larremore DB. Community detection in bipartite networks with stochastic block models. Phys Rev E 2020; 102:032309. [PMID: 33075933 DOI: 10.1103/physreve.102.032309] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 01/22/2020] [Accepted: 07/23/2020] [Indexed: 11/07/2022]
Abstract
In bipartite networks, community structures are restricted to being disassortative, in that nodes of one type are grouped according to common patterns of connection with nodes of the other type. This makes the stochastic block model (SBM), a highly flexible generative model for networks with block structure, an intuitive choice for bipartite community detection. However, typical formulations of the SBM do not make use of the special structure of bipartite networks. Here we introduce a Bayesian nonparametric formulation of the SBM and a corresponding algorithm to efficiently find communities in bipartite networks which parsimoniously chooses the number of communities. The biSBM improves community detection results over general SBMs when data are noisy, improves the model resolution limit by a factor of $\sqrt{2}$, and expands our understanding of the complicated optimization landscape associated with community detection tasks. A direct comparison of certain terms of the prior distributions in the biSBM and a related high-resolution hierarchical SBM also reveals a counterintuitive regime of community detection problems, populated by smaller and sparser networks, where nonhierarchical models outperform their more flexible counterpart.
Affiliation(s)
- Tzu-Chi Yen
- Department of Computer Science, University of Colorado, Boulder, Colorado 80309, USA
| | - Daniel B Larremore
- Department of Computer Science, University of Colorado, Boulder, Colorado 80309, USA; BioFrontiers Institute, University of Colorado, Boulder, Colorado 80303, USA
|
40
|
Elliott A, Chiu A, Bazzi M, Reinert G, Cucuringu M. Core-periphery structure in directed networks. Proc Math Phys Eng Sci 2020; 476:20190783. [PMID: 33061788 PMCID: PMC7544362 DOI: 10.1098/rspa.2019.0783] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/11/2019] [Accepted: 06/25/2020] [Indexed: 11/17/2022] Open
Abstract
Empirical networks often exhibit different meso-scale structures, such as community and core-periphery structures. Core-periphery structure typically consists of a well-connected core and a periphery that is well connected to the core but sparsely connected internally. Most core-periphery studies focus on undirected networks. We propose a generalization of core-periphery structure to directed networks. Our approach yields a family of core-periphery block model formulations in which, contrary to many existing approaches, core and periphery sets are edge-direction dependent. We focus on a particular structure consisting of two core sets and two periphery sets, which we motivate empirically. We propose two measures to assess the statistical significance and quality of our novel structure in empirical data, where one often has no ground truth. To detect core-periphery structure in directed networks, we propose three methods adapted from two approaches in the literature, each with a different trade-off between computational complexity and accuracy. We assess the methods on benchmark networks where our methods match or outperform standard methods from the literature, with a likelihood approach achieving the highest accuracy. Applying our methods to three empirical networks (faculty hiring, a world trade dataset and political blogs) illustrates that our proposed structure provides novel insights in empirical networks.
Affiliation(s)
- Andrew Elliott
- The Alan Turing Institute, London, UK
- Department of Statistics, University of Oxford, Oxford, UK
| | - Angus Chiu
- Department of Statistics, University of Oxford, Oxford, UK
| | - Marya Bazzi
- The Alan Turing Institute, London, UK
- Mathematical Institute, University of Oxford, Oxford, UK
- Mathematics Institute, University of Warwick, Coventry, UK
| | - Gesine Reinert
- The Alan Turing Institute, London, UK
- Department of Statistics, University of Oxford, Oxford, UK
| | - Mihai Cucuringu
- The Alan Turing Institute, London, UK
- Department of Statistics, University of Oxford, Oxford, UK
- Mathematical Institute, University of Oxford, Oxford, UK
|
41
|
Peixoto TP. Merge-split Markov chain Monte Carlo for community detection. Phys Rev E 2020; 102:012305. [PMID: 32794904 DOI: 10.1103/physreve.102.012305] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 03/24/2020] [Accepted: 06/19/2020] [Indexed: 11/07/2022]
Abstract
We present a Markov chain Monte Carlo scheme based on merges and splits of groups that is capable of efficiently sampling from the posterior distribution of network partitions, defined according to the stochastic block model (SBM). We demonstrate how schemes based on the move of single nodes between groups systematically fail at correctly sampling from the posterior distribution even on small networks, and how our merge-split approach behaves significantly better, and improves the mixing time of the Markov chain by several orders of magnitude in typical cases. We also show how the scheme can be straightforwardly extended to nested versions of the SBM, yielding asymptotically exact samples of hierarchical network partitions.
Affiliation(s)
- Tiago P Peixoto
- Department of Network and Data Science, Central European University, H-1051 Budapest, Hungary; ISI Foundation, Via Chisola 5, 10126 Torino, Italy; and Department of Mathematical Sciences, University of Bath, Claverton Down, Bath BA2 7AY, United Kingdom
|
42
|
Abstract
MOTIVATION: Recent single-cell DNA sequencing technologies enable whole-genome sequencing of hundreds to thousands of individual cells. However, these technologies have ultra-low sequencing coverage (<0.5× per cell) which has limited their use to the analysis of large copy-number aberrations (CNAs) in individual cells. While CNAs are useful markers in cancer studies, single-nucleotide mutations are equally important, both in cancer studies and in other applications. However, ultra-low coverage sequencing yields single-nucleotide mutation data that are too sparse for current single-cell analysis methods. RESULTS: We introduce SBMClone, a method to infer clusters of cells, or clones, that share groups of somatic single-nucleotide mutations. SBMClone uses a stochastic block model to overcome sparsity in ultra-low coverage single-cell sequencing data, and we show that SBMClone accurately infers the true clonal composition on simulated datasets with coverage as low as 0.2×. We applied SBMClone to single-cell whole-genome sequencing data from two breast cancer patients obtained using two different sequencing technologies. On the first patient, sequenced using the 10X Genomics CNV solution with sequencing coverage ≈0.03×, SBMClone recovers the major clonal composition when incorporating a small amount of additional information. On the second patient, where pre- and post-treatment tumor samples were sequenced using DOP-PCR with sequencing coverage ≈0.5×, SBMClone shows that tumor cells are present in the post-treatment sample, contrary to published analysis of this dataset. AVAILABILITY AND IMPLEMENTATION: SBMClone is available on the GitHub repository https://github.com/raphael-group/SBMClone. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Matthew A Myers
- Department of Computer Science, Princeton University, Princeton, NJ 08544, USA
| | - Simone Zaccaria
- Department of Computer Science, Princeton University, Princeton, NJ 08544, USA
| | - Benjamin J Raphael
- Department of Computer Science, Princeton University, Princeton, NJ 08544, USA
|
43
|
Brzoska L, Fischer M, Lentz HHK. Hierarchical Structures in Livestock Trade Networks-A Stochastic Block Model of the German Cattle Trade Network. Front Vet Sci 2020; 7:281. [PMID: 32537461 PMCID: PMC7266987 DOI: 10.3389/fvets.2020.00281] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/27/2019] [Accepted: 04/27/2020] [Indexed: 12/16/2022] Open
Abstract
Trade of cattle between farms forms a complex trade network. We investigate partitions of this network for cattle trade in Germany. These partitions are groups of farms with similar properties and they are inferred directly from the trade pattern between farms. We make use of a rather new method known as stochastic block modeling (SBM) in order to divide the network into smaller units. SBM turns out to outperform the more established community detection method in the context of disease control in terms of trade restriction. Moreover, SBM is also superior to geographically based trade restrictions and could be a promising approach for disease control.
Affiliation(s)
- Laura Brzoska
- Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany; Institute of Epidemiology, Friedrich-Loeffler-Institut, Greifswald-Insel Riems, Greifswald, Germany
| | - Mareike Fischer
- Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany
| | - Hartmut H K Lentz
- Institute of Epidemiology, Friedrich-Loeffler-Institut, Greifswald-Insel Riems, Greifswald, Germany
|
44
|
Zippo AG, Castiglioni I, Lin J, Borsa VM, Valente M, Biella GEM. Short-Term Classification Learning Promotes Rapid Global Improvements of Information Processing in Human Brain Functional Connectome. Front Hum Neurosci 2020; 13:462. [PMID: 32009918 PMCID: PMC6971211 DOI: 10.3389/fnhum.2019.00462] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/11/2019] [Accepted: 12/17/2019] [Indexed: 01/21/2023] Open
Abstract
Classification learning is a preeminent human ability within the animal kingdom but the key mechanisms of brain networks regulating learning remain mostly elusive. Recent neuroimaging advancements have depicted the human brain as a complex graph machinery where brain regions are nodes and coherent activities among them represent the functional connections. While long-term motor memories have been found to alter functional connectivity in the resting human brain, the short-time effects of learning on graph topology have not yet been widely investigated. For instance, classification learning is known to orchestrate rapid modulation of diverse memory systems like short-term and visual working memories but how the brain functional connectome accommodates such modulations is unclear. We used publicly available repositories (openfmri.org) selecting three experiments, two focused on short-term classification learning along two consecutive runs where learning was promoted by trial-by-trial feedback errors, while a further experiment was used as supplementary control. We analyzed the functional connectivity extracted from BOLD fMRI signals, and estimated the graph information processing in the cerebral networks. The information processing capability, characterized by complex network statistics, significantly improved over runs, together with the subject classification accuracy. Instead, null-learning experiments, where feedback came with poor consistency, did not provoke any significant change in the functional connectivity over runs. We propose that learning induces fast modifications in the overall brain network dynamics, definitely ameliorating the short-term potential of the brain to process and integrate information, a dynamic consistently orchestrated by modulations of the functional connections among specific brain regions.
Affiliation(s)
- Antonio G Zippo
- Institute of Molecular Bioimaging and Physiology, Consiglio Nazionale delle Ricerche, Milan, Italy
| | - Isabella Castiglioni
- Institute of Molecular Bioimaging and Physiology, Consiglio Nazionale delle Ricerche, Milan, Italy
| | - Jianyi Lin
- Department of Mathematics, Khalifa University, Abu Dhabi, United Arab Emirates
| | - Virginia M Borsa
- Department of Human and Social Sciences, University of Bergamo, Bergamo, Italy
| | - Maurizio Valente
- Institute of Molecular Bioimaging and Physiology, Consiglio Nazionale delle Ricerche, Milan, Italy
| | - Gabriele E M Biella
- Institute of Molecular Bioimaging and Physiology, Consiglio Nazionale delle Ricerche, Milan, Italy
|
45
|
Calatayud J, Bernardo-Madrid R, Neuman M, Rojas A, Rosvall M. Exploring the solution landscape enables more reliable network community detection. Phys Rev E 2019; 100:052308. [PMID: 31869919 DOI: 10.1103/physreve.100.052308] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 05/27/2019] [Indexed: 06/10/2023]
Abstract
To understand how a complex system is organized and functions, researchers often identify communities in the system's network of interactions. Because it is practically impossible to explore all solutions to guarantee the best one, many community-detection algorithms rely on multiple stochastic searches. But for a given combination of network and stochastic algorithms, how many searches are sufficient to find a solution that is good enough? The standard approach is to pick a reasonably large number of searches and select the network partition with the highest quality or derive a consensus solution based on all network partitions. However, if different partitions have similar qualities such that the solution landscape is degenerate, the single best partition may miss relevant information, and a consensus solution may blur complementary communities. Here we address this degeneracy problem with coarse-grained descriptions of the solution landscape. We cluster network partitions based on their similarity and suggest an approach to determine the minimum number of searches required to describe the solution landscape adequately. To make good use of all partitions, we also propose different ways to explore the solution landscape, including a significance clustering procedure. We test these approaches on synthetic networks and a real-world network using two contrasting community-detection algorithms: The algorithm that can identify more general structures requires more searches, and networks with clearer community structures require fewer searches. We also find that exploring the coarse-grained solution landscape can reveal complementary solutions and enable more reliable community detection.
Affiliation(s)
- Joaquín Calatayud
- Integrated Science Lab, Department of Physics, Umeå University, Sweden
| | | | - Magnus Neuman
- Integrated Science Lab, Department of Physics, Umeå University, Sweden
| | - Alexis Rojas
- Integrated Science Lab, Department of Physics, Umeå University, Sweden
| | - Martin Rosvall
- Integrated Science Lab, Department of Physics, Umeå University, Sweden
|
46
|
Wharrie S, Azizi L, Altmann EG. Micro-, meso-, macroscales: The effect of triangles on communities in networks. Phys Rev E 2019; 100:022315. [PMID: 31574618 DOI: 10.1103/physreve.100.022315] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/22/2019] [Indexed: 11/07/2022]
Abstract
Mesoscale structures (communities) are used to understand the macroscale properties of complex networks, such as their functionality and formation mechanisms. Microscale structures are known to exist in most complex networks (e.g., large number of triangles or motifs), but they are absent in the simple random-graph models considered (e.g., as null models) in community-detection algorithms. In this paper we investigate the effect of microstructures on the appearance of communities in networks. We find that the presence of triangles alone leads to the appearance of communities even in methods designed to avoid the detection of communities in random networks. This shows that communities can emerge spontaneously from simple processes of motif generation happening at a microlevel. Our results are based on four widely used community-detection approaches (stochastic block model, spectral method, modularity maximization, and the Infomap algorithm) and three different generative network models (triadic closure, generalized configuration model, and random graphs with triangles).
Affiliation(s)
- Sophie Wharrie
- School of Mathematics and Statistics, University of Sydney, 2006 NSW, Australia
| | - Lamiae Azizi
- School of Mathematics and Statistics, University of Sydney, 2006 NSW, Australia
| | - Eduardo G Altmann
- School of Mathematics and Statistics, University of Sydney, 2006 NSW, Australia
|
47
|
Peixoto TP. Network Reconstruction and Community Detection from Dynamics. Phys Rev Lett 2019; 123:128301. [PMID: 31633974 PMCID: PMC7226905 DOI: 10.1103/physrevlett.123.128301] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 03/28/2019] [Revised: 05/21/2019] [Indexed: 05/06/2023]
Abstract
We present a scalable nonparametric Bayesian method to perform network reconstruction from observed functional behavior that at the same time infers the communities present in the network. We show that the joint reconstruction with community detection has a synergistic effect, where the edge correlations used to inform the existence of communities are also inherently used to improve the accuracy of the reconstruction which, in turn, can better inform the uncovering of communities. We illustrate the use of our method with observations arising from epidemic models and the Ising model, both on synthetic and empirical networks, as well as on data containing only functional information.
Affiliation(s)
- Tiago P Peixoto
- Department of Network and Data Science, Central European University, H-1051 Budapest, Hungary
- ISI Foundation, Via Chisola 5, 10126 Torino, Italy
- Department of Mathematical Sciences, University of Bath, Claverton Down, Bath BA2 7AY, United Kingdom
|
48
|
Lu X, Szymanski BK. A Regularized Stochastic Block Model for the robust community detection in complex networks. Sci Rep 2019; 9:13247. [PMID: 31519944 PMCID: PMC6744415 DOI: 10.1038/s41598-019-49580-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 09/27/2018] [Accepted: 08/28/2019] [Indexed: 11/23/2022] Open
Abstract
The stochastic block model is able to generate random graphs with different types of network partitions, ranging from the traditional assortative structures to the disassortative structures. Since the stochastic block model does not specify which mixing pattern is desired, the inference algorithms discover the locally most likely nodes’ partition, regardless of its type. Here we introduce a new model constraining nodes’ internal degree ratios in the objective function to guide the inference algorithms to converge to the desired type of structure in the observed network data. We show experimentally that given the regularized model, the inference algorithms, such as Markov chain Monte Carlo, reliably and quickly find the assortative or disassortative structure as directed by the value of a single parameter. In contrast, when the sought-after assortative community structure is not strong in the observed network, the traditional inference algorithms using the degree-corrected stochastic block model tend to converge to undesired disassortative partitions.
Affiliation(s)
- Xiaoyan Lu
- Social and Cognitive Networks Academic Research Center and Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, 12180, USA
| | - Boleslaw K Szymanski
- Social and Cognitive Networks Academic Research Center and Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, 12180, USA; Społeczna Akademia Nauk, Łódź, Poland
|
49
|
Van Soom M, van den Heuvel M, Ryckebusch J, Schoors K. Loan maturity aggregation in interbank lending networks obscures mesoscale structure and economic functions. Sci Rep 2019; 9:12512. [PMID: 31467301 PMCID: PMC6715684 DOI: 10.1038/s41598-019-48924-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2019] [Accepted: 08/05/2019] [Indexed: 11/09/2022] Open
Abstract
Since the 2007-2009 financial crisis, substantial academic effort has been dedicated to improving our understanding of interbank lending networks (ILNs). Because of data limitations or by choice, the literature largely lacks multiple loan maturities. We employ a complete interbank loan contract dataset to investigate whether maturity details are informative of the network structure. Applying the layered stochastic block model of Peixoto (2015) and other tools from network science on a time series of bilateral loans with multiple maturity layers in the Russian ILN, we find that collapsing all such layers consistently obscures mesoscale structure. The optimal maturity granularity lies between completely collapsing and completely separating the maturity layers and depends on the development phase of the interbank market, with a more developed market requiring more layers for optimal description. Closer inspection of the inferred maturity bins associated with the optimal maturity granularity reveals specific economic functions, from liquidity intermediation to financing. Collapsing a network with multiple underlying maturity layers or extracting one such layer, common in economic research, is therefore not only an incomplete representation of the ILN's mesoscale structure, but also conceals existing economic functions. This holds important insights and opportunities for theoretical and empirical studies on interbank market functioning, contagion, stability, and on the desirable level of regulatory data disclosure.
Affiliation(s)
- Marnix Van Soom
- Vrije Universiteit Brussel, Artificial Intelligence Lab, Brussels, 1050, Belgium
| | - Milan van den Heuvel
- Ghent University, Department of Physics and Astronomy, Ghent, 9000, Belgium; Ghent University, Department of Economics, Ghent, 9000, Belgium
| | - Jan Ryckebusch
- Ghent University, Department of Physics and Astronomy, Ghent, 9000, Belgium
| | - Koen Schoors
- Ghent University, Department of Economics, Ghent, 9000, Belgium; National Research University, Higher School of Economics, Moscow, Russia
|
50
|
Funke T, Becker T. Stochastic block models: A comparison of variants and inference methods. PLoS One 2019; 14:e0215296. [PMID: 31013290 PMCID: PMC6478296 DOI: 10.1371/journal.pone.0215296] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 09/28/2018] [Accepted: 03/30/2019] [Indexed: 11/19/2022] Open
Abstract
Finding communities in complex networks is a challenging task and one promising approach is the Stochastic Block Model (SBM). But the influences from various fields led to a diversity of variants and inference methods. Therefore, a comparison of the existing techniques and an independent analysis of their capabilities and weaknesses are needed. As a first step, we review the development of different SBM variants such as the degree-corrected SBM of Karrer and Newman or Peixoto's hierarchical SBM. Besides stating all these variants in a uniform notation, we show the reasons for their development. Knowing the variants, we discuss a variety of approaches to infer the optimal partition like the Metropolis-Hastings algorithm. We perform our analysis based on our extension of the Girvan-Newman test and the Lancichinetti-Fortunato-Radicchi benchmark as well as a selection of some real-world networks. Using these results, we give some guidance to the challenging task of selecting an inference method and SBM variant. In addition, we give a simple heuristic to determine the number of steps for the Metropolis-Hastings algorithms that lack a usual stop criterion. With our comparison, we hope to guide researchers in the field of SBM and highlight the problems of existing techniques to focus future research. Finally, by making our code freely available, we want to promote a faster development, integration and exchange of new ideas.
Affiliation(s)
- Thorben Funke
- Production Systems and Logistic Systems, BIBA - Bremer Institut für Produktion und Logistik GmbH at the University of Bremen, Bremen, Bremen, Germany
- Faculty of Production Engineering, University of Bremen, Bremen, Bremen, Germany
| | - Till Becker
- Production Systems and Logistic Systems, BIBA - Bremer Institut für Produktion und Logistik GmbH at the University of Bremen, Bremen, Bremen, Germany
- Faculty of Business Studies, University of Applied Sciences Emden/Leer, Emden, Lower Saxony, Germany
| |
|