1
|
Mokhtaridoost M, Chalmers JJ, Soleimanpoor M, McMurray BJ, Lato DF, Nguyen SC, Musienko V, Nash JO, Espeso-Gil S, Ahmed S, Delfosse K, Browning JWL, Barutcu AR, Wilson MD, Liehr T, Shlien A, Aref S, Joyce EF, Weise A, Maass PG. Inter-chromosomal contacts demarcate genome topology along a spatial gradient. Nat Commun 2024; 15:9813. [PMID: 39532865 PMCID: PMC11557711 DOI: 10.1038/s41467-024-53983-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 10/28/2024] [Indexed: 11/16/2024] Open
Abstract
Non-homologous chromosomal contacts (NHCCs) between different chromosomes participate considerably in gene and genome regulation. Due to analytical challenges, NHCCs are currently considered as singular, stochastic events, and their extent and fundamental principles across cell types remain controversial. We develop a supervised and unsupervised learning algorithm, termed Signature, to call NHCCs in Hi-C datasets to advance our understanding of genome topology. Signature reveals 40,282 NHCCs and their properties across 62 Hi-C datasets of 53 diploid human cell types. Genomic regions of NHCCs are gene-dense, highly expressed, and harbor genes for cell-specific and sex-specific functions. Extensive inter-telomeric and inter-centromeric clustering occurs across cell types [Rabl's configuration] and 61 NHCCs are consistently found at the nuclear speckles. These constitutive 'anchor loci' facilitate an axis of genome activity whilst cell-type-specific NHCCs act in discrete hubs. Our results suggest that non-random chromosome positioning is supported by constitutive NHCCs that shape genome topology along an off-centered spatial gradient of genome activity.
Affiliation(s)
- Milad Mokhtaridoost
- Genetics and Genome Biology Program, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada
| | - Jordan J Chalmers
- Genetics and Genome Biology Program, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada
| | - Marzieh Soleimanpoor
- Genetics and Genome Biology Program, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada
| | - Brandon J McMurray
- Genetics and Genome Biology Program, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada
| | - Daniella F Lato
- Genetics and Genome Biology Program, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada
| | - Son C Nguyen
- Penn Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Viktoria Musienko
- Jena University Hospital, Friedrich Schiller University, Institute of Human Genetics, Am Klinikum 1, 07747, Jena, Germany
| | - Joshua O Nash
- Genetics and Genome Biology Program, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada
- Laboratory of Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada
| | - Sergio Espeso-Gil
- Genetics and Genome Biology Program, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada
| | - Sameen Ahmed
- Genetics and Genome Biology Program, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada
| | - Kate Delfosse
- Genetics and Genome Biology Program, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada
| | - Jared W L Browning
- Genetics and Genome Biology Program, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada
| | - A Rasim Barutcu
- Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
| | - Michael D Wilson
- Genetics and Genome Biology Program, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada
| | - Thomas Liehr
- Jena University Hospital, Friedrich Schiller University, Institute of Human Genetics, Am Klinikum 1, 07747, Jena, Germany
| | - Adam Shlien
- Genetics and Genome Biology Program, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada
- Laboratory of Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada
| | - Samin Aref
- Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, ON, M5S3G8, Canada
| | - Eric F Joyce
- Penn Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Anja Weise
- Jena University Hospital, Friedrich Schiller University, Institute of Human Genetics, Am Klinikum 1, 07747, Jena, Germany
| | - Philipp G Maass
- Genetics and Genome Biology Program, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada.
- Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada.
| |
|
2
|
Aref S, Mostajabdaveh M, Chheda H. Bayan algorithm: Detecting communities in networks through exact and approximate optimization of modularity. Phys Rev E 2024; 110:044315. [PMID: 39562863 DOI: 10.1103/physreve.110.044315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2024] [Accepted: 09/24/2024] [Indexed: 11/21/2024]
Abstract
Community detection is a classic network problem with extensive applications in various fields. Its most common method is using modularity maximization heuristics, which rarely return an optimal partition or anything similar. Partitions with globally optimal modularity are difficult to compute, and therefore have been underexplored. Using structurally diverse networks, we compare 30 community detection methods including our proposed algorithm that offers optimality and approximation guarantees: the Bayan algorithm. Unlike existing methods, Bayan globally maximizes modularity or approximates it within a factor. Our results show the distinctive accuracy and stability of maximum-modularity partitions in retrieving planted partitions at rates higher than most alternatives for a wide range of parameter settings in two standard benchmarks. Compared to the partitions from 29 other algorithms, maximum-modularity partitions have the best medians for description length, coverage, performance, average conductance, and well clusteredness. These advantages come at the cost of additional computations which Bayan makes possible for small networks (networks that have up to 3000 edges in their largest connected component). Bayan is several times faster than using open-source and commercial solvers for modularity maximization, making it capable of finding optimal partitions for instances that cannot be optimized by any other existing method. Our results point to a few well-performing algorithms, among which Bayan stands out as the most reliable method for small networks. A Python implementation of the Bayan algorithm (bayanpy) is publicly available through the package installer for Python.
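As a rough illustration of the objective discussed in this abstract, the sketch below evaluates the modularity of a given partition on a toy graph. It is not the Bayan algorithm and does not use bayanpy; the function name `modularity`, the toy edge list, and the restriction to undirected, unweighted graphs without self-loops are assumptions made only for this example.

```python
from collections import defaultdict


def modularity(edges, communities):
    """Modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once, no self-loops.
    communities: list of node sets that together cover every node exactly once.
    (Hypothetical helper for illustration; not part of any cited package.)
    """
    m = len(edges)
    community_of = {node: idx
                    for idx, nodes in enumerate(communities)
                    for node in nodes}

    internal = defaultdict(int)    # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # sum of node degrees in community c

    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1

    # Q = sum over communities of (internal edge fraction - expected fraction)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

print(round(modularity(edges, [{0, 1, 2}, {3, 4, 5}]), 4))  # 0.3571 (= 5/14)
print(modularity(edges, [{0, 1, 2, 3, 4, 5}]))               # 0.0
```

Heuristics search over partitions for a high value of this quantity, whereas an exact method must certify that no partition scores higher, which is where the additional computation mentioned in the abstract comes from.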
|
3
|
Gómez-Pascual A, Rocamora-Pérez G, Ibanez L, Botía JA. Targeted co-expression networks for the study of traits. Sci Rep 2024; 14:16675. [PMID: 39030261 PMCID: PMC11271532 DOI: 10.1038/s41598-024-67329-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Accepted: 07/10/2024] [Indexed: 07/21/2024] Open
Abstract
Weighted Gene Co-expression Network Analysis (WGCNA) is a widely used approach for the generation of gene co-expression networks. However, networks generated with this tool usually contain large modules with a large set of functional annotations that are hard to decipher. We have developed TGCN, a new method to create Targeted Gene Co-expression Networks. This method identifies the transcripts that best predict the trait of interest based on gene expression using a refinement of the LASSO regression. Then, it builds the co-expression modules around those transcripts. Algorithm properties were characterized using the expression of 13 brain regions from the Genotype-Tissue Expression project. When comparing our method with WGCNA, TGCN networks lead to more precise modules that have more specific and yet rich biological meaning. Then, we illustrate its applicability by creating an APP-TGCN on The Religious Orders Study and Memory and Aging Project dataset, aiming to identify the molecular pathways specifically associated with the role of APP in Alzheimer's disease. Main biological findings were further validated in two independent cohorts. In conclusion, we provide a new framework that serves to create targeted networks that are smaller, biologically relevant and useful in high-throughput, hypothesis-driven research. The TGCN R package is available on GitHub: https://github.com/aliciagp/TGCN.
Affiliation(s)
- A Gómez-Pascual
- Communications Engineering and Information Department, University of Murcia, 30100, Murcia, Spain
| | - G Rocamora-Pérez
- Department of Genetics and Genomic Medicine Research and Teaching, UCL GOS Institute of Child Health, London, WC1N 1EH, UK
| | - L Ibanez
- Department of Psychiatry, Washington University School of Medicine, Saint Louis, MO, 63110, USA
- Department of Neurology, Washington University School of Medicine, Saint Louis, MO, 63110, USA
| | - J A Botía
- Communications Engineering and Information Department, University of Murcia, 30100, Murcia, Spain.
| |
|
4
|
Su X, Xue S, Liu F, Wu J, Yang J, Zhou C, Hu W, Paris C, Nepal S, Jin D, Sheng QZ, Yu PS. A Comprehensive Survey on Community Detection With Deep Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:4682-4702. [PMID: 35263257 DOI: 10.1109/tnnls.2021.3137396] [Citation(s) in RCA: 37] [Impact Index Per Article: 37.0] [Indexed: 06/14/2023]
Abstract
Detecting a community in a network is a matter of discerning the distinct features and connections of a group of members that are different from those in other communities. The ability to do this is of great significance in network analysis. However, beyond the classic spectral clustering and statistical inference methods, there have been significant developments with deep learning techniques for community detection in recent years, particularly when it comes to handling high-dimensional network data. Hence, a comprehensive review of the latest progress in community detection through deep learning is timely. To frame the survey, we have devised a new taxonomy covering different state-of-the-art methods, including deep learning models based on deep neural networks (DNNs), deep nonnegative matrix factorization, and deep sparse filtering. The main category, i.e., DNNs, is further divided into convolutional networks, graph attention networks, generative adversarial networks, and autoencoders. The popular benchmark datasets, evaluation metrics, and open-source implementations to address experimentation settings are also summarized. This is followed by a discussion on the practical applications of community detection in various domains. The survey concludes with suggestions of challenging topics that would make for fruitful future research directions in this fast-growing deep learning field.
|
5
|
Castaneda EU, Baker EJ. KNeXT: a NetworkX-based topologically relevant KEGG parser. Front Genet 2024; 15:1292394. [PMID: 38415058 PMCID: PMC10896898 DOI: 10.3389/fgene.2024.1292394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 01/25/2024] [Indexed: 02/29/2024] Open
Abstract
Automating the recreation of gene and mixed gene-compound networks from Kyoto Encyclopedia of Genes and Genomes (KEGG) Markup Language (KGML) files is challenging because the data structure does not preserve the independent or loosely connected neighborhoods in which they were originally derived, referred to here as its topological environment. Identical accession numbers may overlap, causing neighborhoods to artificially collapse based on duplicated identifiers. This causes current parsers to create misleading or erroneous graphical representations when mixed gene networks are converted to gene-only networks. To overcome these challenges we created a Python-based KEGG NetworkX Topological (KNeXT) parser that allows users to accurately recapitulate genetic networks and mixed networks from KGML map data. The software, archived as a Python Package Index (PyPI) file to ensure broad application, is designed to ingest KGML files through built-in APIs and dynamically create high-fidelity topological representations. The utilization of NetworkX's framework to generate tab-separated files additionally ensures that KNeXT results may be imported into other graph frameworks and maintain programmatic access to the original x-y axis positions of each node in the KEGG pathway. KNeXT is a well-described Python 3 package that allows users to rapidly download and aggregate specific KGML files and recreate KEGG pathways based on a range of user-defined settings. KNeXT is platform-independent, distinctive, and it is not written on top of other Python parsers. Furthermore, KNeXT enables users to parse entire local folders or single files through command line scripts and convert the output into NCBI or UniProt IDs. KNeXT provides an ability for researchers to generate pathway visualizations while preserving the original context of a KEGG pathway. Source code is freely available at https://github.com/everest-castaneda/knext.
Affiliation(s)
- Everest Uriel Castaneda
- Department of Biology, Baylor University, Waco, TX, United States
- School of Engineering and Computer Science, Baylor University, Waco, TX, United States
| | - Erich J Baker
- Department of Mathematics and Computer Science, Belmont University, Nashville, TN, United States
| |
|
6
|
Day J. Mapping the cultural divides of England and Wales: Did the geographies of 'Belonging' act as a brake on British Urbanisation, 1851-1911? PLoS One 2023; 18:e0286244. [PMID: 37228149 PMCID: PMC10212145 DOI: 10.1371/journal.pone.0286244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2022] [Accepted: 05/11/2023] [Indexed: 05/27/2023] Open
Abstract
Although both the analysis of regional culture and urbanisation are long-standing preoccupations in geography, few studies have considered the relationship between the two, the former traditionally being a topic in cultural geography, while the latter is usually interpreted and analysed as a process in economic geography. Taking evidence from the 1851-1911 censuses of England and Wales, this article analyses individual migration paths to identify stable regions of human interaction by applying a sophisticated community-detection algorithm. By accurately mapping the regions within which the majority of migration occurred between 1851 and 1911 and arguing that the stability of these geographies is evidence of more than just mutable communities but rather of persistent regional cultures, this article responds to previous studies that have sought to identify the cultural provinces of England and Wales. Indeed, by demonstrating that the regions bear a striking resemblance to those that have long been hypothesised as being distinct cultural provinces of England and Wales, this article empirically corroborates their existence. In order to further demonstrate that the regions constitute cultural provinces, this paper incorporates these boundaries into a spatial interaction model (SIM). The results of the SIM show not only that the boundaries between the regions limited the number of migrants that crossed them (over and above that explained by control variables), and therefore represented the boundaries of cultural provinces demarcating discrete regions of human interaction, but also that such boundaries disproportionately restricted rural-urban migrants, thereby slowing the pace at which England and Wales urbanised.
This paper therefore demonstrates that urbanisation should be interpreted not only as an economic phenomenon but also as a cultural one, and that if urbanisation is to be fully understood, individuals' attachment to place, as a component of their identity, ought to be formally incorporated into models of migration.
Affiliation(s)
- Joseph Day
- School of Geographical Sciences, University of Bristol, Bristol, United Kingdom
| |
|
7
|
Feng Z, Cao Z, Qi X. Generalized network dismantling via a novel spectral partition algorithm. Inf Sci (N Y) 2023. [DOI: 10.1016/j.ins.2023.03.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/11/2023]
|
8
|
TSCDA: a dynamic two-stage community discovery approach. SOCIAL NETWORK ANALYSIS AND MINING 2022. [DOI: 10.1007/s13278-022-00874-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
9
|
Comparing World City Networks by Language: A Complex-Network Approach. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION 2021. [DOI: 10.3390/ijgi10040219] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/16/2022]
Abstract
City networks are multiplex and diverse rather than being regarded as part of a single universal model that is valid worldwide. This study contributes to the debate on multiple globalizations by distinguishing multiscale structures of world city networks (WCNs) reflected in the Internet webpage content in English, German, and French. Using big data sets from web crawling, we adopted a complex-network approach with both macroscale and mesoscale analyses to compare global and grouping properties in varying WCNs, by using novel methods such as the weighted stochastic block model (WSBM). The results suggest that at the macro scale, the rankings of city centralities vary across languages due to the uneven geographic distribution of languages and the variant levels of globalization of cities perceived in different languages. At the meso scale, the WSBMs infer different grouping patterns in the WCNs by language, and the specific roles of many world cities vary with language. The probability-based comparative analyses reveal that the English WCN looks more globalized, while the French and German worlds appear more territorial. Using the mesoscale structure detected in the English WCN to comprehend the city networks in other languages may be biased. These findings demonstrate the importance of scrutinizing multiplex WCNs in different cultures and languages as well as discussing mesoscale structures in comparative WCN studies.
|
10
|
Zhang L, Liu M, Wang B, Lang B, Yang P. Discovering communities based on mention distance. Scientometrics 2021. [DOI: 10.1007/s11192-021-03863-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/22/2022]
|
11
|
Industry classification with online resume big data: A design science approach. INFORMATION & MANAGEMENT 2020. [DOI: 10.1016/j.im.2019.103182] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/17/2022]
|
12
|
|
13
|
Delineating the Regional Economic Geography of China by the Approach of Community Detection. SUSTAINABILITY 2019. [DOI: 10.3390/su11216053] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/17/2022]
Abstract
With the obvious regionalization trend in the new period of urbanization in China, the scientific delineation of functional regions (FRs) at different scales has recently become a hot topic. Since the 20th century, western academia has formed a basic idea of metropolitan areas’ (MAs) delineation based on population density and commuting rate, for which the subjectivity of threshold setting is difficult to overcome. In this study, two community detection algorithms from the field of network science are employed, namely the Louvain algorithm with adjustable resolutions and Combo with high-precision output. We take the nationwide car-hailing data set as an example to explore a bottom-up method for delineating regional economic geography at different scales based on the interconnection strength between nodes. It was found that most of the prefecture-level cities in China have a dominant commuting region and two or three secondary commuting sub-regions, while regional central cities have extended their commuting hinterlands over jurisdictional boundaries, which is not common due to the larger initial administrative divisions and the comprehensive development niveau of cities. The feasibility and limitations of community detection partitioning algorithms in regional science applications are verified. The approach is expected to be widely applicable to regional delimitation supported by big data. Both algorithms share the shortcoming of ignoring spatial proximity. It is necessary to explore new algorithms that can adjust both accuracy and spatial distance as parameters.
|
14
|
Exploring the Spatial Characteristics of Inbound Tourist Flows in China Using Geotagged Photos. SUSTAINABILITY 2019. [DOI: 10.3390/su11205822] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 11/16/2022]
Abstract
As important modern tourist destinations, cities play a critical role in developing agglomerated tourism elements and promoting urban life quality. An in-depth exploration of tourist flow patterns between destination cities can reflect the dynamic trends of the inbound tourist market. This is significant for the development of tourism markets and innovation in tourism products. To this end, photos with geographical and corresponding metadata covering the entire country from 2011 to 2017 are used to explore the spatial characteristics of China’s inbound tourist flow, the spatial patterns of tourist movement, and the grouping of tourist destination cities based on data mining techniques, including the Markov chain, a frequent-pattern-mining algorithm, and a community detection algorithm. Our findings show that: (1) the strongest flow of inbound tourists is between Beijing and Shanghai, and these two cities, along with Xi’an and Guilin, form a “double-triangle” framework; (2) travel between emerging destination cities in Central and Western China has gradually become a frequently selected itinerary; and (3) based on the flow intensity, inbound tourist destination cities can be divided into nine groups. This study provides a valuable reference for the development of China’s inbound tourism market.
|
15
|
|
16
|
Rahiminejad S, Maurya MR, Subramaniam S. Topological and functional comparison of community detection algorithms in biological networks. BMC Bioinformatics 2019; 20:212. [PMID: 31029085 PMCID: PMC6487005 DOI: 10.1186/s12859-019-2746-0] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 06/22/2018] [Accepted: 03/18/2019] [Indexed: 11/28/2022] Open
Abstract
Background: Community detection algorithms are fundamental tools to uncover important features in networks. There are several studies focused on social networks but only a few deal with biological networks. Directly or indirectly, most of the methods maximize modularity, a measure of the density of links within communities as compared to links between communities.
Results: Here we analyze six different community detection algorithms, namely, Combo, Conclude, Fast Greedy, Leading Eigen, Louvain and Spinglass, on two important biological networks to find their communities and evaluate the results in terms of topological and functional features through Kyoto Encyclopedia of Genes and Genomes pathway and Gene Ontology term enrichment analysis. At a high level, the main assessment criteria are 1) appropriate community size (neither too small nor too large), 2) representation within the community of only one or two broad biological functions, 3) most genes from the network belonging to a pathway should also belong to only one or two communities, and 4) performance speed. The first network in this study is a network of Protein-Protein Interactions (PPI) in Saccharomyces cerevisiae (Yeast) with 6532 nodes and 229,696 edges and the second is a network of PPI in Homo sapiens (Human) with 20,644 nodes and 241,008 edges. All six methods perform well, i.e., find reasonably sized and biologically interpretable communities, for the Yeast PPI network but the Conclude method does not find reasonably sized communities for the Human PPI network. Louvain method maximizes modularity by using an agglomerative approach, and is the fastest method for community detection. For the Yeast PPI network, the results of Spinglass method are most similar to the results of Louvain method with regard to the size of communities and core pathways they identify, whereas for the Human PPI network, Combo and Spinglass methods yield the most similar results, with Louvain being the next closest.
Conclusions: For Yeast and Human PPI networks, Louvain method is likely the best method to find communities in terms of detecting known core pathways in a reasonable time.
Electronic supplementary material: The online version of this article (10.1186/s12859-019-2746-0) contains supplementary material, which is available to authorized users.
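For reference, the modularity that this abstract describes in words is commonly written in the Newman-Girvan form below. The abstract itself gives no formula, so the symbols are introduced here as assumptions: $A_{ij}$ is the adjacency matrix, $k_i$ the degree of node $i$, $m$ the number of edges, and $c_i$ the community assigned to node $i$.

```latex
Q \;=\; \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
```

The first term counts links that fall within communities, and the second subtracts the number expected if links were placed at random while preserving node degrees; $\delta(c_i, c_j)$ equals 1 when nodes $i$ and $j$ share a community and 0 otherwise.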
Affiliation(s)
- Sara Rahiminejad
- Departments of Bioengineering and Mechanical and Aerospace Engineering, University of California, San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
| | - Mano R Maurya
- Department of Bioengineering and San Diego Supercomputer Center, University of California, San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA.
| | - Shankar Subramaniam
- Department of Bioengineering, Departments of Computer Science and Engineering, Cellular and Molecular Medicine, and the Graduate Program in Bioinformatics, University of California, San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA.
| |
|
17
|
Becatti C, Caldarelli G, Saracco F. Entropy-based randomization of rating networks. Phys Rev E 2019; 99:022306. [PMID: 30934284 DOI: 10.1103/physreve.99.022306] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 04/16/2018] [Indexed: 05/28/2023]
Abstract
In recent years, due to the great diffusion of e-commerce, online rating platforms quickly became a common tool for purchase recommendations. However, instruments for their analysis did not evolve at the same speed. Indeed, interesting information about users' habits and tastes can be recovered just considering the bipartite network of users and products, in which links represent products' purchases and have different weights due to the score assigned to the item in users' reviews. With respect to other weighted bipartite networks, in these systems we observe a maximum possible weight per link, which limits the variability of the outcomes. In the present article we propose an entropy-based randomization method for this type of network (i.e., bipartite rating networks) by extending the configuration model framework: the randomized network satisfies the constraints of the degree per rating, i.e., the number of given ratings received by the specified product or assigned by the single user. We first show that such a null model is able to reproduce several nontrivial features of the real network better than other null models. Then, using our model as benchmark, we project the information contained in the real system on one of the layers: To provide an interpretation of the projection obtained, we run the Louvain community detection on the obtained network and discuss the observed division in clusters. We are able to detect groups of music albums due to the consumers' taste or communities of movies due to their audience. Finally, we show that our method is also able to handle the special case of categorical bipartite networks: we consider the bipartite categorical network of scientific journals recognized for the scientific qualification in economics and statistics. In the end, from the outcome of our method, the probability that each user appreciates every product can be easily recovered. 
Therefore, this information may be employed in future applications to implement a more detailed recommendation system that also takes into account information regarding the topology of the observed network.
Affiliation(s)
- Carolina Becatti
- IMT School for Advanced Studies, Piazza S.Francesco 19, 55100 Lucca, Italy
| | - Guido Caldarelli
- IMT School for Advanced Studies, Piazza S.Francesco 19, 55100 Lucca, Italy
- Istituto dei Sistemi Complessi (ISC)-CNR UoS Università "Sapienza", Piazzale Aldo Moro 5, 00185 Roma, Italy
- ECLT San Marco 2940, 30124 Venezia, Italy
| | - Fabio Saracco
- IMT School for Advanced Studies, Piazza S.Francesco 19, 55100 Lucca, Italy
| |
|
18
|
|
19
|
Populations, megapopulations, and the areal unit problem. Health Place 2018; 54:79-84. [DOI: 10.1016/j.healthplace.2018.09.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/03/2017] [Revised: 08/26/2018] [Accepted: 09/12/2018] [Indexed: 11/24/2022]
|
20
|
Gao C, Liang M, Li X, Zhang Z, Wang Z, Zhou Z. Network Community Detection Based on the Physarum-Inspired Computational Framework. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018; 15:1916-1928. [PMID: 27992347 DOI: 10.1109/tcbb.2016.2638824] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/06/2023]
Abstract
Community detection is a crucial and essential problem in the structure analytics of complex networks, which can help us understand and predict the characteristics and functions of complex networks. Many methods, ranging from optimization-based algorithms to heuristic-based algorithms, have been proposed for solving such a problem. Due to the inherent complexity of identifying network structure, how to design an effective algorithm with higher accuracy and lower computational cost remains an open problem. Inspired by the computational capability and positive feedback mechanism in the foraging process of Physarum, a kind of slime mold, a general Physarum-based computational framework for community detection is proposed in this paper. Based on the proposed framework, the inter-community edges can be distinguished from the intra-community edges in a network and the positive feedback of the solving process in an algorithm can be further enhanced, which are used to improve the efficiency of original optimization-based and heuristic-based community detection algorithms, respectively. Some typical algorithms (e.g., genetic algorithm, ant colony optimization algorithm, and Markov clustering algorithm) and real-world datasets have been used to estimate the efficiency of our proposed computational framework. Experiments show that the algorithms optimized by the Physarum-inspired computational framework perform better than the original ones in terms of accuracy and computational cost.
|
21
|
Critical analysis of (Quasi-)Surprise for community detection in complex networks. Sci Rep 2018; 8:14459. [PMID: 30262896 PMCID: PMC6160439 DOI: 10.1038/s41598-018-32582-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 10/27/2017] [Accepted: 05/08/2018] [Indexed: 02/07/2023] Open
Abstract
Module or community structures widely exist in complex networks, and optimizing statistical measures is one of the most popular approaches for revealing and identifying such structures in real-world applications. In this paper, we focus on critical behaviors of (Quasi-)Surprise, a type of statistical measure of interest for community structure, accompanied by a series of comparisons with other measures. Specifically, the effect of various network parameters on the measures is thoroughly investigated. The critical number of dense subgraphs in partition transition is derived, and a set of phase diagrams is provided to display and compare the phase transitions of the measures. The effect of a “potential well” for (Quasi-)Surprise is revealed, which general greedy (agglomerative or divisive) algorithms may have difficulty crossing. Finally, an extension of Quasi-Surprise is introduced for the study of multi-scale structures. The experimental results help in understanding the critical behaviors of (Quasi-)Surprise and may provide useful insight for the design of effective tools for community detection.
|
22
|
Wandelt S, Sun X, Feng D, Zanin M, Havlin S. A comparative analysis of approaches to network-dismantling. Sci Rep 2018; 8:13513. [PMID: 30202039 PMCID: PMC6131543 DOI: 10.1038/s41598-018-31902-8] [Citation(s) in RCA: 61] [Impact Index Per Article: 8.7] [Received: 02/09/2018] [Accepted: 08/29/2018] [Indexed: 11/24/2022] Open
Abstract
Estimating, understanding, and improving the robustness of networks has many application areas such as bioinformatics, transportation, or computational linguistics. Accordingly, with the rise of network science for modeling complex systems, many methods for robustness estimation and network dismantling have been developed and applied to real-world problems. The state of the art in this field is quite fuzzy, as results are published in various domain-specific venues and use different datasets. In this study, we report on what is, to the best of our knowledge, the largest benchmark analysis of network dismantling. We reimplemented and compared 13 competitors on 12 types of random networks, including ER, BA, and WS, with different network generation parameters. We find that network metrics proposed more than 20 years ago are often non-dominating competitors, while many recently proposed techniques perform well only on specific network types. Besides the solution quality, we also investigate the execution time. Moreover, we analyze the similarity of competitors, as induced by their node rankings. We compare and validate our results on real-world networks. Our study aims to be a reference for selecting a network dismantling method for a given network, considering accuracy requirements and run time constraints.
Affiliation(s)
- Sebastian Wandelt
  - National Key Laboratory of CNS/ATM, School of Electronic and Information Engineering, Beihang University, 100191, Beijing, China
  - National Engineering Laboratory of Multi-Modal Transportation Big Data, 100191, Beijing, China
  - Beijing Advanced Innovation Center for Big Data-based Precision Medicine, Beihang University, 100083, Beijing, China
- Xiaoqian Sun
  - National Key Laboratory of CNS/ATM, School of Electronic and Information Engineering, Beihang University, 100191, Beijing, China
  - National Engineering Laboratory of Multi-Modal Transportation Big Data, 100191, Beijing, China
- Daozhong Feng
  - National Key Laboratory of CNS/ATM, School of Electronic and Information Engineering, Beihang University, 100191, Beijing, China
- Massimiliano Zanin
  - Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, 28223, Madrid, Spain
  - Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516, Caparica, Portugal
- Shlomo Havlin
  - Department of Physics, Bar-Ilan University, Ramat-Gan, 52900, Israel
|
23
|
Ren ZM, Mariani MS, Zhang YC, Medo M. Randomizing growing networks with a time-respecting null model. Phys Rev E 2018; 97:052311. [PMID: 29906916 DOI: 10.1103/physreve.97.052311] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 11/16/2017] [Indexed: 11/07/2022]
Abstract
Complex networks are often used to represent systems that are not static but grow with time: people make new friendships, new papers are published and refer to the existing ones, and so forth. To assess the statistical significance of measurements made on such networks, we propose a randomization methodology, a time-respecting null model, that preserves both the network's degree sequence and the time evolution of individual nodes' degree values. By preserving the temporal linking patterns of the analyzed system, the proposed model is able to factor out the effect of the system's temporal patterns on its structure. We apply the model to the citation network of Physical Review scholarly papers and the citation network of US movies. The model reveals that the two data sets are strikingly different with respect to their degree-degree correlations, and we discuss the important implications of this finding for the information provided by paradigmatic node centrality metrics such as indegree and Google's PageRank. The randomization methodology proposed here can be used to assess the significance of any structural property in growing networks, which could bring new insights into problems where null models play a critical role, such as the detection of communities and network motifs.
Affiliation(s)
- Zhuo-Ming Ren
  - Alibaba Research Center for Complexity Sciences, Alibaba Business School, Hangzhou Normal University, Hangzhou 311121, PR China
  - Department of Physics, University of Fribourg, 1700 Fribourg, Switzerland
- Manuel Sebastian Mariani
  - Department of Physics, University of Fribourg, 1700 Fribourg, Switzerland
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, PR China
  - URPP Social Networks, Universität Zürich, Switzerland
- Yi-Cheng Zhang
  - Department of Physics, University of Fribourg, 1700 Fribourg, Switzerland
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, PR China
- Matúš Medo
  - Department of Physics, University of Fribourg, 1700 Fribourg, Switzerland
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, PR China
  - Department of Radiation Oncology, Inselspital, Bern University Hospital, and University of Bern, 3010 Bern, Switzerland
|
24
|
Schlaile MP, Zeman J, Mueller M. It's a match! Simulating compatibility-based learning in a network of networks. Journal of Evolutionary Economics 2018; 28:1111-1150. [PMID: 30613126 PMCID: PMC6302144 DOI: 10.1007/s00191-018-0579-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 06/09/2023]
Abstract
In this article, we develop a new way to capture knowledge diffusion and assimilation in innovation networks by means of an agent-based simulation model. The model incorporates three essential characteristics of knowledge that have not been covered entirely by previous diffusion models: the network character of knowledge, compatibility of new knowledge with already existing knowledge, and the fact that transmission of knowledge requires some form of attention. We employ a network-of-networks approach, where agents are located within an innovation network and each agent itself contains another network composed of knowledge units (KUs). Since social learning is a path-dependent process, in our model, KUs are exchanged among agents and integrated into their respective knowledge networks depending on the received KUs' compatibility with the currently focused ones. Thereby, we are also able to endogenize attributes such as absorptive capacity that have been treated as exogenous parameters in some of the previous diffusion models. We use our model to simulate and analyze various scenarios, including cases for different degrees of knowledge diversity and cognitive distance among agents as well as knowledge exploitation vs. exploration strategies. Here, the model is able to distinguish between two levels of knowledge diversity: heterogeneity within and between agents. Additionally, our simulation results give fresh impetus to debates about the interplay of innovation network structure and knowledge diffusion. In summary, our article proposes a novel way of modeling knowledge diffusion, thereby contributing to an advancement of the economics of innovation and knowledge.
Affiliation(s)
- Michael P. Schlaile
  - Institute of Economics (520i) and Institute of Economic and Business Education (560 D), University of Hohenheim, Wollgrasweg 23, 70593 Stuttgart, Germany
- Johannes Zeman
  - Institute for Computational Physics, University of Stuttgart, Allmandring 3, 70569 Stuttgart, Germany
- Matthias Mueller
  - Institute of Economics (520i), University of Hohenheim, Wollgrasweg 23, 70593 Stuttgart, Germany
|
25
|
Solé-Ribalta A, Tessone CJ, Mariani MS, Borge-Holthoefer J. Revealing in-block nestedness: Detection and benchmarking. Phys Rev E 2018; 97:062302. [PMID: 30011537 DOI: 10.1103/physreve.97.062302] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 01/17/2018] [Indexed: 06/08/2023]
Abstract
As new instances of nested organization, beyond ecological networks, are discovered, scholars are debating the coexistence of two apparently incompatible macroscale architectures: nestedness and modularity. The discussion is far from settled, mainly for two reasons. First, nestedness and modularity appear to emerge from two contradictory dynamics, cooperation and competition. Second, existing methods to assess the presence of nestedness and modularity are flawed when it comes to the evaluation of concurrently nested and modular structures. In this work, we tackle the latter problem, presenting the concept of in-block nestedness, a structural property determining to what extent a network is composed of blocks whose internal connectivity exhibits nestedness. We then put forward a set of optimization methods that allow us to identify such organization successfully, in synthetic networks and in a large number of real networks. These findings challenge our understanding of the topology of ecological and social systems, calling for new models to explain how such patterns emerge.
Affiliation(s)
- Albert Solé-Ribalta
  - Internet Interdisciplinary Institute (IN3), Universitat Oberta de Catalunya, 08860 Barcelona, Catalonia, Spain
- Manuel S Mariani
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, 610051 Chengdu, People's Republic of China
  - URPP Social Networks, Universität Zürich, CH-8050 Switzerland
  - Physics Department, Université de Fribourg, CH-1700 Switzerland
- Javier Borge-Holthoefer
  - Internet Interdisciplinary Institute (IN3), Universitat Oberta de Catalunya, 08860 Barcelona, Catalonia, Spain
  - Institute for Biocomputation and Physics of Complex Systems (BIFI), Universidad de Zaragoza, 50018 Zaragoza, Spain
|
26
|
DASSCAN: A Density and Adjacency Expansion-Based Spatial Structural Community Detection Algorithm for Networks. ISPRS International Journal of Geo-Information 2018. [DOI: 10.3390/ijgi7040159] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/17/2022]
|
27
|
Hashemian B, Massaro E, Bojic I, Murillo Arias J, Sobolevsky S, Ratti C. Socioeconomic characterization of regions through the lens of individual financial transactions. PLoS One 2017; 12:e0187031. [PMID: 29190724 PMCID: PMC5708635 DOI: 10.1371/journal.pone.0187031] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 05/30/2017] [Accepted: 10/12/2017] [Indexed: 11/25/2022] Open
Abstract
People are increasingly leaving digital traces of their daily activities through interacting with their digital environment. Among these traces, financial transactions are of paramount interest since they provide a panoramic view of human life through the lens of purchases, from food and clothes to sport and travel. Although many analyses have studied individual preferences based on credit card transactions, characterizing human behavior at larger scales remains largely unexplored. This is mainly due to the lack of models that can relate individual transactions to macro-socioeconomic indicators. Building these models not only provides nearly real-time information about the socioeconomic characteristics of regions, which is usually available only yearly or quarterly through official statistics, but can also reveal hidden social and economic structures that cannot be captured by official indicators. In this paper, we aim to elucidate how macro-socioeconomic patterns could be understood based on individual financial decisions. To this end, we reveal the underlying interconnection of the network of spending by leveraging anonymized individual credit/debit card transaction data, craft micro-socioeconomic indices that consist of various social and economic aspects of human life, and propose a machine learning framework to predict macro-socioeconomic indicators.
Affiliation(s)
- Behrooz Hashemian
  - Senseable City Lab, Massachusetts Institute of Technology, Cambridge, MA, United States of America
- Emanuele Massaro
  - Senseable City Lab, Massachusetts Institute of Technology, Cambridge, MA, United States of America
  - HERUS Lab, Institute of Environmental Engineering (ENAC), École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland
- Iva Bojic
  - Senseable City Lab, Massachusetts Institute of Technology, Cambridge, MA, United States of America
  - Singapore-MIT Alliance for Research and Technology, Singapore, Singapore
- Stanislav Sobolevsky
  - Senseable City Lab, Massachusetts Institute of Technology, Cambridge, MA, United States of America
  - Center for Urban Science and Progress, New York University, Brooklyn, NY, United States of America
  - Institute of Design and Urban Studies of the Saint-Petersburg National Research University of Information Technologies, Mechanics and Optics, Saint-Petersburg, Russia
- Carlo Ratti
  - Senseable City Lab, Massachusetts Institute of Technology, Cambridge, MA, United States of America
|
28
|
Fan H, Zhong Y, Zeng G. Overlapping community detection based on discrete biogeography optimization. Appl Intell 2017. [DOI: 10.1007/s10489-017-1073-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
29
|
From social community to spatio-temporal information: A new method for mobile data exploration. Journal of Visual Languages and Computing 2017. [DOI: 10.1016/j.jvlc.2017.05.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022]
|
30
|
Belyi A, Bojic I, Sobolevsky S, Sitko I, Hawelka B, Rudikova L, Kurbatski A, Ratti C. Global multi-layer network of human mobility. International Journal of Geographical Information Science: IJGIS 2017; 31:1381-1402. [PMID: 28553155 PMCID: PMC5426086 DOI: 10.1080/13658816.2017.1301455] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/16/2016] [Accepted: 02/28/2017] [Indexed: 05/26/2023]
Abstract
The recent availability of geo-localized data capturing individual human activity, together with statistical data on international migration, has opened up unprecedented opportunities for the study of global mobility. In this paper, we consider it from the perspective of a multi-layer complex network, built using a combination of three datasets: Twitter, Flickr and official migration data. Those datasets provide different but equally important insights into global mobility: while the first two highlight short-term visits of people from one country to another, the last one, migration, shows the long-term mobility perspective, when people relocate for good. The main purpose of the paper is to emphasize the importance of this multi-layer approach, which captures both aspects of human mobility at the same time. On the one hand, we show that although the general properties of different layers of the global mobility network are similar, there are important quantitative differences among them. On the other hand, we demonstrate that consideration of mobility from a multi-layer perspective can reveal important global spatial patterns in a way more consistent with those observed in other available relevant sources of international connections, in comparison to the spatial structure inferred from each network layer taken separately.
Affiliation(s)
- Alexander Belyi
  - SENSEable City Laboratory, SMART Centre, Singapore, Singapore
  - Faculty of Applied Mathematics and Computer Science, Belarusian State University, Minsk, Belarus
  - SENSEable City Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Iva Bojic
  - SENSEable City Laboratory, SMART Centre, Singapore, Singapore
  - SENSEable City Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Stanislav Sobolevsky
  - SENSEable City Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Center for Urban Science + Progress, New York University, Brooklyn, NY, USA
- Izabela Sitko
  - Department of Geoinformatics – Z_GIS, GISscience Doctoral College, University of Salzburg, Salzburg, Austria
- Bartosz Hawelka
  - Department of Geoinformatics – Z_GIS, GISscience Doctoral College, University of Salzburg, Salzburg, Austria
- Lada Rudikova
  - Department of Intelligent Software and Computer Systems, Yanka Kupala State University of Grodno, Grodno, Belarus
- Alexander Kurbatski
  - Faculty of Applied Mathematics and Computer Science, Belarusian State University, Minsk, Belarus
- Carlo Ratti
  - SENSEable City Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
|
31
|
|
32
|
Identifying and modeling the structural discontinuities of human interactions. Sci Rep 2017; 7:46677. [PMID: 28443647 PMCID: PMC5405407 DOI: 10.1038/srep46677] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 08/31/2016] [Accepted: 03/27/2017] [Indexed: 11/21/2022] Open
Abstract
The idea of a hierarchical spatial organization of society lies at the core of seminal theories in human geography that have strongly influenced our understanding of social organization. Along the same line, the recent availability of large-scale human mobility and communication data has offered novel quantitative insights hinting at a strong geographical confinement of human interactions within neighboring regions, extending to local levels within countries. However, models of human interaction largely ignore this effect. Here, we analyze several country-wide networks of telephone calls (both mobile and landline) and in either case uncover a systematic decrease of communication induced by borders, which we identify as the missing variable in state-of-the-art models. Using this empirical evidence, we propose an alternative modeling framework that naturally stylizes the damping effect of borders. We show that this new notion substantially improves the predictive power of widely used interaction models. This increases our ability to understand, model and predict social activities and to plan the development of infrastructures across multiple scales.
|
33
|
Dash Nelson G, Rae A. An Economic Geography of the United States: From Commutes to Megaregions. PLoS One 2016; 11:e0166083. [PMID: 27902707 PMCID: PMC5130203 DOI: 10.1371/journal.pone.0166083] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.9] [Received: 07/02/2016] [Accepted: 10/22/2016] [Indexed: 11/18/2022] Open
Abstract
The emergence in the United States of large-scale "megaregions" centered on major metropolitan areas is a phenomenon often taken for granted in both scholarly studies and popular accounts of contemporary economic geography. This paper uses a data set of more than 4,000,000 commuter flows as the basis for an empirical approach to the identification of such megaregions. We compare a method which uses a visual heuristic for understanding areal aggregation to a method which uses a computational partitioning algorithm, and we reflect upon the strengths and limitations of both. We discuss how choices about input parameters and scale of analysis can lead to different results, and stress the importance of comparing computational results with "common sense" interpretations of geographic coherence. The results provide a new perspective on the functional economic geography of the United States from a megaregion perspective, and shed light on the old geographic problem of the division of space into areal units.
Affiliation(s)
- Garrett Dash Nelson
  - Department of Geography and Society of Fellows, Dartmouth College, Hanover, New Hampshire, United States of America
- Alasdair Rae
  - Department of Urban Studies and Planning, University of Sheffield, Sheffield, United Kingdom
|
34
|
|
35
|
|
36
|
Alvarez AJ, Sanz-Rodríguez CE, Cabrera JL. Weighting dissimilarities to detect communities in networks. Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences 2015; 373:rsta.2015.0108. [PMID: 26527808 DOI: 10.1098/rsta.2015.0108] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Accepted: 08/19/2015] [Indexed: 06/05/2023]
Abstract
Many complex systems can be described as networks exhibiting inner organization as communities of nodes. The identification of communities is key to understanding community-based functionality. We propose a family of measures based on the weighted sum of two dissimilarity quantifiers that facilitates efficient classification of communities by tuning the quantifiers' relative weight to the network's particularities. Additionally, two new dissimilarities are introduced and incorporated in our analysis. The effectiveness of our approach is tested by examining Zachary's Karate Club network and the Caenorhabditis elegans reactions network. The analysis reveals the method's classification power, as confirmed by the efficient detection of intrapathway metabolic functions in C. elegans.
Affiliation(s)
- Alejandro J Alvarez
  - Stochastic Dynamics Laboratory, Center for Physics, Venezuelan Institute for Scientific Research, Caracas 1020-A, Venezuela
  - Departamento de Física, FCFM, Universidad de Chile, Santiago, Chile
- Carlos E Sanz-Rodríguez
  - Stochastic Dynamics Laboratory, Center for Physics, Venezuelan Institute for Scientific Research, Caracas 1020-A, Venezuela
- Juan Luis Cabrera
  - Stochastic Dynamics Laboratory, Center for Physics, Venezuelan Institute for Scientific Research, Caracas 1020-A, Venezuela
|
37
|
Hawelka B, Sitko I, Beinat E, Sobolevsky S, Kazakopoulos P, Ratti C. Geo-located Twitter as proxy for global mobility patterns. Cartography and Geographic Information Science 2014; 41:260-271. [PMID: 27019645 PMCID: PMC4786829 DOI: 10.1080/15230406.2014.890072] [Citation(s) in RCA: 146] [Impact Index Per Article: 13.3] [Received: 10/17/2013] [Accepted: 12/30/2013] [Indexed: 05/05/2023]
Abstract
The pervasive presence of location-sharing services has made it possible for researchers to gain unprecedented access to direct records of human activity in space and time. This article analyses geo-located Twitter messages in order to uncover global patterns of human mobility. Based on a dataset of almost a billion tweets recorded in 2012, we estimate the volume of international travelers by country of residence. Mobility profiles of different nations were examined based on such characteristics as mobility rate, radius of gyration, diversity of destinations, and inflow-outflow balance. Temporal patterns disclose the universally valid seasons of increased international mobility and the particular character of international travels of different nations. Our analysis of the community structure of the Twitter mobility network reveals spatially cohesive regions that follow the regional division of the world. We validate our results using global tourism statistics and mobility models provided by other authors and argue that Twitter is exceptionally useful for understanding and quantifying global mobility patterns.
Affiliation(s)
- Bartosz Hawelka
  - Department of Geoinformatics – Z_GIS, GISscience Doctoral College, University of Salzburg, Salzburg, Austria
  - SENSEable City Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Izabela Sitko
  - Department of Geoinformatics – Z_GIS, GISscience Doctoral College, University of Salzburg, Salzburg, Austria
  - SENSEable City Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Euro Beinat
  - Department of Geoinformatics – Z_GIS, GISscience Doctoral College, University of Salzburg, Salzburg, Austria
- Stanislav Sobolevsky
  - SENSEable City Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Pavlos Kazakopoulos
  - Department of Geoinformatics – Z_GIS, GISscience Doctoral College, University of Salzburg, Salzburg, Austria
- Carlo Ratti
  - SENSEable City Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
|
38
|
Sobolevsky S, Szell M, Campari R, Couronné T, Smoreda Z, Ratti C. Delineating geographical regions with networks of human interactions in an extensive set of countries. PLoS One 2013; 8:e81707. [PMID: 24367490 PMCID: PMC3867326 DOI: 10.1371/journal.pone.0081707] [Citation(s) in RCA: 92] [Impact Index Per Article: 7.7] [Received: 08/23/2013] [Accepted: 10/25/2013] [Indexed: 11/29/2022] Open
Abstract
Large-scale networks of human interaction, in particular country-wide telephone call networks, can be used to redraw geographical maps by applying algorithms of topological community detection. The geographic projections of the emerging areas in a few recent studies on single regions have been suggested to share two distinct properties: first, they are cohesive, and second, they tend to closely follow socio-economic boundaries and are similar to existing political regions in size and number. Here we use an extended set of countries and clustering indices to quantify overlaps, providing ample additional evidence for these observations using phone data from countries of various scales across Europe, Asia, and Africa: France, the UK, Italy, Belgium, Portugal, Saudi Arabia, and Ivory Coast. In our analysis we use the known approach of partitioning country-wide networks, and an additional iterative partitioning of each of the first level communities into sub-communities, revealing that cohesiveness and matching of official regions can also be observed on a second level if spatial resolution of the data is high enough. The method has possible policy implications on the definition of the borderlines and sizes of administrative regions.
Affiliation(s)
- Stanislav Sobolevsky
  - Senseable City Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Michael Szell
  - Senseable City Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Riccardo Campari
  - Senseable City Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Thomas Couronné
  - Sociology and Economics of Networks and Services Department, Orange Labs, Paris, France
- Zbigniew Smoreda
  - Sociology and Economics of Networks and Services Department, Orange Labs, Paris, France
- Carlo Ratti
  - Senseable City Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
|