201
Rahiminejad S, Maurya MR, Subramaniam S. Topological and functional comparison of community detection algorithms in biological networks. BMC Bioinformatics 2019; 20:212. [PMID: 31029085 PMCID: PMC6487005 DOI: 10.1186/s12859-019-2746-0] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 06/22/2018] [Accepted: 03/18/2019] [Indexed: 11/28/2022] Open
Abstract
Background: Community detection algorithms are fundamental tools to uncover important features in networks. There are several studies focused on social networks but only a few deal with biological networks. Directly or indirectly, most of the methods maximize modularity, a measure of the density of links within communities as compared to links between communities. Results: Here we analyze six different community detection algorithms, namely, Combo, Conclude, Fast Greedy, Leading Eigen, Louvain and Spinglass, on two important biological networks to find their communities and evaluate the results in terms of topological and functional features through Kyoto Encyclopedia of Genes and Genomes pathway and Gene Ontology term enrichment analysis. At a high level, the main assessment criteria are 1) appropriate community size (neither too small nor too large), 2) representation within the community of only one or two broad biological functions, 3) assignment of most genes belonging to a pathway to only one or two communities, and 4) performance speed. The first network in this study is a network of Protein-Protein Interactions (PPI) in Saccharomyces cerevisiae (Yeast) with 6532 nodes and 229,696 edges and the second is a network of PPI in Homo sapiens (Human) with 20,644 nodes and 241,008 edges. All six methods perform well, i.e., find reasonably sized and biologically interpretable communities, for the Yeast PPI network, but the Conclude method does not find reasonably sized communities for the Human PPI network. The Louvain method maximizes modularity by using an agglomerative approach, and is the fastest method for community detection. For the Yeast PPI network, the results of the Spinglass method are most similar to the results of the Louvain method with regard to the size of communities and core pathways they identify, whereas for the Human PPI network, the Combo and Spinglass methods yield the most similar results, with Louvain being the next closest.
Conclusions: For Yeast and Human PPI networks, the Louvain method is likely the best method to find communities in terms of detecting known core pathways in a reasonable time. Electronic supplementary material: The online version of this article (10.1186/s12859-019-2746-0) contains supplementary material, which is available to authorized users.
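The abstract above describes modularity as the density of links within communities compared to links between communities. As a rough illustration only, the sketch below computes the standard Newman-Girvan modularity for an undirected, unweighted graph without self-loops. The function name `modularity`, the toy edge list, and the community labels are assumptions made for this example; none of them comes from the paper, and the six algorithms compared there are not reproduced.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity Q for an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once, no self-loops.
    community_of: dict mapping every node to a community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # sum of node degrees in the community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # Q = sum over communities of (fraction of internal edges)
    #     minus (expected fraction under random rewiring with the same degrees)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_blocks = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_block = {n: "a" for n in range(6)}

q_two = modularity(edges, two_blocks)
q_one = modularity(edges, one_block)
print(round(q_two, 3), round(q_one, 3))
```

Splitting the toy graph into its two triangles scores higher than lumping all nodes together, which is the behavior a modularity-maximizing method exploits.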
Affiliation(s)
- Sara Rahiminejad: Departments of Bioengineering and Mechanical and Aerospace Engineering, University of California, San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
- Mano R Maurya: Department of Bioengineering and San Diego Supercomputer Center, University of California, San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
- Shankar Subramaniam: Department of Bioengineering, Departments of Computer Science and Engineering, Cellular and Molecular Medicine, and the Graduate Program in Bioinformatics, University of California, San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA
202
Ambrosini R, Corti M, Franzetti A, Caprioli M, Rubolini D, Motta VM, Costanzo A, Saino N, Gandolfi I. Cloacal microbiomes and ecology of individual barn swallows. FEMS Microbiol Ecol 2019; 95:5479878. [DOI: 10.1093/femsec/fiz061] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 01/14/2019] [Accepted: 04/25/2019] [Indexed: 01/08/2023] Open
Affiliation(s)
- Roberto Ambrosini: Department of Environmental Science and Policy, University of Milan, via Celoria 26, 20133 Milano, Italy
- Margherita Corti: Department of Environmental Science and Policy, University of Milan, via Celoria 26, 20133 Milano, Italy
- Andrea Franzetti: Department of Earth and Environmental Sciences, University of Milano-Bicocca, Piazza della Scienza 1, 20126 Milano, Italy
- Manuela Caprioli: Department of Environmental Science and Policy, University of Milan, via Celoria 26, 20133 Milano, Italy
- Diego Rubolini: Department of Environmental Science and Policy, University of Milan, via Celoria 26, 20133 Milano, Italy
- Veronica Maria Motta: Department of Earth and Environmental Sciences, University of Milano-Bicocca, Piazza della Scienza 1, 20126 Milano, Italy
- Alessandra Costanzo: Department of Environmental Science and Policy, University of Milan, via Celoria 26, 20133 Milano, Italy
- Nicola Saino: Department of Environmental Science and Policy, University of Milan, via Celoria 26, 20133 Milano, Italy
- Isabella Gandolfi: Department of Earth and Environmental Sciences, University of Milano-Bicocca, Piazza della Scienza 1, 20126 Milano, Italy
203
Funke T, Becker T. Stochastic block models: A comparison of variants and inference methods. PLoS One 2019; 14:e0215296. [PMID: 31013290 PMCID: PMC6478296 DOI: 10.1371/journal.pone.0215296] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 09/28/2018] [Accepted: 03/30/2019] [Indexed: 11/19/2022] Open
Abstract
Finding communities in complex networks is a challenging task, and one promising approach is the Stochastic Block Model (SBM). However, influences from various fields have led to a diversity of variants and inference methods. Therefore, a comparison of the existing techniques and an independent analysis of their capabilities and weaknesses is needed. As a first step, we review the development of different SBM variants such as the degree-corrected SBM of Karrer and Newman or Peixoto's hierarchical SBM. Besides stating all these variants in a uniform notation, we show the reasons for their development. Knowing the variants, we discuss a variety of approaches to infer the optimal partition, such as the Metropolis-Hastings algorithm. We perform our analysis based on our extension of the Girvan-Newman test and the Lancichinetti-Fortunato-Radicchi benchmark as well as a selection of real-world networks. Using these results, we give some guidance on the challenging task of selecting an inference method and SBM variant. In addition, we give a simple heuristic to determine the number of steps for the Metropolis-Hastings algorithms, which lack a usual stop criterion. With our comparison, we hope to guide researchers in the field of SBM and highlight the problems of existing techniques to focus future research. Finally, by making our code freely available, we want to promote faster development, integration and exchange of new ideas.
Affiliation(s)
- Thorben Funke: Production Systems and Logistic Systems, BIBA - Bremer Institut für Produktion und Logistik GmbH at the University of Bremen, Bremen, Bremen, Germany; Faculty of Production Engineering, University of Bremen, Bremen, Bremen, Germany
- Till Becker: Production Systems and Logistic Systems, BIBA - Bremer Institut für Produktion und Logistik GmbH at the University of Bremen, Bremen, Bremen, Germany; Faculty of Business Studies, University of Applied Sciences Emden/Leer, Emden, Lower Saxony, Germany
204
Abstract
AIMS: The Resilience Scale for Adults (RSA) is a questionnaire that measures protective factors of mental health. The aim of this paper is to perform a network analysis of the RSA in a dataset composed of 675 French-speaking Belgian university students, to identify potential targets for intervention to improve protective factors in individuals. METHODS: We estimated a network structure for the 33-item questionnaire and for the six domains of resilience: perception of self, planned future, social competence, structured style, family cohesion and social resources. Node predictability (shared variance with surrounding nodes in the network) was used to assess the connectivity of items. An exploratory graph analysis (EGA) was performed to detect communities in the network. Because the number of communities detected differed from the original number of factors proposed in the scale, we estimated a new network with the resulting structure and verified the validity of the new construct. We provide the anonymised dataset and code in external online materials (10.17632/64db36w8kf.2) to ensure complete reproducibility of the results. RESULTS: The network composed of items from the RSA is overall positively connected, with the strongest connections arising among items from the same domain. The domain network reports several connections, both positive and negative. The EGA reported the existence of four communities that we propose as an additional network structure. Node predictability estimates show that connectedness varies among the items and domains of the RSA. CONCLUSIONS: Network analysis is a useful tool to explore resilience and identify targets for clinical intervention. In this study, the four domains acting as components of the additional four-domain network structure may be potential targets to improve an individual's resilience. Further studies may endeavour to replicate our findings in different samples.
205
McCaig D, Elliott MT, Siew CS, Walasek L, Meyer C. Profiling Commenters on Mental Health-Related Online Forums: A Methodological Example Focusing on Eating Disorder-Related Commenters. JMIR Ment Health 2019; 6:e12555. [PMID: 31008715 PMCID: PMC6658234 DOI: 10.2196/12555] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 10/19/2018] [Revised: 01/16/2019] [Accepted: 01/30/2019] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND: Understanding the characteristics of commenters on mental health-related online forums is vital for the development of effective psychological interventions in these communities. The way in which commenters interact can enhance our understanding of their characteristics. OBJECTIVE: Using eating disorder-related (EDR) forums as an example, this study detailed a methodology that aimed to determine subtypes of mental health-related forums and profile their commenters based on the other forums to which they contributed. METHODS: The researchers identified all public EDR forums (with ≥500 contributing commenters between March 2017 and February 2018) on a large Web-based discussion platform (Reddit). A mixed-methods approach comprising network analysis with community detection, text mining, and manual review identified subtypes of EDR forums. For each subtype, another network analysis with community detection was conducted using the EDR forum commenter overlap between 50 forums on which the commenters also commented. The topics of forums in each detected community were then manually reviewed to identify the shared interests of each subtype of EDR forum commenters. RESULTS: Six subtypes of EDR forums were identified, to which 14,024 commenters had contributed. The results focus on 2 subtypes (pro-eating disorder and thinspiration) and communities of commenters within both subtypes. Within the pro-eating disorder subtype, 3 communities of commenters were detected that related to (1) the body and eating, (2) mental health, and (3) women, appearance, and mixed topics. With regard to the thinspiration group, 78.17% (849/1086) of commenters had also commented on pornographic forums and 16.66% (181/1086) had contributed to pro-eating disorder forums. CONCLUSIONS: The article exemplifies a methodology that provides insight into subtypes of mental health-related forums and the characteristics of their commenters. The findings have implications for future research and Web-based psychological interventions. With the publicly available data and code provided, researchers can easily reproduce the analyses or utilize the methodology to investigate other mental health-related forums.
Affiliation(s)
- Duncan McCaig: Warwick Manufacturing Group, University of Warwick, Coventry, United Kingdom
- Mark T Elliott: Warwick Manufacturing Group, University of Warwick, Coventry, United Kingdom
- Cynthia Sq Siew: Department of Psychology, University of Warwick, Coventry, United Kingdom; Department of Psychology, National University of Singapore, Singapore, Singapore
- Lukasz Walasek: Department of Psychology, University of Warwick, Coventry, United Kingdom
- Caroline Meyer: Warwick Manufacturing Group, University of Warwick, Coventry, United Kingdom; Warwick Medical School, University of Warwick, Coventry, United Kingdom; Coventry and Warwickshire NHS Partnership Trust, Coventry, United Kingdom
206
Microbial and metabolic succession on common building materials under high humidity conditions. Nat Commun 2019; 10:1767. [PMID: 30992445 PMCID: PMC6467912 DOI: 10.1038/s41467-019-09764-z] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 09/21/2018] [Accepted: 03/27/2019] [Indexed: 02/06/2023] Open
Abstract
Despite considerable efforts to characterize the microbial ecology of the built environment, the metabolic mechanisms underpinning microbial colonization and successional dynamics remain unclear, particularly at high moisture conditions. Here, we applied bacterial/viral particle counting, qPCR, amplicon sequencing of the genes encoding 16S and ITS rRNA, and metabolomics to longitudinally characterize the ecological dynamics of four common building materials maintained at high humidity. We varied the natural inoculum provided to each material and wet half of the samples to simulate a potable water leak. Wetted materials had higher growth rates and lower alpha diversity compared to non-wetted materials, and wetting described the majority of the variance in bacterial, fungal, and metabolite structure. Inoculation location was weakly associated with bacterial and fungal beta diversity. Material type influenced bacterial and viral particle abundance and bacterial and metabolic (but not fungal) diversity. Metabolites indicative of microbial activity were identified, and they too differed by material. Microbes inhabit built environments and could contribute to degradation of surfaces especially in damp conditions. Here the authors explore how communities of microbes and their metabolites affect four types of built surfaces under varying environmental conditions.
207
Bruch EE, Newman MEJ. Structure of Online Dating Markets in U.S. Cities. SOCIOLOGICAL SCIENCE 2019; 6:219-234. [PMID: 31363485 PMCID: PMC6666423 DOI: 10.15195/v6.a9] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 06/10/2023]
Abstract
We study the structure of heterosexual dating markets in the United States through an analysis of the interactions of several million users of a large online dating website, applying recently developed network analysis methods to the pattern of messages exchanged among users. Our analysis shows that the strongest driver of romantic interaction at the national level is simple geographic proximity, but at the local level, other demographic factors come into play. We find that dating markets in each city are partitioned into submarkets along lines of age and ethnicity. Sex ratio varies widely between submarkets, with younger submarkets having more men and fewer women than older ones. There is also a noticeable tendency for minorities, especially women, to be younger than the average in older submarkets, and our analysis reveals how this kind of racial stratification arises through the messaging decisions of both men and women. Our study illustrates how network techniques applied to online interactions can reveal the aggregate effects of individual behavior on social structure.
Affiliation(s)
- Elizabeth E Bruch: Department of Sociology and Center for the Study of Complex Systems, University of Michigan, and Santa Fe Institute
- M E J Newman: Department of Physics and Center for the Study of Complex Systems, University of Michigan, and Santa Fe Institute
208
Laib M, Guignard F, Kanevski M, Telesca L. Community detection analysis in wind speed-monitoring systems using mutual information-based complex network. CHAOS (WOODBURY, N.Y.) 2019; 29:043107. [PMID: 31042944 DOI: 10.1063/1.5054724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2018] [Accepted: 03/14/2019] [Indexed: 06/09/2023]
Abstract
A mutual information-based weighted network representation of a wide wind speed-monitoring system in Switzerland was analyzed in order to detect communities. Two communities have been revealed, corresponding to two clusters of sensors situated, respectively, on the Alps and on the Jura-Plateau that define the two major climatic zones of Switzerland. The silhouette measure is used to evaluate the obtained communities and confirm the membership of each sensor to its cluster.
Affiliation(s)
- Mohamed Laib: IDYST, Faculty of Geosciences and Environment, University of Lausanne, 1015 Lausanne, Switzerland
- Fabian Guignard: IDYST, Faculty of Geosciences and Environment, University of Lausanne, 1015 Lausanne, Switzerland
- Mikhail Kanevski: IDYST, Faculty of Geosciences and Environment, University of Lausanne, 1015 Lausanne, Switzerland
- Luciano Telesca: CNR, Istituto di Metodologie per l'Analisi Ambientale, 85050 Tito, PZ, Italy
209
Traag VA, Waltman L, van Eck NJ. From Louvain to Leiden: guaranteeing well-connected communities. Sci Rep 2019; 9:5233. [PMID: 30914743 PMCID: PMC6435756 DOI: 10.1038/s41598-019-41695-z] [Citation(s) in RCA: 1831] [Impact Index Per Article: 305.2] [Received: 11/20/2018] [Accepted: 03/11/2019] [Indexed: 11/14/2022] Open
Abstract
Community detection is often used to understand the structure of large and complex networks. One of the most popular algorithms for uncovering community structure is the so-called Louvain algorithm. We show that this algorithm has a major defect that largely went unnoticed until now: the Louvain algorithm may yield arbitrarily badly connected communities. In the worst case, communities may even be disconnected, especially when running the algorithm iteratively. In our experimental analysis, we observe that up to 25% of the communities are badly connected and up to 16% are disconnected. To address this problem, we introduce the Leiden algorithm. We prove that the Leiden algorithm yields communities that are guaranteed to be connected. In addition, we prove that, when the Leiden algorithm is applied iteratively, it converges to a partition in which all subsets of all communities are locally optimally assigned. Furthermore, by relying on a fast local move approach, the Leiden algorithm runs faster than the Louvain algorithm. We demonstrate the performance of the Leiden algorithm for several benchmark and real-world networks. We find that the Leiden algorithm is faster than the Louvain algorithm and uncovers better partitions, in addition to providing explicit guarantees.
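The defect described in the abstract above, communities that are internally disconnected, can be checked for on any partition. The sketch below is an illustration under stated assumptions rather than the Leiden algorithm or the authors' code: it flags every community whose members cannot all reach each other through edges that stay inside the community. The helper name `disconnected_communities` and the toy path graph are hypothetical, and community labels are assumed to be sortable (strings here).

```python
from collections import defaultdict, deque


def disconnected_communities(edges, community_of):
    """Return the sorted labels of communities that are not internally connected.

    edges: list of undirected (u, v) pairs.
    community_of: dict mapping every node to a community label.
    """
    # Keep only edges whose endpoints share a community.
    adj = defaultdict(set)
    for u, v in edges:
        if community_of[u] == community_of[v]:
            adj[u].add(v)
            adj[v].add(u)

    members = defaultdict(set)
    for node, label in community_of.items():
        members[label].add(node)

    bad = []
    for label, nodes in members.items():
        # Breadth-first search restricted to intra-community edges.
        start = next(iter(nodes))
        seen = {start}
        queue = deque([start])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    queue.append(y)
        if seen != nodes:
            bad.append(label)
    return sorted(bad)


# Hypothetical path graph 0-1-2-3-4.
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
good = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b"}
# Node 0 is in "a" but only reaches the rest of "a" through node 1 in "b".
bad = {0: "a", 1: "b", 2: "a", 3: "a", 4: "a"}

print(disconnected_communities(path_edges, good))
print(disconnected_communities(path_edges, bad))
```

A check like this can be run on the output of any community detection method; it only detects the problem and does nothing to repair the partition.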
Affiliation(s)
- V A Traag: Centre for Science and Technology Studies, Leiden University, Leiden, The Netherlands
- L Waltman: Centre for Science and Technology Studies, Leiden University, Leiden, The Netherlands
- N J van Eck: Centre for Science and Technology Studies, Leiden University, Leiden, The Netherlands
210
McCoy TH. Mapping the Delirium Literature Through Probabilistic Topic Modeling and Network Analysis: A Computational Scoping Review. PSYCHOSOMATICS 2019; 60:105-120. [PMID: 30686485 DOI: 10.1016/j.psym.2018.12.003] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 10/29/2018] [Revised: 12/09/2018] [Accepted: 12/13/2018] [Indexed: 01/21/2023]
Abstract
BACKGROUND: Delirium is an acute confusional state, associated with morbidity and mortality in diverse medically-ill populations. Delirium is recognized, through both professional competencies and instructional materials, as a core topic in consultation psychiatry. OBJECTIVE: To conduct a computational scoping review of the delirium literature to identify the overall contours of this literature and its evolution over time. METHODS: Algorithmic analysis of all research articles on delirium indexed in MEDLINE between 1995 and 2015 using network analysis of citation Medical Subject Headings (MeSH) tags and probabilistic topic modeling of article abstracts. RESULTS: The delirium corpus included 3591 articles in 874 unique journals, of which 95 were primarily psychiatric. The annual delirium publication volume increased from 40 in 1995 to 420 in 2015 and grew as a proportion of total indexed publications from 8.9 to 38.6 per 100,000. The psychiatric journals published 720 of the delirium publications. Articles on treatment of delirium (806) outnumber articles on prevention of delirium (432). Abstract topic modeling and Medical Subject Headings graph community analysis identified similar genres in the delirium literature, including delirium in geriatric, critically ill, palliative care, and postsurgical patients, as well as diagnostic criteria or scales and clinical risk factors. The genres identified by topic modeling and community analysis were distributed unevenly between psychiatric journals and nonpsychiatric journals. CONCLUSION: The delirium literature is large and growing. Much of this growth is outside of psychiatric journals. Subtopics of the delirium literature can be algorithmically identified, and these subtopics are distributed unevenly across psychiatric journals.
Affiliation(s)
- Thomas H McCoy: Center for Quantitative Health, Department of Psychiatry and Department of Medicine, Massachusetts General Hospital, Boston, MA
211
Uppal K, Ma C, Go YM, Jones DP, Wren J. xMWAS: a data-driven integration and differential network analysis tool. Bioinformatics 2019; 34:701-702. [PMID: 29069296 DOI: 10.1093/bioinformatics/btx656] [Citation(s) in RCA: 130] [Impact Index Per Article: 21.7] [Received: 03/28/2017] [Accepted: 10/17/2017] [Indexed: 12/13/2022] Open
Abstract
Summary: Integrative omics is a central component of most systems biology studies. Computational methods are required for extracting meaningful relationships across different omics layers. Various tools have been developed to facilitate integration of paired heterogeneous omics data; however, most existing tools allow integration of only two omics datasets. Furthermore, existing data integration tools do not incorporate additional steps of identifying sub-networks or communities of highly connected entities and evaluating the topology of the integrative network under different conditions. Here we present xMWAS, software for data integration, network visualization, clustering, and differential network analysis of data from biochemical and phenotypic assays, and two or more omics platforms. Availability and implementation: https://kuppal.shinyapps.io/xmwas (Online) and https://github.com/kuppal2/xMWAS/ (R). Contact: kuppal2@emory.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Karan Uppal: Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, GA 30322, USA
- Chunyu Ma: Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, GA 30322, USA
- Young-Mi Go: Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, GA 30322, USA
- Dean P Jones: Clinical Biomarkers Laboratory, Department of Medicine, Emory University, Atlanta, GA 30322, USA
212
von Stillfried D, Ermakova T, Ng F, Czihal T. [Patient-sharing networks : New approaches in the analysis and transformation of geographic variation in healthcare]. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz 2019; 60:1356-1371. [PMID: 29064035 DOI: 10.1007/s00103-017-2641-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 11/24/2022]
Abstract
The analysis of geographic variations has spurred arguments that area of residence determines access to and quality of healthcare. In this paper we argue that unwarranted geographic variations can be traced back to actions of individual patients and their healthcare providers (doctors, hospitals). These actors interact in a complicated web of shared responsibilities. Designing effective interventions to reduce unwarranted geographic variations may therefore depend on methods to identify these interactions and communities of providers with a shared accountability. In the US, Canada, and Germany, routine data have been used to identify self-organized informal or virtual networks of physicians and hospitals, so-called patient-sharing networks (PSNs). This is an emerging field of analysis. We attempt to provide a brief report on the state of work in progress. It can be shown that variation between PSNs in a given area is effectively greater than variation between regions. While this suggests that reducing unwarranted variation needs to start at the level of PSN, methods to identify PSNs still vary widely. We compare epidemiological approaches and approaches based on graph theory and social network analysis. We also present some preliminary findings of exploratory analyses based on comprehensive claims data of physician practices in Germany. Defining PSNs based on usual provider relationships helps to create distinctive patient populations while PSNs may not be mutually exclusive. Social network analysis, on the other hand, appears better equipped to differentiate between provider communities with stronger and weaker ties; it does not yield distinctive patient populations. To achieve accountability and to support change management, analytic methods to describe PSNs still need refinement. There are first projects in Germany which use PSNs as an intervention platform in order to achieve improved cooperation and reduce unwarranted variation in their care processes.
Affiliation(s)
- Dominik von Stillfried: Zentralinstitut für die kassenärztliche Versorgung, Salzufer 8, 10587, Berlin, Deutschland
- Tatiana Ermakova: Zentralinstitut für die kassenärztliche Versorgung, Salzufer 8, 10587, Berlin, Deutschland
- Frank Ng: Zentralinstitut für die kassenärztliche Versorgung, Salzufer 8, 10587, Berlin, Deutschland
- Thomas Czihal: Zentralinstitut für die kassenärztliche Versorgung, Salzufer 8, 10587, Berlin, Deutschland
213
Papachristou N, Barnaghi P, Cooper B, Kober KM, Maguire R, Paul SM, Hammer M, Wright F, Armes J, Furlong EP, McCann L, Conley YP, Patiraki E, Katsaragakis S, Levine JD, Miaskowski C. Network Analysis of the Multidimensional Symptom Experience of Oncology. Sci Rep 2019; 9:2258. [PMID: 30783135 PMCID: PMC6381090 DOI: 10.1038/s41598-018-36973-1] [Citation(s) in RCA: 70] [Impact Index Per Article: 11.7] [Received: 06/12/2018] [Accepted: 11/22/2018] [Indexed: 02/07/2023] Open
Abstract
Oncology patients undergoing cancer treatment experience an average of fifteen unrelieved symptoms that are highly variable in both their severity and distress. Recent advances in Network Analysis (NA) provide a novel approach to gain insights into the complex nature of co-occurring symptoms and symptom clusters and identify core symptoms. We present findings from the first study that used NA to examine the relationships among 38 common symptoms in a large sample of oncology patients undergoing chemotherapy. Using two different models of Pairwise Markov Random Fields (PMRF), we examined the nature and structure of interactions for three different dimensions of patients’ symptom experience (i.e., occurrence, severity, distress). Findings from this study provide the first direct evidence that the connections between and among symptoms differ depending on the symptom dimension used to create the network. Based on an evaluation of the centrality indices, nausea appears to be a structurally important node in all three networks. Our findings can be used to guide the development of symptom management interventions based on the identification of core symptoms and symptom clusters within a network.
Affiliation(s)
- Nikolaos Papachristou: Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, UK
- Payam Barnaghi: Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, UK
- Marilyn Hammer: Department of Nursing, Mount Sinai Medical Center, New York, USA
- Fay Wright: School of Nursing, Yale University, New Haven, USA
- Jo Armes: Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, UK; School of Health Sciences, University of Surrey, Guildford, UK
- Eileen P Furlong: School of Nursing, Midwifery and Health Systems, University College Dublin, Dublin, Ireland
- Lisa McCann: University of Strathclyde, Glasgow, Scotland
- Yvette P Conley: School of Nursing, University of Pittsburgh, Pittsburgh, USA
214
Ma J, Wang J, Ghoraie LS, Men X, Haibe-Kains B, Dai P. A Comparative Study of Cluster Detection Algorithms in Protein-Protein Interaction for Drug Target Discovery and Drug Repurposing. Front Pharmacol 2019; 10:109. [PMID: 30837876 PMCID: PMC6389713 DOI: 10.3389/fphar.2019.00109] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 10/03/2018] [Accepted: 01/28/2019] [Indexed: 12/29/2022] Open
Abstract
The interactions between drugs and their target proteins induce altered expression of genes involved in complex intracellular networks. The properties of these functional network modules are critical for the identification of drug targets, for drug repurposing, and for understanding the underlying mode of action of the drug. The topological modules generated by a computational approach are defined as functional clusters. However, the functions inferred for these topological modules extracted from a large-scale molecular interaction network, such as a protein–protein interaction (PPI) network, could differ depending on the cluster detection algorithm used. Moreover, the dynamic gene expression profiles among tissues or cell types cause differential functional interaction patterns between the molecular components. Thus, the connections in the PPI network should be modified by the transcriptomic landscape of specific cell lines before producing topological clusters. Here, we systematically investigated the clusters of a cell-based PPI network by using four cluster detection algorithms. We subsequently compared the performance of these algorithms for target gene prediction, which integrates gene perturbation data with the cell-based PPI network using two drug target prioritization methods, shortest path and diffusion correlation. In addition, we validated the proportion of perturbed genes in clusters by finding candidate anti-breast cancer drugs and confirming our predictions using literature evidence and cases in ClinicalTrials.gov. Our results indicate that the Walktrap (CW) clustering algorithm achieved the best performance overall in our comparative study.
Affiliation(s)
- Jun Ma
  - National Engineering Research Center for Miniaturized Detection Systems, Northwest University, Xi'an, China; Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Jenny Wang
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Xin Men
  - Shaanxi Microbiology Institute, Xi'an, China
- Penggao Dai
  - National Engineering Research Center for Miniaturized Detection Systems, Northwest University, Xi'an, China
|
215
|
An approach based on mixed hierarchical clustering and optimization for graph analysis in social media network: toward globally hierarchical community structure. Knowl Inf Syst 2019. [DOI: 10.1007/s10115-019-01329-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/27/2022]
|
216
|
Kaalia R, Rajapakse JC. Functional homogeneity and specificity of topological modules in human proteome. BMC Bioinformatics 2019; 19:553. [PMID: 30717667 PMCID: PMC7394330 DOI: 10.1186/s12859-018-2549-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/24/2018] [Accepted: 11/30/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Functional modules in protein-protein interaction networks (PPIN) are defined by maximal sets of functionally associated proteins and are vital to understanding cellular mechanisms and identifying disease-associated proteins. Topological modules of the human proteome have been shown to be related to functional modules of PPIN. However, the effects of the weights of interactions between protein pairs and the integration of physical (direct) interactions with functional (indirect expression-based) interactions have not been investigated in the detection of functional modules of the human proteome. RESULTS We investigated functional homogeneity and specificity of topological modules of the human proteome and validated them with known biological and disease pathways. Specifically, we determined the effects on functional homogeneity and heterogeneity of topological modules (i) with both physical and functional protein-protein interactions; and (ii) with incorporation of functional similarities between proteins as weights of interactions. With functional enrichment analyses and a novel measure for functional specificity, we evaluated functional relevance and specificity of topological modules of the human proteome. CONCLUSIONS The topological modules ranked using specificity scores show high enrichment with gene sets of known functions. Physical interactions in PPIN contribute to high specificity of the topological modules of the human proteome whereas functional interactions contribute to high homogeneity of the modules. Weighted networks result in a larger number of topological modules but do not affect their functional propensity. Modules of the human proteome are more homogeneous for molecular functions than for biological processes.
Affiliation(s)
- Rama Kaalia
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
- Jagath C. Rajapakse
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
|
217
|
Williams N, Arnulfo G, Wang SH, Nobili L, Palva S, Palva JM. Comparison of Methods to Identify Modules in Noisy or Incomplete Brain Networks. Brain Connect 2018; 9:128-143. [PMID: 30543117 DOI: 10.1089/brain.2018.0603] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022] Open
Abstract
Community structure, or "modularity," is a fundamentally important aspect in the organization of structural and functional brain networks, but the identification of modules with community detection methods is confounded by noisy or missing connections. Although several methods have been used to account for missing data, the performance of these methods has not been compared quantitatively so far. In this study, we compared four different approaches to account for missing connections when identifying modules in binary and weighted networks using both Louvain and Infomap community detection algorithms. The four methods are "zeros," "row-column mean," "common neighbors," and "consensus clustering." Using Lancichinetti-Fortunato-Radicchi benchmark-simulated binary and weighted networks, we find that "zeros," "row-column mean," and "common neighbors" approaches perform well with both Louvain and Infomap, whereas "consensus clustering" performs well with Louvain but not Infomap. A similar pattern of results was observed with empirical networks from stereotactical electroencephalography data, except that "consensus clustering" outperforms other approaches on weighted networks with Louvain. Based on these results, we recommend any of the four methods when using Louvain on binary networks, whereas "consensus clustering" is superior with Louvain clustering of weighted networks. When using Infomap, "zeros" or "common neighbors" should be used for both binary and weighted networks. These findings provide a basis for accounting for noisy or missing connections when identifying modules in brain networks.
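To make the two simplest strategies named in this abstract concrete, here is a minimal sketch of "zeros" and "row-column mean" filling on a small connectivity matrix. The function name `fill_missing`, the use of `None` to mark a missing connection, and the exact averaging rule (mean of the observed off-diagonal entries in the same row and the same column) are assumptions made for illustration; the abstract does not spell out the authors' implementation.

```python
def fill_missing(matrix, method="zeros"):
    """Return a copy of a square connectivity matrix with None entries filled.

    Hypothetical helper: None marks a missing connection.
    - "zeros": every missing entry becomes 0.0.
    - "row_column_mean": a missing entry (i, j) becomes the mean of the observed
      off-diagonal entries in row i and column j (an assumed reading of the name).
    """
    n = len(matrix)
    filled = [row[:] for row in matrix]
    for i in range(n):
        for j in range(n):
            if matrix[i][j] is not None:
                continue
            if method == "zeros":
                filled[i][j] = 0.0
            elif method == "row_column_mean":
                # Observed values in row i (skipping the diagonal entry).
                observed = [matrix[i][k] for k in range(n)
                            if k != i and matrix[i][k] is not None]
                # Observed values in column j (skipping the diagonal entry).
                observed += [matrix[k][j] for k in range(n)
                             if k != j and matrix[k][j] is not None]
                filled[i][j] = sum(observed) / len(observed) if observed else 0.0
            else:
                raise ValueError(f"unknown method: {method}")
    return filled


# Toy 3-node weighted network with one missing (symmetric) connection.
example = [
    [0.0, 1.0, None],
    [1.0, 0.0, 0.5],
    [None, 0.5, 0.0],
]
filled_zeros = fill_missing(example, "zeros")
filled_mean = fill_missing(example, "row_column_mean")
```

The filled matrix would then be passed to a community detection method such as Louvain or Infomap; that step is left out here because it depends on the library in use.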
Affiliation(s)
- Nitin Williams
  - Neuroscience Center, Helsinki Institute of Life Science, University of Helsinki, Finland
- Gabriele Arnulfo
  - Neuroscience Center, Helsinki Institute of Life Science, University of Helsinki, Finland; Department of Informatics, Bioengineering, Robotics and System Engineering, University of Genoa, Genoa, Italy
- Sheng H Wang
  - Neuroscience Center, Helsinki Institute of Life Science, University of Helsinki, Finland; Doctoral Programme Brain & Mind, University of Helsinki, Finland
- Lino Nobili
  - Claudio Munari Epilepsy Surgery Centre, Niguarda Hospital, Milan, Italy; Child Neuropsychiatry, IRCCS, Gaslini Institute, DINOGMI, University of Genoa, Genoa, Italy
- Satu Palva
  - Neuroscience Center, Helsinki Institute of Life Science, University of Helsinki, Finland; BioMag laboratory, HUS Medical Imaging Center, Helsinki, Finland; Center for Cognitive Neuroimaging, Institute of Neuroscience and Psychology, University of Glasgow, United Kingdom
- J Matias Palva
  - Neuroscience Center, Helsinki Institute of Life Science, University of Helsinki, Finland; Center for Cognitive Neuroimaging, Institute of Neuroscience and Psychology, University of Glasgow, United Kingdom
|
218
|
Koseki SA. The geographic evolution of political cleavages in Switzerland: A network approach to assessing levels and dynamics of polarization between local populations. PLoS One 2018; 13:e0208227. [PMID: 30496319 PMCID: PMC6264899 DOI: 10.1371/journal.pone.0208227] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/14/2017] [Accepted: 11/14/2018] [Indexed: 11/18/2022] Open
Abstract
Scholarly studies and common accounts of national politics enjoy pointing out the resilience of ideological divides among populations. Building on the image of political cleavages and geographic polarization, the regionalization of politics has become a truism across Northern democracies. Left unquestioned, this geography plays a central role in shaping electoral and referendum campaigns. In Europe and North America, observers identify recurring patterns dividing local populations during national votes. While much research describes those patterns in relation to ethnicity, religious affiliation, historic legacy and party affiliation, current approaches in political research lack the capacity to measure their evolution over time or other vote subsets. This article introduces “Dyadic Agreement Modeling” (DyAM), a transdisciplinary method to assess the evolution of geographic cleavages in vote outcomes by implementing a metric of agreement/disagreement through Network Analysis. Unlike existing approaches, DyAM offers a stable measure for political agreement and disagreement—accounting for chance, statistically robust and remaining structurally independent from the number of entries and missing data. The method opens up to a range of statistical, structural and visual tools specific to Network Analysis and its usage across disciplines. In order to illustrate DyAM, I use more than 680,000 municipal outcomes from Swiss federal popular votes and assess the evolution of political cleavages across local populations since 1981. Results suggest that political congruence between Swiss local populations increased in the last forty years, while regional political factions and linguistic alignments have lost their salience to new divides. 
I discuss how choices about input parameters and data subsets nuance findings, and consider confounding factors that may influence conclusions over the dynamic equilibrium of national politics and the strengthening effect of globalization on democratic institutions.
Affiliation(s)
- Shin Alexandre Koseki
  - Habitat Research Center, École polytechnique fédérale de Lausanne, Lausanne, Switzerland
|
219
|
Martin JS, Massen JJM, Šlipogor V, Bugnyar T, Jaeggi AV, Koski SE. The EGA+GNM framework: An integrative approach to modelling behavioural syndromes. Methods Ecol Evol 2018. [DOI: 10.1111/2041-210x.13100] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 02/01/2023]
Affiliation(s)
- Jordan S. Martin
  - Behavioural Ecology Lab, Department of Anthropology, Emory University, Atlanta, Georgia
  - Department of Cognitive Biology, University of Vienna, Vienna, Austria
  - Department of Anthropology, Miami University, Oxford, Ohio
- Jorg J. M. Massen
  - Department of Cognitive Biology, University of Vienna, Vienna, Austria
  - Cognitive Psychology Unit, Institute of Psychology, Leiden University, Leiden, The Netherlands
- Vedrana Šlipogor
  - Department of Cognitive Biology, University of Vienna, Vienna, Austria
- Thomas Bugnyar
  - Department of Cognitive Biology, University of Vienna, Vienna, Austria
- Adrian V. Jaeggi
  - Behavioural Ecology Lab, Department of Anthropology, Emory University, Atlanta, Georgia
- Sonja E. Koski
  - Faculty of Social Sciences, University of Helsinki, Helsinki, Finland
|
220
|
Kuntal BK, Chandrakar P, Sadhu S, Mande SS. 'NetShift': a methodology for understanding 'driver microbes' from healthy and disease microbiome datasets. ISME JOURNAL 2018; 13:442-454. [PMID: 30287886 DOI: 10.1038/s41396-018-0291-x] [Citation(s) in RCA: 88] [Impact Index Per Article: 12.6] [Received: 03/28/2018] [Revised: 09/05/2018] [Accepted: 09/14/2018] [Indexed: 12/12/2022]
Abstract
The combined effect of mutual associations among the co-inhabiting microbes in the human body is known to play a major role in determining the health status of individuals. Differential taxonomic abundance between healthy and diseased states is often used to identify microbial markers. However, to make a microbial community-based inference, it is important not only to consider microbial abundances but also to quantify the changes observed among inter-microbial associations. In the present study, we introduce a method called 'NetShift' to quantify rewiring and community changes in microbial association networks between healthy and diseased states. Additionally, we devise a score to identify important microbial taxa that serve as 'drivers' of the shift from the healthy to the diseased state. We demonstrate the validity of our score in a number of scenarios and apply our methodology to two real-world metagenomic datasets. The 'NetShift' methodology is also implemented as a web-based application available at https://web.rniapps.net/netshift.
Affiliation(s)
- Bhusan K Kuntal
  - Bio-Sciences R&D Division, TCS Research, Tata Consultancy Services Ltd., 54-B Hadapsar Industrial Estate, Pune, 411 013, India; Academy of Scientific and Innovative Research (AcSIR), CSIR-National Chemical Laboratory Campus, Pune, 411 008, India
- Pranjal Chandrakar
  - Bio-Sciences R&D Division, TCS Research, Tata Consultancy Services Ltd., 54-B Hadapsar Industrial Estate, Pune, 411 013, India; Decision Sciences, Indian Institute of Management Bangalore, Bannerghatta Road, Bengaluru, Karnataka, 560076, India
- Sudipta Sadhu
  - Bio-Sciences R&D Division, TCS Research, Tata Consultancy Services Ltd., 54-B Hadapsar Industrial Estate, Pune, 411 013, India
- Sharmila S Mande
  - Bio-Sciences R&D Division, TCS Research, Tata Consultancy Services Ltd., 54-B Hadapsar Industrial Estate, Pune, 411 013, India
|
221
|
Dannemann T, Sotomayor-Gómez B, Samaniego H. The time geography of segregation during working hours. ROYAL SOCIETY OPEN SCIENCE 2018; 5:180749. [PMID: 30473825 PMCID: PMC6227938 DOI: 10.1098/rsos.180749] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 05/11/2018] [Accepted: 09/06/2018] [Indexed: 06/01/2023]
Abstract
While segregation is usually evaluated at the residential level, the recent influx of large streams of data describing urbanites' movement across the city makes it possible to generate detailed descriptions of spatio-temporal segregation patterns across the activity space of individuals. For instance, segregation across the activity space is usually thought to be lower compared with residential segregation given the importance of social complementarity, among other factors, shaping the economies of cities. However, these new dynamic approaches to segregation pose important methodological challenges. This paper proposes a methodological framework to investigate segregation during working hours. Our approach combines three well-known mathematical tools: community detection algorithms, segregation metrics and random walk analysis. Using Santiago (Chile) as our model system, we build a detailed home-work commuting network from a large dataset of mobile phone pings and spatially partition the city into several communities. We then evaluate the probability that two persons at their work location will come from the same community. Finally, a randomization analysis of commuting distances and angles corroborates the strong segregation description for Santiago provided by the sociological literature. While our findings highlight the benefit of developing new approaches to understand dynamic processes in the urban environment, unveiling counterintuitive patterns such as segregation at our workplace also shows a specific example in which the exposure dimension of segregation is successfully studied using the increasingly available streams of highly detailed anonymized mobile phone registries.
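The quantity described in this abstract, the probability that two persons at the same work location come from the same home community, can be illustrated with a short sketch. It assumes the two persons are drawn uniformly at random without replacement from the workers observed at one location; the function name `same_community_probability` and the input layout (a mapping from home community to worker count) are hypothetical and not taken from the paper.

```python
def same_community_probability(counts):
    """Probability that two distinct workers drawn at random share a home community.

    counts: hypothetical mapping {home_community: number_of_workers} for a
    single work location. Assumes sampling without replacement.
    """
    total = sum(counts.values())
    if total < 2:
        raise ValueError("need at least two workers at the location")
    # Ordered pairs of distinct workers who share a home community.
    same_pairs = sum(n * (n - 1) for n in counts.values())
    # All ordered pairs of distinct workers.
    all_pairs = total * (total - 1)
    return same_pairs / all_pairs


# Toy work location: three workers from community A, one from community B.
p_same = same_community_probability({"A": 3, "B": 1})
```

Comparing this value against the one obtained from randomized commuting flows is one way to read the randomization analysis mentioned in the abstract, although the details of the authors' null model are not given there.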
Affiliation(s)
- Horacio Samaniego
  - Laboratorio de Ecoinformática, Universidad Austral de Chile, Campus Isla Teja, Valdivia, Chile
|
222
|
Critical analysis of (Quasi-)Surprise for community detection in complex networks. Sci Rep 2018; 8:14459. [PMID: 30262896 PMCID: PMC6160439 DOI: 10.1038/s41598-018-32582-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 10/27/2017] [Accepted: 05/08/2018] [Indexed: 02/07/2023] Open
Abstract
Module or community structures widely exist in complex networks, and optimizing statistical measures is one of the most popular approaches for revealing and identifying such structures in real-world applications. In this paper, we focus on critical behaviors of (Quasi-)Surprise, a type of statistical measure of interest for community structure, accompanied by a series of comparisons with other measures. Specifically, the effect of various network parameters on the measures is thoroughly investigated. The critical number of dense subgraphs in partition transition is derived, and a kind of phase diagram is provided to display and compare the phase transitions of the measures. The effect of a “potential well” for (Quasi-)Surprise is revealed, which may be difficult for general greedy (agglomerative or divisive) algorithms to get across. Finally, an extension of Quasi-Surprise is introduced for the study of multi-scale structures. The experimental results help in understanding the critical behaviors of (Quasi-)Surprise, and may provide useful insight for the design of effective tools for community detection.
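The abstract does not state the formula for Surprise, so the sketch below uses the definition commonly given in the community detection literature: the negative logarithm of the hypergeometric tail probability of observing at least the actual number of intra-community links. The function name `surprise` and its parameters are hypothetical, and this should be read as an illustration under that assumption rather than as the paper's exact formulation; Quasi-Surprise is not covered.

```python
from math import comb, log


def surprise(n_nodes, community_sizes, n_links, n_intra_links):
    """Surprise of a partition under the commonly used hypergeometric definition.

    n_nodes:         number of nodes in the network
    community_sizes: sizes of the communities in the partition
    n_links:         number of links actually present
    n_intra_links:   number of those links that fall inside communities
    """
    max_links = comb(n_nodes, 2)                          # all possible links
    max_intra = sum(comb(s, 2) for s in community_sizes)  # possible intra-community links
    upper = min(max_intra, n_links)
    # Probability of drawing at least n_intra_links intra-community links
    # when n_links links are placed uniformly at random.
    tail = sum(
        comb(max_intra, j) * comb(max_links - max_intra, n_links - j)
        for j in range(n_intra_links, upper + 1)
    )
    return -log(tail / comb(max_links, n_links))


# Two triangles joined by a single bridging link, split into the two triangles.
s_two_triangles = surprise(n_nodes=6, community_sizes=[3, 3], n_links=7, n_intra_links=6)
```

Higher values indicate a partition whose concentration of links inside communities is less likely under random placement. Exact binomial coefficients are used here for clarity, which limits the sketch to small networks.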
|
223
|
Dessì D, Cirrone J, Recupero DR, Shasha D. SuperNoder: a tool to discover over-represented modular structures in networks. BMC Bioinformatics 2018; 19:318. [PMID: 30200901 PMCID: PMC6131773 DOI: 10.1186/s12859-018-2350-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 07/12/2018] [Accepted: 08/29/2018] [Indexed: 11/10/2022] Open
Abstract
Background Networks whose nodes have labels can seem complex. Fortunately, many have substructures that occur often (“motifs”). A societal example of a motif might be a household. Replacing such motifs by named supernodes reduces the complexity of the network and can bring out insightful features. Doing so repeatedly may give hints about higher level structures of the network. We call this recursive process Recursive Supernode Extraction. Results This paper describes algorithms and a tool to discover disjoint (i.e. non-overlapping) motifs in a network, replacing those motifs by new nodes, and then recursing. We show applications in food-web and protein-protein interaction (PPI) networks where our methods reduce the complexity of the network and yield insights. Conclusions SuperNoder is a web-based and standalone tool which enables the simplification of big graphs based on the reduction of high frequency motifs. It applies various strategies for identifying disjoint motifs with the goal of enhancing the understandability of networks.
Affiliation(s)
- Danilo Dessì
  - Department of Mathematics and Computer Science, University of Cagliari, Cagliari, 09124, Italy
- Jacopo Cirrone
  - Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, New York City, 10012, USA
- Dennis Shasha
  - Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, New York City, 10012, USA
|
224
|
Frinhani RDMD, de Carvalho MAM, Soma NY. A PageRank-based heuristic for the minimization of open stacks problem. PLoS One 2018; 13:e0203076. [PMID: 30161217 PMCID: PMC6117050 DOI: 10.1371/journal.pone.0203076] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/26/2018] [Accepted: 08/14/2018] [Indexed: 11/25/2022] Open
Abstract
The minimization of open stacks problem (MOSP) aims to determine the ideal production sequence to optimize the occupation of physical space in manufacturing settings. Most current methods for solving the MOSP were not designed to work with large instances, precluding their use in specific cases of similar modeling problems. We therefore propose a PageRank-based heuristic to solve large instances modeled in graphs. In computational experiments, both data from the literature and new datasets up to 25-fold larger in input size than current datasets, totaling 1330 instances, were analyzed to compare the proposed heuristic with state-of-the-art methods. The results showed the competitiveness of the proposed heuristic in terms of quality, as it found optimal solutions in several cases, and in terms of shorter run times compared with the fastest available method. Furthermore, based on specific graph densities, we found that the difference in the value of solutions between methods was small, thus justifying the use of the fastest method. The proposed heuristic is scalable and is more affected by graph density than by size.
Affiliation(s)
- Nei Yoshihiro Soma
  - Technological Institute of Aeronautics, Computer Sciences Division, São José dos Campos, São Paulo, 12228-900, Brazil
|
225
|
Miranda PJ, Baptista MS, de Souza Pinto SE. The Odyssey's mythological network. PLoS One 2018; 13:e0200703. [PMID: 30059551 PMCID: PMC6066224 DOI: 10.1371/journal.pone.0200703] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/03/2018] [Accepted: 07/02/2018] [Indexed: 11/19/2022] Open
Abstract
In this work, we study the mythological network of Homer's Odyssey. We use ordinary statistical quantifiers to classify the network as real or fictional. We also introduce an analysis of communities that allows us to see how network properties emerge. We found that the Odyssey can be classified as both a real and a fictional network. This statement is supported by the fact that removing mythological characters results in a network with real properties. The community analysis indicated that there is a power-law relationship based on the maximum degree of each community. These results allow us to conclude that the Odyssey might be an amalgam of myth and historical facts, with communities playing a central role.
Affiliation(s)
- Murilo Silva Baptista
  - Institute for Complex System and Mathematical Biology, SUPA, University of Aberdeen, Aberdeen, United Kingdom
|
226
|
|
227
|
Quantifying ecological impacts of mass extinctions with network analysis of fossil communities. Proc Natl Acad Sci U S A 2018; 115:5217-5222. [PMID: 29686079 PMCID: PMC5960297 DOI: 10.1073/pnas.1719976115] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Indexed: 11/18/2022] Open
Abstract
The geologic record provides evidence of repeated diversification events and mass extinctions, which entailed benchmark changes in biodiversity and ecology. For insights into these events, we explore the fossil record of marine animal communities using a network-based approach to quantifying ecological change over time. The major radiations and mass extinctions of the Phanerozoic Eon resulted in the biggest ecological changes, as they involved the rise and decline of interrelated communities in relative dominance. Our analyses provide support for an ecological severity ranking of mass extinctions and illuminate the long-term consequences of the Ordovician radiation and Devonian mass depletion of biodiversity. Our work highlights the potential for irreversible ecosystem changes with species losses, both previously documented and predicted in the future. Mass extinctions documented by the fossil record provide critical benchmarks for assessing changes through time in biodiversity and ecology. Efforts to compare biotic crises of the past and present, however, encounter difficulty because taxonomic and ecological changes are decoupled, and although various metrics exist for describing taxonomic turnover, no methods have yet been proposed to quantify the ecological impacts of extinction events. To address this issue, we apply a network-based approach to exploring the evolution of marine animal communities over the Phanerozoic Eon. Network analysis of fossil co-occurrence data enables us to identify nonrandom associations of interrelated paleocommunities. These associations, or evolutionary paleocommunities, dominated total diversity during successive intervals of relative community stasis. Community turnover occurred largely during mass extinctions and radiations, when ecological reorganization resulted in the decline of one association and the rise of another. 
Altogether, we identify five evolutionary paleocommunities at the generic and familial levels in addition to three ordinal associations that correspond to Sepkoski’s Cambrian, Paleozoic, and Modern evolutionary faunas. In this context, we quantify magnitudes of ecological change by measuring shifts in the representation of evolutionary paleocommunities over geologic time. Our work shows that the Great Ordovician Biodiversification Event had the largest effect on ecology, followed in descending order by the Permian–Triassic, Cretaceous–Paleogene, Devonian, and Triassic–Jurassic mass extinctions. Despite its taxonomic severity, the Ordovician extinction did not strongly affect co-occurrences of taxa, affirming its limited ecological impact. Network paleoecology offers promising approaches to exploring ecological consequences of extinctions and radiations.
|
228
|
Heeren A, Bernstein EE, McNally RJ. Deconstructing trait anxiety: a network perspective. ANXIETY STRESS AND COPING 2018; 31:262-276. [PMID: 29433339 DOI: 10.1080/10615806.2018.1439263] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Indexed: 01/05/2023]
Abstract
BACKGROUND AND OBJECTIVES For decades, the dominant paradigm in trait anxiety research has regarded the construct as signifying the underlying cause of the thoughts, feelings, and behaviors that supposedly reflect its presence. Recently, a network theory of personality has appeared. According to this perspective, trait anxiety is a formative construct emerging from interactions among its constitutive features (e.g., thoughts, feelings, behaviors); it is not a latent cause of these features. DESIGN In this study, we characterized trait anxiety as a network system of interacting elements. METHODS To do so, we estimated a graphical Gaussian model via the computation of a regularized partial correlation network in an unselected sample (N = 611). We also implemented modularity-based community detection analysis to test whether the features of trait anxiety cohere as a single network system. RESULTS We find that trait anxiety can indeed be conceptualized as a single, coherent network system of interacting elements. CONCLUSIONS This radically new approach to visualizing trait anxiety may offer an especially informative view of the interplay between its constitutive features. As prior research has implicated trait anxiety as a risk factor for the development of anxiety-related psychopathology, our findings also set the scene for novel research directions.
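Modularity-based community detection, as mentioned in the METHODS above, scores a partition by comparing the fraction of links inside communities with the fraction expected from node degrees alone. The sketch below computes the standard Newman-Girvan modularity for an unweighted, undirected network. It is a simplification: the study works with a weighted partial correlation network, and the function name `modularity` and the input layout are hypothetical.

```python
from collections import Counter


def modularity(edges, community_of):
    """Newman-Girvan modularity Q for an unweighted, undirected network.

    edges:        list of (u, v) pairs, each undirected link listed once
    community_of: hypothetical mapping {node: community_label}
    """
    m = len(edges)
    if m == 0:
        raise ValueError("network has no links")
    intra = Counter()       # links with both endpoints in the same community
    degree_sum = Counter()  # total degree of the nodes in each community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            intra[cu] += 1
    # Q = sum over communities of (observed intra fraction - expected fraction).
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Two triangles joined by one link, partitioned into the two triangles.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_partition = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q_split = modularity(toy_edges, toy_partition)
q_single = modularity(toy_edges, {node: "all" for node in range(6)})
```

A single all-encompassing community always scores Q = 0, so a detection method that finds no partition with clearly positive modularity is consistent with the features cohering as one system, which is the question the abstract poses.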
Affiliation(s)
- Alexandre Heeren
  - Department of Psychology, Harvard University, Cambridge, MA, USA; Psychological Science Research Institute, Université Catholique de Louvain, Louvain-la-Neuve, Belgium; Institute of Neuroscience, Université Catholique de Louvain, Brussels, Belgium
- Emily E Bernstein
  - Department of Psychology, Harvard University, Cambridge, MA, USA
- Richard J McNally
  - Department of Psychology, Harvard University, Cambridge, MA, USA
|
229
|
Pereira-Morales AJ, Adan A, Forero DA. Network analysis of multiple risk factors for mental health in young Colombian adults. J Ment Health 2017; 28:153-160. [PMID: 29265896 DOI: 10.1080/09638237.2017.1417568] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 10/18/2022]
Abstract
BACKGROUND A considerable proportion of young adults are affected by psychological distress at any time and an important fraction of them may develop mental disorders. Use of novel approaches for the analysis of data from multiple psychological scales might facilitate the identification of key indicators of mental health. AIMS The aim of the current study was to examine the relationship between multiple risk factors for mental illness, using a network analysis perspective. METHODS A sample of 334 young Colombian adults (mean age = 21.7) were evaluated with validated scales measuring several psychosocial factors previously associated with mental health (e.g. worry, sleep problems, suicidal ideation, childhood abuse, alcohol-related problems and personality traits). A total of 24 nodes were included in the network analysis, and the topology, centrality, and stability of the networks were studied. RESULTS Specific nodes that occupied critical positions in the network were identified, with worry, perceived distress and low energy being the most central nodes. CONCLUSIONS Our explorative findings suggest that a network analysis might identify risk factors that have a central role in the multiple dimensions of emotional health in young adults. These novel analyses could have important applications for the understanding of the psychological functioning affecting mental health.
Affiliation(s)
- Angela J Pereira-Morales
  - Laboratory of Neuropsychiatric Genetics, Biomedical Sciences Research Group, School of Medicine, Universidad Antonio Nariño, Bogotá, Colombia
- Ana Adan
  - Department of Clinical Psychology and Psychobiology, School of Psychology, University of Barcelona, Barcelona, Spain; Institute of Neurosciences, University of Barcelona, Barcelona, Spain
- Diego A Forero
  - Laboratory of Neuropsychiatric Genetics, Biomedical Sciences Research Group, School of Medicine, Universidad Antonio Nariño, Bogotá, Colombia
|
230
|
Machine learning meets complex networks via coalescent embedding in the hyperbolic space. Nat Commun 2017; 8:1615. [PMID: 29151574 PMCID: PMC5694768 DOI: 10.1038/s41467-017-01825-5] [Citation(s) in RCA: 64] [Impact Index Per Article: 8.0] [Received: 03/03/2016] [Accepted: 10/19/2017] [Indexed: 01/02/2023] Open
Abstract
Physicists recently observed that realistic complex networks emerge as discrete samples from a continuous hyperbolic geometry enclosed in a circle: the radius represents the node centrality and the angular displacement between two nodes resembles their topological proximity. The hyperbolic circle aims to become a universal space of representation and analysis of many real networks. Yet, inferring the angular coordinates to map a real network back to its latent geometry remains a challenging inverse problem. Here, we show that intelligent machines for unsupervised recognition and visualization of similarities in big data can also infer the network angular coordinates of the hyperbolic model according to a geometrical organization that we term "angular coalescence." Based on this phenomenon, we propose a class of algorithms that offers fast and accurate "coalescent embedding" in the hyperbolic circle even for large networks. This computational solution to an inverse problem in physics of complex systems favors the application of network latent geometry techniques in disciplines dealing with big network data analysis including biology, medicine, and social science.
|
231
|
Yang Z, Perotti JI, Tessone CJ. Hierarchical benchmark graphs for testing community detection algorithms. Phys Rev E 2017; 96:052311. [PMID: 29347723 DOI: 10.1103/physreve.96.052311] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 08/09/2017] [Indexed: 11/07/2022]
Abstract
Hierarchical organization is an important, prevalent characteristic of complex systems; to understand their organization, the study of the underlying (generally complex) networks that describe the interactions between their constituents plays a central role. Numerous previous works have shown that many real-world networks in social, biological, and technical systems present hierarchical organization, often in the form of a hierarchy of community structures. Many artificial benchmark graphs have been proposed to test different community detection methods, but no benchmark has been developed to thoroughly test the detection of hierarchical community structures. In this study, we fill this gap by extending the Lancichinetti-Fortunato-Radicchi (LFR) ensemble of benchmark graphs, adopting the rule for constructing hierarchical networks proposed by Ravasz and Barabási. We employ this benchmark to test three of the most popular community detection algorithms and quantify their accuracy using the traditional mutual information and the recently introduced hierarchical mutual information. The results indicate that the Ravasz-Barabási-Lancichinetti-Fortunato-Radicchi (RB-LFR) benchmark generates a complex hierarchical structure constituting a challenging benchmark for the considered community detection methods.
Affiliation(s)
- Zhao Yang
  - URPP Social Networks, University of Zurich, Andreasstrasse 15, CH-8050 Zürich, Switzerland
- Juan I Perotti
  - IMT School for Advanced Studies Lucca, Piazza San Francesco 19, I-55100 Lucca, Italy
  - Instituto de Física Enrique Gaviola IFEG-CONICET, Universidad Nacional de Córdoba, Ciudad Universitaria, 5000 Córdoba, Argentina
- Claudio J Tessone
  - URPP Social Networks, University of Zurich, Andreasstrasse 15, CH-8050 Zürich, Switzerland
  - IMT School for Advanced Studies Lucca, Piazza San Francesco 19, I-55100 Lucca, Italy
|
232
|
Astegiano J, Altermatt F, Massol F. Disentangling the co-structure of multilayer interaction networks: degree distribution and module composition in two-layer bipartite networks. Sci Rep 2017; 7:15465. [PMID: 29133886 PMCID: PMC5684352 DOI: 10.1038/s41598-017-15811-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 05/11/2017] [Accepted: 11/02/2017] [Indexed: 11/29/2022] Open
Abstract
Species establish different interactions (e.g. antagonistic, mutualistic) with multiple species, forming multilayer ecological networks. Disentangling network co-structure in multilayer networks is crucial to predict how biodiversity loss may affect the persistence of multispecies assemblages. Existing methods to analyse multilayer networks often fail to consider network co-structure. We present a new method to evaluate the modular co-structure of multilayer networks through the assessment of species degree co-distribution and network module composition. We focus on modular structure because of its high prevalence among ecological networks. We apply our method to two Lepidoptera-plant networks, one describing caterpillar-plant herbivory interactions and one representing adult Lepidoptera nectaring on flowers, thereby possibly pollinating them. More than 50% of the species established either herbivory or visitation interactions, but not both. These species were over-represented among plants and lepidopterans, and were present in most modules in both networks. Similarity in module composition between networks was high but not different from random expectations. Our method clearly delineates the importance of interpreting multilayer module composition similarity in the light of the constraints imposed by network structure to predict the potential indirect effects of species loss through interconnected modular networks.
Affiliation(s)
- Julia Astegiano
  - Instituto Multidisciplinario de Biología Vegetal, FCEFyN, Universidad Nacional de Córdoba, CONICET, Argentina
  - Centre d'Ecologie Fonctionnelle et Evolutive (CEFE), UMR 5175, CNRS - Université de Montpellier - Université Paul Valéry Montpellier - EPHE, 1919 route de Mende, F-34293, Montpellier, France
- Florian Altermatt
  - Eawag, Swiss Federal Institute of Aquatic Science and Technology, Department of Aquatic Ecology, CH-8600, Dübendorf, Switzerland
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, CH-8057, Zürich, Switzerland
- François Massol
  - Centre d'Ecologie Fonctionnelle et Evolutive (CEFE), UMR 5175, CNRS - Université de Montpellier - Université Paul Valéry Montpellier - EPHE, 1919 route de Mende, F-34293, Montpellier, France
  - CNRS, Université de Lille-Sciences et Technologies, UMR 8198 Evo-Eco-Paleo, SPICI group, F-59000, Lille, France
|
233
|
Almeira N, Schaigorodsky AL, Perotti JI, Billoni OV. Structure constrained by metadata in networks of chess players. Sci Rep 2017; 7:15186. [PMID: 29123175 PMCID: PMC5680290 DOI: 10.1038/s41598-017-15428-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 07/10/2017] [Accepted: 10/27/2017] [Indexed: 11/09/2022] Open
Abstract
Chess is an emblematic sport that stands out because of its age, popularity and complexity. It has served to study human behavior from the perspective of a wide range of disciplines, from cognitive skills such as memory and learning, to aspects like innovation and decision-making. Given that extensive documentation of chess games played throughout history is available, it is possible to perform detailed and statistically significant studies of this sport. Here we use one of the most extensive chess databases in the world to construct two networks of chess players. One of the networks includes games that were played over-the-board and the other contains games played on the Internet. We study the main topological characteristics of the networks, such as degree distribution and correlations, transitivity and community structure. We complement the structural analysis by incorporating players' level of play as node metadata. Although the two networks are topologically different, we show that in both cases players gather in communities according to their expertise and that an emergent rich-club structure, composed of the top-rated players, is also present.
Affiliation(s)
- Nahuel Almeira
  - Facultad de Matemática, Astronomía, Física y Computación, Universidad Nacional de Córdoba, Ciudad Universitaria, Córdoba, 5000, Argentina
  - Instituto de Física Enrique Gaviola (IFEG-CONICET), Ciudad Universitaria, Córdoba, 5000, Argentina
- Ana L Schaigorodsky
  - Facultad de Matemática, Astronomía, Física y Computación, Universidad Nacional de Córdoba, Ciudad Universitaria, Córdoba, 5000, Argentina
  - Instituto de Física Enrique Gaviola (IFEG-CONICET), Ciudad Universitaria, Córdoba, 5000, Argentina
- Juan I Perotti
  - Facultad de Matemática, Astronomía, Física y Computación, Universidad Nacional de Córdoba, Ciudad Universitaria, Córdoba, 5000, Argentina
  - Instituto de Física Enrique Gaviola (IFEG-CONICET), Ciudad Universitaria, Córdoba, 5000, Argentina
- Orlando V Billoni
  - Facultad de Matemática, Astronomía, Física y Computación, Universidad Nacional de Córdoba, Ciudad Universitaria, Córdoba, 5000, Argentina
  - Instituto de Física Enrique Gaviola (IFEG-CONICET), Ciudad Universitaria, Córdoba, 5000, Argentina
|
234
|
Markov-network based latent link analysis for community detection in social behavioral interactions. Appl Intell 2017. [DOI: 10.1007/s10489-017-1040-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/27/2022]
|
235
|
Uncovering Spatial Structures of Regional City Networks from Expressway Traffic Flow Data: A Case Study from Jiangsu Province, China. Sustainability 2017. [DOI: 10.3390/su9091541] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 11/16/2022]
|
236
|
Golino HF, Epskamp S. Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research. PLoS One 2017; 12:e0174035. [PMID: 28594839 PMCID: PMC5465941 DOI: 10.1371/journal.pone.0174035] [Citation(s) in RCA: 408] [Impact Index Per Article: 51.0] [Received: 04/05/2016] [Accepted: 03/02/2017] [Indexed: 11/18/2022] Open
Abstract
The estimation of the correct number of dimensions is a long-standing problem in psychometrics. Several methods have been proposed, such as parallel analysis (PA), Kaiser-Guttman's eigenvalue-greater-than-one rule, the minimum average partial procedure (MAP), the maximum-likelihood approaches that use fit indexes such as BIC and EBIC, and the less used and studied approach called very simple structure (VSS). In the present paper, a new approach to estimate the number of dimensions is introduced and compared via simulation to the traditional techniques mentioned above. The proposed approach is called exploratory graph analysis (EGA), since it is based on the graphical lasso with the regularization parameter specified using EBIC. The number of dimensions is verified using walktrap, a random walk algorithm used to identify communities in networks. In total, 32,000 data sets were simulated to fit known factor structures, with the data sets varying across different criteria: number of factors (2 and 4), number of items (5 and 10), sample size (100, 500, 1000 and 5000) and correlation between factors (orthogonal, .20, .50 and .70), resulting in 64 different conditions. For each condition, 500 data sets were simulated using lavaan. The results show that EGA performs comparably to parallel analysis, BIC, EBIC and the Kaiser-Guttman rule in a number of situations, especially when the number of factors was two. However, EGA was the only technique able to correctly estimate the number of dimensions in the four-factor structure when the correlation between factors was .7, showing an accuracy of 100% for a sample size of 5,000 observations. Finally, EGA was used to estimate the number of factors in a real dataset, in order to compare its performance with the other six techniques tested in the simulation study.
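The simulation design stated in this abstract can be checked with a small sketch. It only enumerates the fully crossed grid of conditions; the variable names are illustrative assumptions, and "orthogonal" is encoded as a correlation of 0.0. It does not reproduce the lavaan simulation itself.

```python
from itertools import product

# Design factors as listed in the abstract (names are illustrative).
n_factors = (2, 4)
n_items = (5, 10)
sample_sizes = (100, 500, 1000, 5000)
factor_correlations = (0.0, 0.2, 0.5, 0.7)  # 0.0 stands for "orthogonal"

# Fully crossed design: one condition per combination of levels.
conditions = list(product(n_factors, n_items, sample_sizes, factor_correlations))

replications_per_condition = 500
total_datasets = len(conditions) * replications_per_condition

print(len(conditions), total_datasets)
```

The grid has 2 x 2 x 4 x 4 conditions, which matches the 64 conditions and 32,000 data sets reported.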
Affiliation(s)
- Hudson F. Golino
  - Department of Psychology, University of Virginia, Charlottesville, VA, United States of America
  - Graduate School of Psychology, Universidade Salgado de Oliveira, Rio de Janeiro, Brasil
|
237
|
Jia C, Li Y, Carson MB, Wang X, Yu J. Node Attribute-enhanced Community Detection in Complex Networks. Sci Rep 2017; 7:2626. [PMID: 28572625 PMCID: PMC5453980 DOI: 10.1038/s41598-017-02751-8] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 12/21/2016] [Accepted: 04/19/2017] [Indexed: 11/09/2022] Open
Abstract
Community detection involves grouping the nodes of a network such that nodes in the same community are more densely connected to each other than to the rest of the network. Previous studies have focused mainly on identifying communities in networks using node connectivity. However, each node in a network may be associated with many attributes. Identifying communities in networks combining node attributes has become increasingly popular in recent years. Most existing methods operate on networks with attributes of binary, categorical, or numerical type only. In this study, we introduce kNN-enhance, a simple and flexible community detection approach that uses node attribute enhancement. This approach adds the k Nearest Neighbor (kNN) graph of node attributes to alleviate the sparsity and the noise effect of an original network, thereby strengthening the community structure in the network. We use two testing algorithms, kNN-nearest and kNN-Kmeans, to partition the newly generated, attribute-enhanced graph. Our analyses of synthetic and real world networks have shown that the proposed algorithms achieve better performance compared to existing state-of-the-art algorithms. Further, the algorithms are able to deal with networks containing different combinations of binary, categorical, or numerical attributes and could be easily extended to the analysis of massive networks.
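The idea of adding a kNN graph of node attributes to the original network can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes purely numerical attribute vectors, Euclidean distance, an unweighted union of edge sets, and hypothetical function names (`knn_edges`, `enhance`).

```python
import math

def knn_edges(attributes, k):
    """Link each node to its k nearest neighbours in attribute space.

    `attributes` maps node -> numeric vector. Euclidean distance is an
    assumption made for this sketch; ties are broken by node name.
    """
    edges = set()
    for u, vec_u in attributes.items():
        neighbours = sorted(
            (v for v in attributes if v != u),
            key=lambda v: (math.dist(vec_u, attributes[v]), v),
        )
        for v in neighbours[:k]:
            edges.add(frozenset((u, v)))  # undirected edge
    return edges

def enhance(edges, attributes, k):
    """Return the original undirected edges plus the kNN attribute edges."""
    original = {frozenset(e) for e in edges}
    return original | knn_edges(attributes, k)

# Toy example: one structural edge that cuts across two attribute clusters.
attrs = {"a": (0.0, 0.0), "b": (0.1, 0.0), "c": (5.0, 5.0), "d": (5.1, 5.0)}
enhanced = enhance([("a", "c")], attrs, k=1)
print(sorted(tuple(sorted(e)) for e in enhanced))
```

In the toy example the added edges join nodes with similar attributes, so the attribute-based groups become denser than the single cross-group link. Any partitioning algorithm could then be run on the enhanced edge set.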
Affiliation(s)
- Caiyan Jia
  - School of Computer and Information Technology & Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, China
- Yafang Li
  - School of Computer and Information Technology & Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, China
- Matthew B Carson
  - Division of Health and Biomedical Informatics, Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Xiaoyang Wang
  - School of Computer and Information Technology & Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, China
- Jian Yu
  - School of Computer and Information Technology & Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, China
|
238
|
Peel L, Larremore DB, Clauset A. The ground truth about metadata and community detection in networks. Sci Adv 2017; 3:e1602548. [PMID: 28508065 PMCID: PMC5415338 DOI: 10.1126/sciadv.1602548] [Citation(s) in RCA: 126] [Impact Index Per Article: 15.8] [Received: 10/17/2016] [Accepted: 03/08/2017] [Indexed: 05/30/2023]
Abstract
Across many scientific domains, there is a common need to automatically extract a simplified view or coarse-graining of how a complex system's components interact. This general task is called community detection in networks and is analogous to searching for clusters in independent vector data. It is common to evaluate the performance of community detection algorithms by their ability to find so-called ground truth communities. This works well in synthetic networks with planted communities because these networks' links are formed explicitly based on those known communities. However, there are no planted communities in real-world networks. Instead, it is standard practice to treat some observed discrete-valued node attributes, or metadata, as ground truth. We show that metadata are not the same as ground truth and that treating them as such induces severe theoretical and practical problems. We prove that no algorithm can uniquely solve community detection, and we prove a general No Free Lunch theorem for community detection, which implies that there can be no algorithm that is optimal for all possible community detection tasks. However, community detection remains a powerful tool and node metadata still have value, so a careful exploration of their relationship with network structure can yield insights of genuine worth. We illustrate this point by introducing two statistical techniques that can quantify the relationship between metadata and community structure for a broad class of models. We demonstrate these techniques using both synthetic and real-world networks, and for multiple types of metadata and community structures.
Affiliation(s)
- Leto Peel
  - Institute of Information and Communication Technologies, Electronics and Applied Mathematics, Université Catholique de Louvain, Louvain-la-Neuve, Belgium
  - naXys, Université de Namur, Namur, Belgium
- Aaron Clauset
  - Santa Fe Institute, Santa Fe, NM 87501, USA
  - Department of Computer Science, University of Colorado, Boulder, CO 80309, USA
  - BioFrontiers Institute, University of Colorado, Boulder, CO 80309, USA
|
239
|
Wijetunga NA, Johnston AD, Maekawa R, Delahaye F, Ulahannan N, Kim K, Greally JM. SMITE: an R/Bioconductor package that identifies network modules by integrating genomic and epigenomic information. BMC Bioinformatics 2017; 18:41. [PMID: 28100166 PMCID: PMC5242055 DOI: 10.1186/s12859-017-1477-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 04/30/2016] [Accepted: 01/07/2017] [Indexed: 01/28/2023] Open
Abstract
BACKGROUND The molecular assays that test gene expression, transcriptional, and epigenetic regulation are increasingly diverse and numerous. The information generated by each type of assay individually gives an insight into the state of the cells tested. What should be possible is to add the information derived from separate, complementary assays to gain higher-confidence insights into cellular states. At present, the analysis of multi-dimensional, massive genome-wide data requires an initial pruning step to create manageable subsets of observations that are then used for integration, which decreases the sizes of the intersecting data sets and the potential for biological insights. Our Significance-based Modules Integrating the Transcriptome and Epigenome (SMITE) approach was developed to integrate transcriptional and epigenetic regulatory data without a loss of resolution. RESULTS SMITE combines p-values by accounting for the correlation between non-independent values within data sets, allowing genes and gene modules in an interaction network to be assigned significance values. The contribution of each type of genomic data can be weighted, permitting integration of individually under-powered data sets, increasing the overall ability to detect effects within modules of genes. We apply SMITE to a complex genomic data set including the epigenomic and transcriptomic effects of Toxoplasma gondii infection on human host cells and demonstrate that SMITE is able to identify novel subnetworks of dysregulated genes. Additionally, we show that SMITE outperforms Functional Epigenetic Modules (FEM), the current paradigm of using the spin-glass algorithm to integrate gene expression and epigenetic data. CONCLUSIONS SMITE represents a flexible, scalable tool that allows integration of transcriptional and epigenetic regulatory data from genome-wide assays to boost confidence in finding gene modules reflecting altered cellular states.
Affiliation(s)
- N Ari Wijetunga
  - Department of Genetics, Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY, 10461, USA
- Andrew D Johnston
  - Department of Genetics, Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY, 10461, USA
- Ryo Maekawa
  - Division of Obstetrics and Gynecology, Yamaguchi University, 677-1 Yoshida, Yamaguchi Prefecture, 753-8511, Japan
- Fabien Delahaye
  - Department of Genetics, Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY, 10461, USA
  - Department of Obstetrics, Gynecology and Women's Health, Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY, 10461, USA
- Netha Ulahannan
  - Department of Genetics, Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY, 10461, USA
  - Department of Microbiology and Immunology, Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY, 10461, USA
- Kami Kim
  - Department of Microbiology and Immunology, Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY, 10461, USA
  - Department of Pathology, Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY, 10461, USA
  - Department of Medicine, Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY, 10461, USA
- John M Greally
  - Department of Genetics, Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY, 10461, USA
|