1. Fromsa A, Willgert K, Srinivasan S, Mekonnen G, Bedada W, Gumi B, Lakew M, Tadesse B, Bayissa B, Sirak A, Girma Abdela M, Gebre S, Chibssa T, Veerasami M, Vordermeier HM, Bakker D, Berg S, Ameni G, Juleff N, de Jong MCM, Wood J, Conlan A, Kapur V. BCG vaccination reduces bovine tuberculosis transmission, improving prospects for elimination. Science 2024; 383:eadl3962. [PMID: 38547287 DOI: 10.1126/science.adl3962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Accepted: 01/24/2024] [Indexed: 04/02/2024]
Abstract
Bacillus Calmette-Guérin (BCG) is a routinely used vaccine for protecting children against Mycobacterium tuberculosis that comprises attenuated Mycobacterium bovis. BCG can also be used to protect livestock against M. bovis; however, its effectiveness has not been quantified for this use. We performed a natural transmission experiment to directly estimate the rate of transmission to and from vaccinated and unvaccinated calves over a 1-year exposure period. The results show a higher indirect efficacy of BCG to reduce transmission from vaccinated animals that subsequently become infected [74%; 95% credible interval (CrI): 46 to 98%] compared with direct protection against infection (58%; 95% CrI: 34 to 73%) and an estimated total efficacy of 89% (95% CrI: 74 to 96%). A mechanistic transmission model of bovine tuberculosis (bTB) spread within the Ethiopian dairy sector was developed and showed how the prospects for elimination may be enabled by routine BCG vaccination of cattle.
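The three efficacy figures above are mutually consistent under a multiplicative assumption that the abstract itself does not state: if vaccination scales susceptibility by (1 - direct efficacy) and infectiousness by (1 - indirect efficacy), the total reduction in transmission between vaccinated animals is 1 - (1 - direct)(1 - indirect). A minimal sketch using the reported point estimates (the function name `total_efficacy` is hypothetical, not from the paper):

```python
def total_efficacy(direct: float, indirect: float) -> float:
    """Combine direct and indirect efficacy multiplicatively (an assumption)."""
    return 1.0 - (1.0 - direct) * (1.0 - indirect)


# Point estimates reported in the abstract.
direct = 0.58    # protection against infection
indirect = 0.74  # reduced transmission from vaccinated animals that become infected

print(f"{total_efficacy(direct, indirect):.0%}")  # prints 89%
```

This only checks the point estimates; the credible intervals in the abstract come from the authors' Bayesian analysis and cannot be reproduced this way.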
Affiliation(s)
- Abebe Fromsa
  - Aklilu Lemma Institutes of Pathobiology, Addis Ababa University, Addis Ababa, Ethiopia
  - College of Veterinary Medicine and Agriculture, Addis Ababa University, Bishoftu, Ethiopia
- Katriina Willgert
  - Disease Dynamics Unit, Department of Veterinary Medicine, University of Cambridge, UK
- Sreenidhi Srinivasan
  - Huck Institutes of Life Sciences, The Pennsylvania State University, University Park, PA, USA
  - Department of Animal Science, The Pennsylvania State University, University Park, PA, USA
  - The Global Health Initiative, Henry Ford Health, Detroit, MI, USA
- Balako Gumi
  - Aklilu Lemma Institutes of Pathobiology, Addis Ababa University, Addis Ababa, Ethiopia
- Berecha Bayissa
  - Aklilu Lemma Institutes of Pathobiology, Addis Ababa University, Addis Ababa, Ethiopia
- Musse Girma Abdela
  - Aklilu Lemma Institutes of Pathobiology, Addis Ababa University, Addis Ababa, Ethiopia
- Douwe Bakker
  - Huck Institutes of Life Sciences, The Pennsylvania State University, University Park, PA, USA
  - Technical Consultant and Independent Researcher, Lelystad, Netherlands
  - Departamento de Sanidad Animal, Facultad de Veterinaria, Universidad Complutense, Madrid, Spain
- Stefan Berg
  - Animal and Plant Health Agency, Weybridge, UK
- Gobena Ameni
  - Aklilu Lemma Institutes of Pathobiology, Addis Ababa University, Addis Ababa, Ethiopia
  - Department of Veterinary Medicine, College of Agriculture and Veterinary Medicine, United Arab Emirates University, United Arab Emirates
- Nick Juleff
  - The Bill & Melinda Gates Foundation, Seattle, WA, USA
- Mart C M de Jong
  - Quantitative Veterinary Epidemiology Group, Wageningen UR, The Netherlands
- James Wood
  - Disease Dynamics Unit, Department of Veterinary Medicine, University of Cambridge, UK
- Andrew Conlan
  - Disease Dynamics Unit, Department of Veterinary Medicine, University of Cambridge, UK
- Vivek Kapur
  - Huck Institutes of Life Sciences, The Pennsylvania State University, University Park, PA, USA
  - Department of Animal Science, The Pennsylvania State University, University Park, PA, USA
2. Dichio V, De Vico Fallani F. Exploration-Exploitation Paradigm for Networked Biological Systems. Physical Review Letters 2024; 132:098402. [PMID: 38489647 DOI: 10.1103/physrevlett.132.098402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Accepted: 01/24/2024] [Indexed: 03/17/2024]
Abstract
The stochastic exploration of the configuration space and the exploitation of functional states underlie many biological processes. The evolutionary dynamics stands out as a remarkable example. Here, we introduce a novel formalism that mimics evolution and encodes a general exploration-exploitation dynamics for biological networks. We apply it to the brain wiring problem, focusing on the maturation of that of the nematode Caenorhabditis elegans. We demonstrate that a parsimonious maxent description of the adult brain combined with our framework is able to track down the entire developmental trajectory.
Affiliation(s)
- Vito Dichio
  - Sorbonne Universite, Paris Brain Institute-ICM, CNRS, Inria, Inserm, AP-HP, Hopital de la Pitie Salpêtriere, F-75013, Paris, France
- Fabrizio De Vico Fallani
  - Sorbonne Universite, Paris Brain Institute-ICM, CNRS, Inria, Inserm, AP-HP, Hopital de la Pitie Salpêtriere, F-75013, Paris, France
3. Vega Yon GG. Power and multicollinearity in small networks: A discussion of "Tale of Two Datasets: Representativeness and Generalisability of Inference for Samples of Networks" by Krivitsky, Coletti & Hens. J Am Stat Assoc 2023; 118:2228-2231. [PMID: 38385154 PMCID: PMC10881223 DOI: 10.1080/01621459.2023.2252041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Accepted: 08/04/2023] [Indexed: 02/23/2024]
Abstract
The recent work by Krivitsky, Coletti & Hens [KCH] provides an important new contribution to the Exponential-Family Random Graph Models [ERGMs], a start-to-finish approach to dealing with multi-network ERGMs. Although multi-network ERGMs have been around for a while (mostly in the form of block-diagonal models and multi-level ERGMs, see Duxbury and Wertsching (2023), Wang et al. (2013), Slaughter and Koehly (2016)), not much care has been given to the estimation and post-estimation steps. In their paper, Krivitsky, Coletti & Hens give a detailed layout of how to build, estimate, and analyze multi-ERGMs with heterogeneous data sources. In this comment, I will focus on two issues the authors did not discuss, namely, sample size requirements and multicollinearity.
4. Dichio V, De Vico Fallani F. Statistical models of complex brain networks: a maximum entropy approach. Reports on Progress in Physics. Physical Society (Great Britain) 2023; 86:102601. [PMID: 37437559 DOI: 10.1088/1361-6633/ace6bc] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2022] [Accepted: 07/12/2023] [Indexed: 07/14/2023]
Abstract
The brain is a highly complex system. Most of such complexity stems from the intermingled connections between its parts, which give rise to rich dynamics and to the emergence of high-level cognitive functions. Disentangling the underlying network structure is crucial to understand the brain functioning under both healthy and pathological conditions. Yet, analyzing brain networks is challenging, in part because their structure represents only one possible realization of a generative stochastic process which is in general unknown. Having a formal way to cope with such intrinsic variability is therefore central for the characterization of brain network properties. Addressing this issue entails the development of appropriate tools mostly adapted from network science and statistics. Here, we focus on a particular class of maximum entropy models for networks, i.e. exponential random graph models, as a parsimonious approach to identify the local connection mechanisms behind observed global network structure. Efforts are reviewed on the quest for basic organizational properties of human brain networks, as well as on the identification of predictive biomarkers of neurological diseases such as stroke. We conclude with a discussion on how emerging results and tools from statistical graph modeling, associated with forthcoming improvements in experimental data acquisition, could lead to a finer probabilistic description of complex systems in network neuroscience.
Affiliation(s)
- Vito Dichio
  - Sorbonne Universite, Paris Brain Institute-ICM, CNRS, Inria, Inserm, AP-HP, Hopital de la Pitie Salpêtriere, F-75013 Paris, France
- Fabrizio De Vico Fallani
  - Sorbonne Universite, Paris Brain Institute-ICM, CNRS, Inria, Inserm, AP-HP, Hopital de la Pitie Salpêtriere, F-75013 Paris, France
5. Koskinen J, Snijders TAB. Multilevel longitudinal analysis of social networks. Journal of the Royal Statistical Society. Series A, (Statistics in Society) 2023; 186:376-400. [PMID: 37521824 PMCID: PMC10376442 DOI: 10.1093/jrsssa/qnac009] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/22/2022] [Revised: 10/31/2022] [Accepted: 11/23/2022] [Indexed: 08/01/2023]
Abstract
Stochastic actor-oriented models (SAOMs) are a modelling framework for analysing network dynamics using network panel data. This paper extends the SAOM to the analysis of multilevel network panels through a random coefficient model, estimated with a Bayesian approach. The proposed model allows testing theories about network dynamics, social influence, and interdependence of multiple networks. It is illustrated by a study of the dynamic interdependence of friendship networks and minor delinquency. Data were available for 126 classrooms in the first year of secondary school, of which 82 were used, containing relatively few missing data points and having not too much network turnover.
Affiliation(s)
- Johan Koskinen
  - University of Stockholm, Stockholm, Sweden
  - University of Melbourne, Melbourne, Australia
- Tom A B Snijders
  - Address for correspondence: Tom A.B. Snijders, Department of Sociology, Grote Kruisstraat 2/1, 9712 TS Groningen, The Netherlands.
6. Herrera JP, Moody J, Nunn CL. Predicting primate-parasite associations using exponential random graph models. J Anim Ecol 2023; 92:710-722. [PMID: 36633380 DOI: 10.1111/1365-2656.13883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2021] [Accepted: 12/07/2022] [Indexed: 01/13/2023]
Abstract
Ecological associations between hosts and parasites are influenced by host exposure and susceptibility to parasites, and by parasite traits, such as transmission mode. Advances in network analysis allow us to answer questions about the causes and consequences of traits in ecological networks in ways that could not be addressed in the past. We used a network-based framework (exponential random graph models or ERGMs) to investigate the biogeographic, phylogenetic and ecological characteristics of hosts and parasites that affect the probability of interactions among nonhuman primates and their parasites. Parasites included arthropods, bacteria, fungi, protozoa, viruses and helminths. We investigated existing hypotheses, along with new predictors and an expanded host-parasite database that included 213 primate nodes, 763 parasite nodes and 2319 edges among them. Analyses also investigated phylogenetic relatedness, sampling effort and spatial overlap among hosts. In addition to supporting some previous findings, our ERGM approach demonstrated that more threatened hosts had fewer parasites, and notably, that this effect was independent of hosts also having a smaller geographic range. Despite having fewer parasites, threatened host species shared more parasites with other hosts, consistent with loss of specialist parasites and threat arising from generalist parasites that can be maintained in other, non-threatened hosts. Viruses, protozoa and helminths had broader host ranges than bacteria or fungi, and parasites that infect non-primates had a higher probability of infecting more primate species. The value of the ERGM approach for investigating the processes structuring host-parasite networks provided a more complete view on the biogeographic, phylogenetic and ecological traits that influence parasite species richness and parasite sharing among hosts.
The results supported some previous analyses and revealed new associations that warrant future research, thus revealing how hosts and parasites interact to form ecological networks.
Affiliation(s)
- James P Herrera
  - Duke Lemur Center SAVA Conservation, Duke University, Durham, North Carolina, USA
- James Moody
  - Department of Sociology, Duke University, Durham, North Carolina, USA
- Charles L Nunn
  - Department of Evolutionary Anthropology, Duke University, Durham, North Carolina, USA
  - Duke Global Health Institute, Duke University, Durham, North Carolina, USA
7. Butts CT. Continuous Time Graph Processes with Known ERGM Equilibria: Contextual Review, Extensions, and Synthesis. The Journal of Mathematical Sociology 2023; 48:129-171. [PMID: 38681800 PMCID: PMC11043653 DOI: 10.1080/0022250x.2023.2180001] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/14/2022] [Accepted: 10/17/2022] [Indexed: 05/01/2024]
Abstract
Graph processes that unfold in continuous time are of obvious theoretical and practical interest. Particularly useful are those whose long-term behavior converges to a graph distribution of known form. Here, we review some of the conditions for such convergence, and provide examples of novel and/or known processes that do so. These include subfamilies of the well-known stochastic actor oriented models, as well as continuum extensions of temporal and separable temporal exponential family random graph models. We also comment on some related threads in the broader work on network dynamics, which provide additional context for the continuous time case. Graph processes that unfold in continuous time are natural models for social network dynamics: able to directly represent changes in structure as they unfold (rather than, e.g. as snapshots at discrete intervals), such models not only offer the promise of capturing dynamics at high temporal resolution, but are also easily mapped to empirical data without the need to preselect a level of granularity with respect to which the dynamics are defined. Although relatively few general frameworks of this type have been extensively studied, at least one (the stochastic actor-oriented models, or SAOMs) is arguably among the most successful and widely used families of models in the social sciences (see, e.g., Snijders (2001); Steglich et al. (2010); Burk et al. (2007); Sijtsema et al. (2010); de la Haye et al. (2011); Weerman (2011); Schaefer and Kreager (2020) among many others). Work using other continuous time graph processes has also found applications both within (Koskinen and Snijders, 2007; Koskinen et al., 2015; Stadtfeld et al., 2017; Hoffman et al., 2020) and beyond (Grazioli et al., 2019; Yu et al., 2020) the social sciences, suggesting the potential for further advances.
Affiliation(s)
- Carter T Butts
  - Departments of Sociology, Statistics, Computer Science, and EECS and Institute for Mathematical Behavioral Sciences, University of California Irvine
8. Piperata BA, Scaggs SA, Dufour DL, Adams IK. Measuring food insecurity: An introduction to tools for human biologists and ecologists. Am J Hum Biol 2023; 35:e23821. [PMID: 36256611 DOI: 10.1002/ajhb.23821] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/08/2022] [Revised: 08/20/2022] [Accepted: 09/29/2022] [Indexed: 11/09/2022] Open
Abstract
OBJECTIVE: Food insecurity is a significant and growing concern undermining the wellbeing of 30% of the global population. Food in/security is a complex construct consisting of four dimensions: availability, access, utilization, and stability, making it challenging to measure. We provide a toolkit human biologists/ecologists can use to advance research on this topic. METHODS: We review the strengths and limitations of common tools used to measure food access and utilization, the two dimensions most proximate to people's lived experience, and emphasize tools that provide data needed to best link food security with human biological outcomes. We also discuss methods that provide contextual data human biologists/ecologists will find useful for study design, ensuring instrument validity, and improving data quality. RESULTS: Food access is principally measured using experience-based instruments that emphasize economic access. Social access, such as food sharing, is under-studied and we recommend using social network analysis to explore this dimension. In terms of utilization, emphasis has been on food choice measured as dietary diversity. Food preparation and intrahousehold distribution, also part of the utilization dimension, are less studied and standardized instruments for measuring both are lacking. The embodiment of food insecurity has focused on child growth, although a growing literature addresses adult mental wellbeing and chronic and infectious disease risk. CONCLUSIONS: We see the potential to expand outcomes to include reproductive and immune function, physical activity, and the gut microbiome. Human biologists/ecologists are well-positioned to advance understanding of the human health impacts of food insecurity and provide data to support intervention efforts.
Affiliation(s)
- Barbara A Piperata
  - Department of Anthropology, The Ohio State University, Columbus, Ohio, USA
- Shane A Scaggs
  - Department of Anthropology, The Ohio State University, Columbus, Ohio, USA
- Ingrid K Adams
  - Department of Extension and School of Health and Rehabilitation Sciences, The Ohio State University, Columbus, Ohio, USA
9. Diessner EM, Freites JA, Tobias DJ, Butts CT. Network Hamiltonian Models for Unstructured Protein Aggregates, with Application to γD-Crystallin. J Phys Chem B 2023; 127:685-697. [PMID: 36637342 PMCID: PMC10437096 DOI: 10.1021/acs.jpcb.2c07672] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/14/2023]
Abstract
Network Hamiltonian models (NHMs) are a framework for topological coarse-graining of protein-protein interactions, in which each node corresponds to a protein, and edges are drawn between nodes representing proteins that are noncovalently bound. Here, this framework is applied to aggregates of γD-crystallin, a structural protein of the eye lens implicated in cataract disease. The NHMs in this study are generated from atomistic simulations of equilibrium distributions of wild-type and the cataract-causing variant W42R in solution, performed by Wong, E. K.; Prytkova, V.; Freites, J. A.; Butts, C. T.; Tobias, D. J. Molecular Mechanism of Aggregation of the Cataract-Related γD-Crystallin W42R Variant from Multiscale Atomistic Simulations. Biochemistry 2019, 58 (35), 3691-3699. Network models are shown to successfully reproduce the aggregate size and structure observed in the atomistic simulation, and provide information about the transient protein-protein interactions therein. The system size is scaled from the original 375 monomers to a system of 10000 monomers, revealing a lowering of the upper tail of the aggregate size distribution of the W42R variant. Extrapolation to higher and lower concentrations is also performed. These results provide an example of the utility of NHMs for coarse-grained simulation of protein systems, as well as their ability to scale to large system sizes and high concentrations, reducing computational costs while retaining topological information about the system.
Affiliation(s)
- Elizabeth M Diessner
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- J Alfredo Freites
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Douglas J Tobias
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Carter T Butts
  - Departments of Sociology, Statistics, Computer Science, and EECS, University of California, Irvine, California 92697, United States
10. Stewart JR. On the time to identify the nodes in a random graph. Stat Probab Lett 2023. [DOI: 10.1016/j.spl.2023.109779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/12/2023]
11. Blackburn B, Handcock MS. Practical Network Modeling via Tapered Exponential-family Random Graph Models. J Comput Graph Stat 2022; 32:388-401. [PMID: 37608920 PMCID: PMC10441622 DOI: 10.1080/10618600.2022.2116444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2021] [Accepted: 08/17/2022] [Indexed: 10/17/2022]
Abstract
Exponential-family Random Graph Models (ERGMs) have long been at the forefront of the analysis of relational data. The exponential-family form allows complex network dependencies to be represented. Models in this class are interpretable, flexible and have a strong theoretical foundation. The availability of powerful user-friendly open-source software allows broad accessibility and use. However, ERGMs sometimes suffer from a serious condition known as near-degeneracy, in which the model exhibits unrealistic probabilistic behavior or a severe lack-of-fit to real network data. Recently, Fellows and Handcock (2017) proposed a new model class, the Tapered ERGM, which circumvents the issue of near-degeneracy while maintaining the desirable features of ERGMs. However, the question of how to determine the proper amount of tapering needed for any model was heretofore left unanswered. This paper develops a new methodology for how to determine the necessary level of tapering and as such provides a new approach to inference for the Tapered ERGM class. Noting that a Tapered ERGM can always be made non-degenerate, we offer data-driven approaches for determining the amount of tapering necessary. The mean-value parameter estimates are unaffected by tapering, and we show that the natural parameter estimates are numerically weakly varying by the level of tapering. We then apply the Tapered ERGM to two published networks to demonstrate its effectiveness in cases where typical ERGMs fail and present the case for Tapered ERGMs replacing ERGMs entirely.
Affiliation(s)
- Bart Blackburn
  - University of California, Los Angeles, Statistics, Los Angeles, United States
- Mark S Handcock
  - University of California, Los Angeles, Statistics, Los Angeles, United States
12. Social Support and Network Formation in a Small-Scale Horticulturalist Population. Sci Data 2022; 9:570. [PMID: 36109560 PMCID: PMC9477840 DOI: 10.1038/s41597-022-01516-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2021] [Accepted: 06/29/2022] [Indexed: 11/11/2022] Open
Abstract
Evolutionary studies of cooperation in traditional human societies suggest that helping family and responding in kind when helped are the primary mechanisms for informally distributing resources vital to day-to-day survival (e.g., food, knowledge, money, childcare). However, these studies generally rely on forms of regression analysis that disregard complex interdependences between aid, resulting in the implicit assumption that kinship and reciprocity drive the emergence of entire networks of supportive social bonds. Here I evaluate this assumption using individual-oriented simulations of network formation (i.e., Stochastic Actor-Oriented Models). Specifically, I test standard predictions of cooperation derived from the evolutionary theories of kin selection and reciprocal altruism alongside well-established sociological predictions around the self-organisation of asymmetric relationships. Simulations are calibrated to exceptional public data on genetic relatedness and the provision of tangible aid amongst all 108 adult residents of a village of indigenous horticulturalists in Nicaragua (11,556 ordered dyads). Results indicate that relatedness and reciprocity are markedly less important to whom one helps compared to the supra-dyadic arrangement of the tangible aid network itself.
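The dyad count quoted above follows directly from the village size: with 108 residents, every ordered (sender, receiver) pair of distinct people is one potential aid tie. A short sketch of that arithmetic, with variable names chosen here for illustration:

```python
from itertools import permutations

n_residents = 108

# An ordered dyad is a (sender, receiver) pair of distinct residents.
n_ordered_dyads = n_residents * (n_residents - 1)

# Cross-check the closed form by explicit enumeration.
enumerated = sum(1 for _ in permutations(range(n_residents), 2))

print(n_ordered_dyads, enumerated)  # prints 11556 11556
```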
13. Yin F, Butts CT. Highly scalable maximum likelihood and conjugate Bayesian inference for ERGMs on graph sets with equivalent vertices. PLoS One 2022; 17:e0273039. [PMID: 36018834 PMCID: PMC9417041 DOI: 10.1371/journal.pone.0273039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2021] [Accepted: 08/02/2022] [Indexed: 11/18/2022] Open
Abstract
The exponential family random graph modeling (ERGM) framework provides a highly flexible approach for the statistical analysis of networks (i.e., graphs). As ERGMs with dyadic dependence involve normalizing factors that are extremely costly to compute, practical strategies for ERGMs inference generally employ a variety of approximations or other workarounds. Markov Chain Monte Carlo maximum likelihood (MCMC MLE) provides a powerful tool to approximate the maximum likelihood estimator (MLE) of ERGM parameters, and is generally feasible for typical models on single networks with as many as a few thousand nodes. MCMC-based algorithms for Bayesian analysis are more expensive, and high-quality answers are challenging to obtain on large graphs. For both strategies, extension to the pooled case—in which we observe multiple networks from a common generative process—adds further computational cost, with both time and memory scaling linearly in the number of graphs. This becomes prohibitive for large networks, or cases in which large numbers of graph observations are available. Here, we exploit some basic properties of the discrete exponential families to develop an approach for ERGM inference in the pooled case that (where applicable) allows an arbitrarily large number of graph observations to be fit at no additional computational cost beyond preprocessing the data itself. Moreover, a variant of our approach can also be used to perform Bayesian inference under conjugate priors, again with no additional computational cost in the estimation phase. The latter can be employed either for single graph observations, or for observations from graph sets. As we show, the conjugate prior is easily specified, and is well-suited to applications such as regularization. Simulation studies show that the pooled method leads to estimates with good frequentist properties, and posterior estimates under the conjugate prior are well-behaved. 
We demonstrate the usefulness of our approach with applications to pooled analysis of brain functional connectivity networks and to replicated x-ray crystal structures of hen egg-white lysozyme.
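To illustrate why the normalizing factor mentioned in this abstract is costly, the sketch below computes it exactly for a toy ERGM by summing over every graph on a handful of nodes. The choice of edge and triangle statistics, the parameter values, and the function name `ergm_distribution` are assumptions made for illustration; this is a brute-force demonstration, not the pooled method the paper proposes.

```python
from itertools import combinations, product
from math import exp


def ergm_distribution(n, theta_edges, theta_triangles):
    """Exact ERGM probabilities on n labelled nodes by enumerating every graph."""
    dyads = list(combinations(range(n), 2))
    triples = list(combinations(range(n), 3))
    weights = {}
    for present in product((0, 1), repeat=len(dyads)):
        edges = {d for d, p in zip(dyads, present) if p}
        n_edges = len(edges)
        # A triangle needs all three of its dyads to be present.
        n_triangles = sum(
            1 for a, b, c in triples
            if (a, b) in edges and (a, c) in edges and (b, c) in edges
        )
        weights[present] = exp(theta_edges * n_edges + theta_triangles * n_triangles)
    z = sum(weights.values())  # normalizing factor: one term per possible graph
    return {g: w / z for g, w in weights.items()}, z


probs, z = ergm_distribution(4, theta_edges=-1.0, theta_triangles=0.5)
print(len(probs))  # prints 64
```

Four nodes give 2^6 = 64 graphs, which is trivial to enumerate, but ten nodes already give 2^45, which is why practical inference relies on MCMC or other approximations.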
Affiliation(s)
- Fan Yin
  - Department of Statistics, University of California at Irvine, Irvine, CA, United States of America
- Carter T. Butts
  - Department of Sociology, Statistics, Computer Science, and EECS and Institute for Mathematical Behavioral Sciences, University of California at Irvine, Irvine, CA, United States of America
14. Discussion to: Bayesian graphical models for modern biological applications by Y. Ni, V. Baladandayuthapani, M. Vannucci and F.C. Stingo. Stat Method Appl-Ger 2022. [DOI: 10.1007/s10260-021-00600-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
15. Krivitsky PN, Morris M, Bojanowski M. Impact of Survey Design on Estimation of Exponential-Family Random Graph Models from Egocentrically-Sampled Data. Social Networks 2022; 69:22-34. [PMID: 35400801 PMCID: PMC8993043 DOI: 10.1016/j.socnet.2020.10.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/03/2023]
Abstract
Egocentric sampling of networks selects a subset of nodes ("egos") and collects information from them on themselves and their immediate network neighbours ("alters"), leaving the rest of the nodes in the network unobserved. This design is popular because it is relatively inexpensive to implement and can be integrated into standard sample surveys. Recent methodological developments now make it possible to statistically analyse this type of network data with Exponential-family Random Graph Models (ERGMs). This provides a framework for principled statistical inference, and the fitted models can in turn be used to simulate complete networks of arbitrary size that are consistent with the observed sample data, allowing one to infer the distribution of whole-network properties generated by the observed egocentric network statistics. In this paper, we discuss how design choices for egocentric network studies impact statistical estimation and inference for ERGMs. The design choices include both measurement strategies (for ego and alter attributes, and for ego-alter and alter-alter ties) and sampling strategies (for egos and alters). We discuss the importance of harmonising measurement specifications across egos and alters, and conduct simulation studies to demonstrate the impact of sampling design on statistical inference, specifically stratified sampling and degree censoring.
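The sampling design described in the first sentence of this abstract can be sketched in a few lines: draw a subset of egos, record each ego's immediate neighbours, and treat everything else as unobserved. The toy network, the function name `egocentric_sample`, and the use of simple random sampling are all assumptions for illustration; the paper itself studies richer designs such as stratified sampling and degree censoring.

```python
import random


def egocentric_sample(adjacency, n_egos, seed=0):
    """Draw egos uniformly at random and report each ego's alters."""
    rng = random.Random(seed)
    egos = rng.sample(sorted(adjacency), n_egos)
    return {ego: sorted(adjacency[ego]) for ego in egos}


# Hypothetical undirected network stored as node -> set of neighbours.
network = {
    "a": {"b", "c"},
    "b": {"a"},
    "c": {"a", "d"},
    "d": {"c"},
    "e": set(),
}

sample = egocentric_sample(network, n_egos=2)

# Nodes seen either as an ego or as some ego's alter; the rest stay unobserved.
observed = set(sample) | {alter for alters in sample.values() for alter in alters}
unobserved = set(network) - observed
print(sample, sorted(unobserved))
```

Note that ties among alters are not collected here, which matches the simplest egocentric design; the abstract also mentions designs that measure alter-alter ties.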
Affiliation(s)
- Pavel N Krivitsky
  - School of Mathematics and Statistics, University of New South Wales, Sydney, NSW, Australia
- Martina Morris
  - Department of Statistics, University of Washington, Seattle, WA, USA
- Michał Bojanowski
  - Department of Quantitative Methods and Information Technology, Kozminski University, Warsaw, Poland
16. Approximate Bayesian computation using asymptotically normal point estimates. Comput Stat 2022. [DOI: 10.1007/s00180-022-01226-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
17. Clark DA, Handcock MS. Comparing the Real-World Performance of Exponential-family Random Graph Models and Latent Order Logistic Models for Social Network Analysis. Journal of the Royal Statistical Society. Series A, (Statistics in Society) 2022; 185:566-587. [PMID: 35756390 PMCID: PMC9214294 DOI: 10.1111/rssa.12788] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
Exponential-family Random Graph models (ERGM) are widely used in social network analysis when modelling data on the relations between actors. ERGMs are typically interpreted as a snapshot of a network at a given point in time or in a final state. The recently proposed Latent Order Logistic model (LOLOG) directly allows for a latent network formation process. We assess the real-world performance of these models when applied to typical networks modelled by researchers. Specifically, we model data from an ensemble of articles in the journal Social Networks with published ERGM fits, and compare the ERGM fit to a comparable LOLOG fit. We demonstrate that the LOLOG models are, in general, in qualitative agreement with the ERGM models, and provide at least as good a model fit. In addition they are typically faster and easier to fit to data, without the tendency for degeneracy that plagues ERGMs. Our results support the general use of LOLOG models in circumstances where ERGMs are considered.
Affiliation(s)
- Duncan A Clark
- University of California - Los Angeles, Los Angeles, USA
|
18
|
Schweinberger M, Bomiriya RP, Babkin S. A Semiparametric Bayesian Approach to Epidemics, with Application to the Spread of the Coronavirus MERS in South Korea in 2015. J Nonparametr Stat 2022; 34:628-662. [PMID: 36172077 PMCID: PMC9512273 DOI: 10.1080/10485252.2021.1972294] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/03/2023]
Abstract
We consider incomplete observations of stochastic processes governing the spread of infectious diseases through finite populations by way of contact. We propose a flexible semiparametric modeling framework with at least three advantages. First, it enables researchers to study the structure of a population contact network and its impact on the spread of infectious diseases. Second, it can accommodate short- and long-tailed degree distributions and detect potential superspreaders, who represent an important public health concern. Third, it addresses the important issue of incomplete data. Starting from first principles, we show when the incomplete-data generating process is ignorable for the purpose of Bayesian inference for the parameters of the population model. We demonstrate the semiparametric modeling framework by simulations and an application to the partially observed MERS epidemic in South Korea in 2015. We conclude with an extended discussion of open questions and directions for future research.
Affiliation(s)
- Michael Schweinberger
- Corresponding author. Address: Department of Statistics, University of Missouri, 146 Middlebush Hall, Columbia, MO 65211, USA. Phone: +1 713-348-2278. Fax: +1 713-348-5476
|
19
|
Bayesian model selection for high-dimensional Ising models, with applications to educational data. Comput Stat Data Anal 2022. [DOI: 10.1016/j.csda.2021.107325] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 02/06/2023]
|
20
|
Stivala A, Lomi A. Testing biological network motif significance with exponential random graph models. Applied Network Science 2021; 6:91. [PMID: 34841042 PMCID: PMC8608783 DOI: 10.1007/s41109-021-00434-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2021] [Accepted: 11/08/2021] [Indexed: 06/13/2023]
Abstract
Analysis of the structure of biological networks often uses statistical tests to establish the over-representation of motifs, which are thought to be important building blocks of such networks, related to their biological functions. However, there is disagreement as to the statistical significance of these motifs, and there are potential problems with standard methods for estimating this significance. Exponential random graph models (ERGMs) are a class of statistical model that can overcome some of the shortcomings of commonly used methods for testing the statistical significance of motifs. ERGMs were first introduced into the bioinformatics literature over 10 years ago but have had limited application to biological networks, possibly due to the practical difficulty of estimating model parameters. Advances in estimation algorithms now afford analysis of much larger networks in practical time. We illustrate the application of ERGM to both an undirected protein-protein interaction (PPI) network and directed gene regulatory networks. ERGM models indicate over-representation of triangles in the PPI network, and confirm results from previous research as to over-representation of transitive triangles (feed-forward loop) in an E. coli and a yeast regulatory network. We also confirm, using ERGMs, previous research showing that under-representation of the cyclic triangle (feedback loop) can be explained as a consequence of other topological features. Supplementary information: The online version contains supplementary material available at 10.1007/s41109-021-00434-y.
Affiliation(s)
- Alex Stivala
- Institute of Computational Science, Università della Svizzera italiana, Via Giuseppe Buffi 13, 6900 Lugano, Switzerland
| | - Alessandro Lomi
- Institute of Computational Science, Università della Svizzera italiana, Via Giuseppe Buffi 13, 6900 Lugano, Switzerland
- The University of Exeter Business School, Rennes Drive, Exeter, EX4 4PU UK
|
21
|
Online network monitoring. Stat Methods Appl 2021; 30:1337-1364. [PMID: 34539309 PMCID: PMC8440157 DOI: 10.1007/s10260-021-00589-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/23/2021] [Indexed: 11/02/2022]
Abstract
An important problem in network analysis is the online detection of anomalous behaviour. In this paper, we introduce a network surveillance method bringing together network modelling and statistical process control. Our approach is to apply multivariate control charts based on exponential smoothing and cumulative sums in order to monitor networks generated by temporal exponential random graph models (TERGM). The latter allows us to account for temporal dependence while simultaneously reducing the number of parameters to be monitored. The performance of the considered charts is evaluated by calculating the average run length and the conditional expected delay for both simulated and real data. To justify the decision of using the TERGM to describe network data, some measures of goodness of fit are inspected. We demonstrate the effectiveness of the proposed approach by an empirical application, monitoring daily flights in the United States to detect anomalous patterns.
|
22
|
Chen M, Kato K, Leng C. Analysis of networks via the sparse β‐model. J R Stat Soc Series B Stat Methodol 2021. [DOI: 10.1111/rssb.12444] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Affiliation(s)
- Mingli Chen
- Department of Economics, University of Warwick, Coventry, UK
| | - Kengo Kato
- Department of Statistics and Data Science, Cornell University, Ithaca, New York, USA
| | - Chenlei Leng
- Department of Statistics, University of Warwick, Coventry, UK
|
23
|
Babkin S, Stewart JR, Long X, Schweinberger M. Large-scale estimation of random graph models with local dependence. Comput Stat Data Anal 2020; 152:107029. [PMID: 32834264 PMCID: PMC7282802 DOI: 10.1016/j.csda.2020.107029] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/06/2019] [Revised: 03/11/2020] [Accepted: 06/04/2020] [Indexed: 01/23/2023]
Abstract
A class of random graph models is considered, combining features of exponential-family models and latent structure models, with the goal of retaining the strengths of both of them while reducing the weaknesses of each of them. An open problem is how to estimate such models from large networks. A novel approach to large-scale estimation is proposed, taking advantage of the local structure of such models for the purpose of local computing. The main idea is that random graphs with local dependence can be decomposed into subgraphs, which enables parallel computing on subgraphs and suggests a two-step estimation approach. The first step estimates the local structure underlying random graphs. The second step estimates parameters given the estimated local structure of random graphs. Both steps can be implemented in parallel, which enables large-scale estimation. The advantages of the two-step estimation approach are demonstrated by simulation studies with up to 10,000 nodes and an application to a large Amazon product recommendation network with more than 10,000 products.
Affiliation(s)
| | - Jonathan R. Stewart
- Department of Statistics, Florida State University, United States of America
| | - Xiaochen Long
- Department of Statistics, Rice University, United States of America
|
24
|
Yu Y, Grazioli G, Unhelkar MH, Martin RW, Butts CT. Network Hamiltonian models reveal pathways to amyloid fibril formation. Sci Rep 2020; 10:15668. [PMID: 32973286 PMCID: PMC7515878 DOI: 10.1038/s41598-020-72260-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/15/2020] [Accepted: 08/27/2020] [Indexed: 12/26/2022] Open
Abstract
Amyloid fibril formation is central to the etiology of a wide range of serious human diseases, such as Alzheimer's disease and prion diseases. Despite an ever growing collection of amyloid fibril structures found in the Protein Data Bank (PDB) and numerous clinical trials, therapeutic strategies remain elusive. One contributing factor to the lack of progress on this challenging problem is incomplete understanding of the mechanisms by which these locally ordered protein aggregates self-assemble in solution. Many current models of amyloid deposition diseases posit that the most toxic species are oligomers that form either along the pathway to forming fibrils or in competition with their formation, making it even more critical to understand the kinetics of fibrillization. A recently introduced topological model for aggregation based on network Hamiltonians is capable of recapitulating the entire process of amyloid fibril formation, beginning with thousands of free monomers and ending with kinetically accessible and thermodynamically stable amyloid fibril structures. The model can be parameterized to match the five topological classes encompassing all amyloid fibril structures so far discovered in the PDB. This paper introduces a set of network statistical and topological metrics for quantitative analysis and characterization of the fibrillization mechanisms predicted by the network Hamiltonian model. The results not only provide insight into different mechanisms leading to similar fibril structures, but also offer targets for future experimental exploration into the mechanisms by which fibrils form.
Affiliation(s)
- Yue Yu
- Department of Computer Science, University of California, Irvine, CA, 92697, USA
| | - Gianmarc Grazioli
- Department of Chemistry, San José State University, San Jose, CA, 95192, USA
| | - Megha H Unhelkar
- Department of Chemistry, University of California, Irvine, CA, 92697, USA
| | - Rachel W Martin
- Department of Chemistry, University of California, Irvine, CA, 92697, USA; Department of Molecular Biology and Biochemistry, University of California, Irvine, CA, 92697, USA
| | - Carter T Butts
- Department of Computer Science, University of California, Irvine, CA, 92697, USA; California Institute for Telecommunications and Information Technology, University of California, Irvine, CA, 92697, USA; Departments of Sociology, Statistics, and EECS, University of California, Irvine, CA, 92697, USA.
|
25
|
Krivitsky PN, Koehly LM, Marcum CS. Exponential-Family Random Graph Models for Multi-Layer Networks. Psychometrika 2020; 85:630-659. [PMID: 33025459 PMCID: PMC9478997 DOI: 10.1007/s11336-020-09720-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/20/2019] [Revised: 03/30/2020] [Indexed: 05/18/2023]
Abstract
Multi-layer networks arise when more than one type of relation is observed on a common set of actors. Modeling such networks within the exponential-family random graph (ERG) framework has been previously limited to special cases and, in particular, to dependence arising from just two layers. Extensions to ERGMs are introduced to address these limitations: Conway-Maxwell-Binomial distribution to model the marginal dependence among multiple layers; a "layer logic" language to translate familiar ERGM effects to substantively meaningful interactions of observed layers; and nondegenerate triadic and degree effects. The developments are demonstrated on two previously published datasets.
Affiliation(s)
- Pavel N Krivitsky
- School of Mathematics and Statistics, The University of New South Wales, Sydney, NSW, 2052, Australia.
|
26
|
Stivala A, Robins G, Lomi A. Exponential random graph model parameter estimation for very large directed networks. PLoS One 2020; 15:e0227804. [PMID: 31978150 PMCID: PMC6980401 DOI: 10.1371/journal.pone.0227804] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 04/22/2019] [Accepted: 12/31/2019] [Indexed: 12/18/2022] Open
Abstract
Exponential random graph models (ERGMs) are widely used for modeling social networks observed at one point in time. However, the computational difficulty of ERGM parameter estimation has limited the practical application of this class of models to relatively small networks, up to a few thousand nodes at most, with usually only a few hundred nodes or fewer. In the case of undirected networks, snowball sampling can be used to find ERGM parameter estimates of larger networks via network samples, and recently published improvements in ERGM network distribution sampling and ERGM estimation algorithms have allowed ERGM parameter estimates of undirected networks with over one hundred thousand nodes to be made. However, the implementations of these algorithms to date have been limited in their scalability, and also restricted to undirected networks. Here we describe an implementation of the recently published Equilibrium Expectation (EE) algorithm for ERGM parameter estimation of large directed networks. We test it on some simulated networks, and demonstrate its application to an online social network with over 1.6 million nodes.
Affiliation(s)
- Alex Stivala
- Institute of Computational Science, Università della Svizzera italiana, Lugano, Ticino, Switzerland
- Centre for Transformative Innovation, Swinburne University of Technology, Melbourne, Victoria, Australia
| | - Garry Robins
- Melbourne School of Psychological Sciences, The University of Melbourne, Melbourne, Victoria, Australia
| | - Alessandro Lomi
- Institute of Computational Science, Università della Svizzera italiana, Lugano, Ticino, Switzerland
- The University of Exeter Business School, The University of Exeter, Exeter, Devon, United Kingdom
|