Dempsey W, Oselio B, Hero A. Hierarchical network models for exchangeable structured interaction processes.
J Am Stat Assoc 2022;117:2056-2073. [PMID: 36908312 PMCID: PMC10005504 DOI: 10.1080/01621459.2021.1896526]
[Indexed: 10/22/2022]
Abstract
Network data often arise via a series of structured interactions among a population of constituent elements. E-mail exchanges, for example, have a single sender followed by potentially multiple receivers. Scientific articles, on the other hand, may have multiple subject areas and multiple authors. We introduce a statistical model, termed the Pitman-Yor hierarchical vertex components model (PY-HVCM), that is well suited for structured interaction data. The proposed PY-HVCM effectively models complex relational data by partial pooling of local information via a latent, shared population-level distribution. The PY-HVCM is a canonical example of hierarchical vertex components models, a subfamily of models for exchangeable structured interaction-labeled networks, i.e., networks invariant to interaction relabeling. Theoretical analysis and supporting simulations provide clear model interpretation, and establish global sparsity and a power law degree distribution. A computationally tractable Gibbs sampling algorithm is derived for inferring sparsity and power law properties of complex networks. We demonstrate the model on both the Enron e-mail dataset and an ArXiv dataset, showing goodness of fit of the model via posterior predictive validation.
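The power law behavior mentioned in the abstract is a known property of the Pitman-Yor process in general. The sketch below is not the PY-HVCM and not the authors' Gibbs sampler; it is a minimal, illustrative simulation of the standard two-parameter Chinese restaurant process, assuming a discount `alpha` in [0, 1) and a concentration `theta` > 0. The function name `pitman_yor_crp` and all parameter values are hypothetical choices made for illustration. With `alpha` > 0 the number of occupied tables grows roughly polynomially in the number of customers, whereas `alpha` = 0 (the Dirichlet process case) gives only logarithmic growth.

```python
import random


def pitman_yor_crp(n, alpha, theta, seed=0):
    """Seat n customers by the two-parameter Chinese restaurant process.

    Illustrative sketch only (hypothetical helper, not from the paper).
    Returns the list of table sizes after n customers are seated.
    """
    if not (0.0 <= alpha < 1.0) or theta <= 0.0:
        raise ValueError("this sketch assumes 0 <= alpha < 1 and theta > 0")
    rng = random.Random(seed)
    counts = []  # counts[j] = number of customers at table j
    for i in range(n):  # i customers are already seated
        k = len(counts)
        # Unnormalized weights: new table has theta + alpha * k,
        # existing table j has counts[j] - alpha; they sum to theta + i.
        u = rng.random() * (theta + i)
        if u < theta + alpha * k:
            counts.append(1)  # open a new table
            continue
        u -= theta + alpha * k
        for j, c in enumerate(counts):
            u -= c - alpha
            if u < 0.0:
                counts[j] += 1
                break
        else:
            counts[-1] += 1  # guard against floating-point round-off
    return counts


if __name__ == "__main__":
    for a in (0.0, 0.5):
        tables = pitman_yor_crp(2000, alpha=a, theta=1.0, seed=1)
        print(f"alpha={a}: {len(tables)} tables, largest has {max(tables)} customers")
```

Comparing the two runs shows many more (mostly small) tables when `alpha` is positive, which is the heavy-tailed pattern that motivates Pitman-Yor priors for sparse, power law network data.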