1
|
Graph alignment exploiting the spatial organization improves the similarity of brain networks. Hum Brain Mapp 2024; 45:e26554. [PMID: 38224543 PMCID: PMC10789220 DOI: 10.1002/hbm.26554] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Revised: 10/19/2023] [Accepted: 11/22/2023] [Indexed: 01/17/2024] Open
Abstract
Every brain is unique, having its structural and functional organization shaped by both genetic and environmental factors over the course of its development. Brain imaging studies tend to produce results by averaging across a group of subjects, under the common assumption that it is possible to subdivide the cortex into homogeneous areas while maintaining a correspondence across subjects. We investigate this assumption: can the structural properties of a specific region of an atlas be assumed to be the same across subjects? This question is addressed by looking at the network representation of the brain, with nodes corresponding to brain regions and edges to their structural relationships. Using an unsupervised graph matching strategy, we align the structural connectomes of a set of healthy subjects, considering parcellations of different granularity, to understand the connectivity misalignment between regions. First, we compare the obtained permutations with four different algorithm initializations: Spatial Adjacency, Identity, Barycenter, and Random. Our results suggest that applying an alignment strategy improves the similarity across subjects when the number of parcels is above 100 and when using Spatial Adjacency and Identity initialization (the most plausible priors). Second, we characterize the obtained permutations, revealing that the majority of permutations happen between neighboring parcels. Lastly, we study the spatial distribution of the permutations. By visualizing the results on the cortex, we observe no clear spatial patterns in the permutations, and regions across the cortex are mostly permuted with first- and second-order neighbors.
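The alignment step described in this abstract can be sketched as a graph matching problem. This is an illustration only: the abstract does not name a solver, so the use of SciPy's `quadratic_assignment` with the FAQ method is an assumption, and the toy connectomes, the `align` helper, and the mapping of the Identity, Barycenter, and Random initializations onto the `P0` option are hypothetical. The Spatial Adjacency initialization is left out because it would need parcel coordinates.

```python
import numpy as np
from scipy.optimize import quadratic_assignment

rng = np.random.default_rng(0)
n = 12  # toy number of parcels (hypothetical)

# Toy weighted, symmetric "connectome" A, and a copy B with shuffled node labels.
A = rng.random((n, n))
A = (A + A.T) / 2
np.fill_diagonal(A, 0)
shuffle = rng.permutation(n)
B = A[np.ix_(shuffle, shuffle)]


def align(A, B, P0):
    """Approximately match the nodes of B to those of A with FAQ.

    P0 is the initial doubly stochastic matrix: an array, "barycenter",
    or "randomized" (the values SciPy documents for this option).
    """
    res = quadratic_assignment(
        A, B, method="faq", options={"maximize": True, "P0": P0, "rng": 0}
    )
    perm = res.col_ind
    return perm, B[np.ix_(perm, perm)]


# Assumed correspondence between the abstract's initializations and P0.
inits = {"identity": np.eye(n), "barycenter": "barycenter", "random": "randomized"}
results = {name: align(A, B, P0) for name, P0 in inits.items()}

for name, (perm, B_aligned) in results.items():
    # Frobenius distance between A and the relabeled B (0 means perfect alignment).
    print(name, np.linalg.norm(A - B_aligned))
```

FAQ is an approximate solver, so different initializations may end at different permutations; comparing the printed distances is one simple way to see how much the starting point matters.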
|
2
|
Bayesian classification, anomaly detection, and survival analysis using network inputs with application to the microbiome. Ann Appl Stat 2023. [DOI: 10.1214/22-aoas1623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
|
3
|
Extracting brain disease-related connectome subgraphs by adaptive dense subgraph discovery. Biometrics 2022; 78:1566-1578. [PMID: 34374075 PMCID: PMC10396394 DOI: 10.1111/biom.13537] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/25/2020] [Revised: 07/23/2021] [Accepted: 07/28/2021] [Indexed: 12/30/2022]
Abstract
Group-level brain connectome analysis has attracted increasing interest in neuropsychiatric research with the goal of identifying connectomic subnetworks (subgraphs) that are systematically associated with brain disorders. However, extracting disease-related subnetworks from the whole brain connectome has been challenging, because no prior knowledge is available regarding the sizes and locations of the subnetworks. In addition, neuroimaging data are often mixed with substantial noise that can further obscure informative subnetwork detection. We propose a likelihood-based adaptive dense subgraph discovery (ADSD) model to extract disease-related subgraphs from the group-level whole brain connectome data. Our method is robust to both false positive and false negative errors of edge-wise inference and thus can lead to a more accurate discovery of latent disease-related connectomic subnetworks. We develop computationally efficient algorithms to implement the novel ADSD objective function and derive theoretical results to guarantee the convergence properties. We apply the proposed approach to a brain fMRI study for schizophrenia research and identify well-organized and biologically meaningful subnetworks that exhibit schizophrenia-related salience network centered connectivity abnormality. Analysis of synthetic data also demonstrates the superior performance of the ADSD method for latent subnetwork detection in comparison with existing methods in various settings.
|
4
|
Outlier detection for multi-network data. Bioinformatics 2022; 38:4011-4018. [PMID: 35762974 PMCID: PMC9890313 DOI: 10.1093/bioinformatics/btac431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2022] [Revised: 05/21/2022] [Accepted: 06/27/2022] [Indexed: 02/04/2023] Open
Abstract
MOTIVATION: It has become routine in neuroscience studies to measure brain networks for different individuals using neuroimaging. These networks are typically expressed as adjacency matrices, with each cell containing a summary of connectivity between a pair of brain regions. There is an emerging statistical literature describing methods for the analysis of such multi-network data in which nodes are common across networks but the edges vary. However, there has been essentially no consideration of the important problem of outlier detection. In particular, for certain subjects, the neuroimaging data are of such poor quality that the network cannot be reliably reconstructed. For such subjects, the resulting adjacency matrix may be mostly zero or exhibit a bizarre pattern not consistent with a functioning brain. These outlying networks may serve as influential points, contaminating subsequent statistical analyses. We propose a simple Outlier DetectIon for Networks (ODIN) method relying on an influence measure under a hierarchical generalized linear model for the adjacency matrices. An efficient computational algorithm is described, and ODIN is illustrated through simulations and an application to data from the UK Biobank. RESULTS: ODIN was successful in identifying moderate to extreme outliers. Removing such outliers can significantly change inferences in downstream applications. AVAILABILITY AND IMPLEMENTATION: ODIN has been implemented in both Python and R and these implementations along with other code are publicly available at github.com/pritamdey/ODIN-python and github.com/pritamdey/ODIN-r, respectively. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
|
5
|
Power-enhanced simultaneous test of high-dimensional mean vectors and covariance matrices with application to gene-set testing. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2022.2061354] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/18/2022]
|
6
|
Manifold valued data analysis of samples of networks, with applications in corpus linguistics. Ann Appl Stat 2022. [DOI: 10.1214/21-aoas1480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
|
7
|
Valid two‐sample graph testing via optimal transport Procrustes and multiscale graph correlation with applications in connectomics. Stat (Int Stat Inst) 2022. [DOI: 10.1002/sta4.429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
|
8
|
Hypothesis testing for populations of networks. Commun Stat Theory Methods 2021. [DOI: 10.1080/03610926.2021.1977961] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|
9
|
Abstract
The problem of predicting links in large networks is an important task in a variety of practical applications, including social sciences, biology and computer security. In this paper, statistical techniques for link prediction based on the popular random dot product graph model are carefully presented, analysed and extended to dynamic settings. Motivated by a practical application in cyber-security, this paper demonstrates that random dot product graphs not only represent a powerful tool for inferring differences between multiple networks, but are also efficient for prediction purposes and for understanding the temporal evolution of the network. The probabilities of links are obtained by fusing information at two stages: spectral methods provide estimates of latent positions for each node, and time series models are used to capture temporal dynamics. In this way, traditional link prediction methods, usually based on decompositions of the entire network adjacency matrix, are extended using temporal information. The methods presented in this article are applied to a number of simulated and real-world graphs, showing promising results.
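The spectral stage mentioned in this abstract (latent positions estimated from the adjacency matrix, links scored by dot products) can be sketched for a single snapshot as follows. This is a minimal illustration under assumptions: the function name, the toy two-community graph, and its parameters are hypothetical, and the time series stage is not covered.

```python
import numpy as np


def rdpg_link_probabilities(A, d):
    """Estimate link probabilities from a symmetric adjacency matrix A.

    Adjacency spectral embedding: keep the d largest eigenpairs, scale the
    eigenvectors by the square roots of the eigenvalues to get latent
    positions X, and score each pair by the dot product of its positions.
    """
    eigvals, eigvecs = np.linalg.eigh(A)
    top = np.argsort(eigvals)[::-1][:d]
    scale = np.sqrt(np.clip(eigvals[top], 0, None))  # guard against negatives
    X = eigvecs[:, top] * scale
    return np.clip(X @ X.T, 0, 1)


# Toy two-community graph (hypothetical parameters).
rng = np.random.default_rng(1)
n, half = 60, 30
labels = np.repeat([0, 1], half)
P = np.where(labels[:, None] == labels[None, :], 0.6, 0.1)
upper = np.triu(rng.random((n, n)) < P, k=1)
A = (upper | upper.T).astype(float)

P_hat = rdpg_link_probabilities(A, d=2)
same = labels[:, None] == labels[None, :]
# Average estimated probability within and between communities.
print(P_hat[same].mean(), P_hat[~same].mean())
```

In a dynamic setting one could apply the same embedding to each snapshot and then model the resulting scores over time; how the paper combines the two stages is not specified in the abstract.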
|
10
|
Statistical and Machine Learning Link Selection Methods for Brain Functional Networks: Review and Comparison. Brain Sci 2021; 11:brainsci11060735. [PMID: 34073098 PMCID: PMC8227272 DOI: 10.3390/brainsci11060735] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/22/2021] [Revised: 05/24/2021] [Accepted: 05/28/2021] [Indexed: 11/28/2022] Open
Abstract
Network-based representations have introduced a revolution in neuroscience, expanding the understanding of the brain from the activity of individual regions to the interactions between them. This augmented network view comes at the cost of high dimensionality, which hinders both our capacity to decipher the main mechanisms behind pathologies and the significance of any statistical and/or machine learning task used in processing this data. A link selection method, which removes the connections that are irrelevant in a given scenario, is an obvious solution that provides improved utilization of these network representations. In this contribution we review a large set of statistical and machine learning link selection methods and evaluate them on real brain functional networks. Results indicate that most methods perform in a qualitatively similar way, with NBS (Network Based Statistics) winning in terms of quantity of retained information, AnovaNet in terms of stability and ExT (Extra Trees) in terms of lower computational cost. While machine learning methods are conceptually more complex than statistical ones, they do not yield a clear advantage. At the same time, the high heterogeneity in the set of links retained by each method suggests that they offer complementary views of the data. The implications of these results in neuroscience tasks are finally discussed.
|
11
|
Preventing Failures by Dataset Shift Detection in Safety-Critical Graph Applications. Front Artif Intell 2021; 4:589632. [PMID: 34179767 PMCID: PMC8223254 DOI: 10.3389/frai.2021.589632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2020] [Accepted: 04/26/2021] [Indexed: 11/13/2022] Open
Abstract
Dataset shift refers to the problem where the input data distribution may change over time (e.g., between training and test stages). Since this can be a critical bottleneck in several safety-critical applications such as healthcare, drug-discovery, etc., dataset shift detection has become an important research issue in machine learning. Though several existing efforts have focused on image/video data, applications with graph-structured data have not received sufficient attention. Therefore, in this paper, we investigate the problem of detecting shifts in graph structured data through the lens of statistical hypothesis testing. Specifically, we propose a practical two-sample test based approach for shift detection in large-scale graph structured data. Our approach is very flexible in that it is suitable for both undirected and directed graphs, and eliminates the need for equal sample sizes. Using empirical studies, we demonstrate the effectiveness of the proposed test in detecting dataset shifts. We also corroborate these findings using real-world datasets, characterized by directed graphs and a large number of nodes.
|
12
|
|
13
|
Abstract
Network (graph) data analysis is a popular research topic in statistics and machine learning. In application, one is frequently confronted with graph two-sample hypothesis testing where the goal is to test the difference between two graph populations. Several statistical tests have been devised for this purpose in the context of binary graphs. However, many of the practical networks are weighted and existing procedures cannot be directly applied to weighted graphs. In this paper, we study the weighted graph two-sample hypothesis testing problem and propose a practical test statistic. We prove that the proposed test statistic converges in distribution to the standard normal distribution under the null hypothesis and analyze its power theoretically. The simulation study shows that the proposed test has satisfactory performance and it substantially outperforms the existing counterpart in the binary graph case. A real data application is provided to illustrate the method.
|
14
|
Multiscale null hypothesis testing for network‐valued data: Analysis of brain networks of patients with autism. J R Stat Soc Ser C Appl Stat 2021. [DOI: 10.1111/rssc.12463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
|
15
|
|
16
|
Re-visiting Riemannian geometry of symmetric positive definite matrices for the analysis of functional connectivity. Neuroimage 2020; 225:117464. [PMID: 33075555 DOI: 10.1016/j.neuroimage.2020.117464] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 01/23/2020] [Revised: 08/04/2020] [Accepted: 10/12/2020] [Indexed: 12/20/2022] Open
Abstract
Common representations of functional networks of resting state fMRI time series, including covariance, precision, and cross-correlation matrices, belong to the family of symmetric positive definite (SPD) matrices, which forms a special mathematical structure called a Riemannian manifold. Due to its geometric properties, the analysis and operation of functional connectivity matrices may well be performed on the Riemannian manifold of the SPD space. Analysis of functional networks on the SPD space takes account of all the pairwise interactions (edges) as a whole, which differs from the conventional rationale of considering edges as independent from each other. Despite its geometric characteristics, only a few studies have been conducted for functional network analysis on the SPD manifold and inference methods specialized for connectivity analysis on the SPD manifold are rarely found. The current study aims to show the significance of connectivity analysis on the SPD space and introduce inference algorithms on the SPD manifold, such as regression analysis of functional networks in association with behaviors, principal geodesic analysis, clustering, state transition analysis of dynamic functional networks and statistical tests for network equality on the SPD manifold. We applied the proposed methods to both simulated data and experimental resting state fMRI data from the human connectome project and argue the importance of analyzing functional networks under the SPD geometry. All the algorithms for numerical operations and inferences on the SPD manifold are implemented as a MATLAB library, called SPDtoolbox, for public use to expedite functional network analysis on the right geometry.
|
17
|
Modeling sparse longitudinal data on Riemannian manifolds. Biometrics 2020; 77:1328-1341. [PMID: 33034049 DOI: 10.1111/biom.13385] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/24/2020] [Revised: 07/14/2020] [Accepted: 09/15/2020] [Indexed: 11/28/2022]
Abstract
Modern data collection often entails longitudinal repeated measurements that assume values on a Riemannian manifold. Analyzing such longitudinal Riemannian data is challenging, because of both the sparsity of the observations and the nonlinear manifold constraint. Addressing this challenge, we propose an intrinsic functional principal component analysis for longitudinal Riemannian data. Information is pooled across subjects by estimating the mean curve with local Fréchet regression and smoothing the covariance structure of the linearized data on tangent spaces around the mean. Dimension reduction and imputation of the manifold-valued trajectories are achieved by utilizing the leading principal components and applying best linear unbiased prediction. We show that the proposed mean and covariance function estimates achieve state-of-the-art convergence rates. For illustration, we study the development of brain connectivity in a longitudinal cohort of Alzheimer's disease and normal participants by modeling the connectivity on the manifold of symmetric positive definite matrices with the affine-invariant metric. In a second illustration for irregularly recorded longitudinal emotion compositional data for unemployed workers, we show that the proposed method leads to nicely interpretable eigenfunctions and principal component scores. Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative database.
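For readers unfamiliar with the affine-invariant metric on symmetric positive definite matrices named in this abstract, the distance it induces can be computed from generalized eigenvalues. The sketch below is a standalone illustration, not code from the paper; the function name and the example matrices are hypothetical.

```python
import numpy as np
from scipy.linalg import eigvalsh


def affine_invariant_distance(A, B):
    """d(A, B) = sqrt(sum_i log(lambda_i)^2), with lambda_i the generalized
    eigenvalues solving B v = lambda A v (the eigenvalues of A^{-1} B)."""
    lam = eigvalsh(B, A)
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))


A = np.eye(2)
B = np.diag([np.e, np.e ** 2])
d_before = affine_invariant_distance(A, B)
print(d_before)  # sqrt(1 + 4) = sqrt(5)

# Invariance check: congruence by any invertible G leaves the distance unchanged.
G = np.array([[2.0, 1.0], [0.5, 3.0]])
d_after = affine_invariant_distance(G @ A @ G.T, G @ B @ G.T)
print(d_after)
```

The invariance under congruence is what makes this metric attractive for connectivity matrices: a common invertible linear change of coordinates applied to both matrices does not alter their distance.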
|
18
|
|
19
|
|
20
|
A random effects stochastic block model for joint community detection in multiple networks with applications to neuroimaging. Ann Appl Stat 2020. [DOI: 10.1214/20-aoas1339] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/19/2022]
|
21
|
|
22
|
|
23
|
|
24
|
|
25
|
Abstract
Summary
Fréchet mean and variance provide a way of obtaining a mean and variance for metric space-valued random variables, and can be used for statistical analysis of data objects that lie in abstract spaces devoid of algebraic structure and operations. Examples of such data objects include covariance matrices, graph Laplacians of networks and univariate probability distribution functions. We derive a central limit theorem for the Fréchet variance under mild regularity conditions, using empirical process theory, and also provide a consistent estimator of the asymptotic variance. These results lead to a test for comparing $k$ populations of metric space-valued data objects in terms of Fréchet means and variances. We examine the finite-sample performance of this novel inference procedure through simulation studies on several special cases that include probability distributions and graph Laplacians, leading to a test for comparing populations of networks. The proposed approach has good finite-sample performance in simulations for different kinds of random objects. We illustrate the proposed methods by analysing data on mortality profiles of various countries and resting-state functional magnetic resonance imaging data.
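As a concrete instance of the Fréchet mean and variance discussed in this summary, consider graph Laplacians compared with the Frobenius metric. This choice of metric is an assumption made for simplicity (the summary does not fix one), and the helper names and toy graphs are hypothetical.

```python
import numpy as np


def laplacian(W):
    """Graph Laplacian L = D - W of a symmetric weight matrix W."""
    return np.diag(W.sum(axis=1)) - W


def frechet_mean_and_variance(laplacians):
    """Sample Fréchet mean and variance under the Frobenius metric.

    For this metric the minimizer of the mean squared distance is the
    entrywise average, which is again a graph Laplacian.
    """
    stack = np.stack(laplacians)
    mean = stack.mean(axis=0)
    sq_dists = ((stack - mean) ** 2).sum(axis=(1, 2))
    return mean, float(sq_dists.mean())


W1 = np.array([[0.0, 1.0], [1.0, 0.0]])  # single edge of weight 1
W2 = np.zeros((2, 2))                     # empty graph
mean, variance = frechet_mean_and_variance([laplacian(W1), laplacian(W2)])
print(mean)
print(variance)  # 1.0
```

With other metrics the minimizer generally has no closed form and must be found numerically; the Frobenius case is used here only because the mean reduces to an average.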
|
26
|
Abstract
While statistical analysis of a single network has received a lot of attention in recent years, with a focus on social networks, analysis of a sample of networks presents its own challenges which require a different set of analytic tools. Here we study the problem of classification of networks with labeled nodes, motivated by applications in neuroimaging. Brain networks are constructed from imaging data to represent functional connectivity between regions of the brain, and previous work has shown the potential of such networks to distinguish between various brain disorders, giving rise to a network classification problem. Existing approaches tend to either treat all edge weights as a long vector, ignoring the network structure, or focus on graph topology as represented by summary measures while ignoring the edge weights. Our goal is to design a classification method that uses both the individual edge information and the network structure of the data in a computationally efficient way, and that can produce a parsimonious and interpretable representation of differences in brain connectivity patterns between classes. We propose a graph classification method that uses edge weights as predictors but incorporates the network nature of the data via penalties that promote sparsity in the number of nodes, in addition to the usual sparsity penalties that encourage selection of edges. We implement the method via efficient convex optimization and provide a detailed analysis of data from two fMRI studies of schizophrenia.
|
27
|
Discussion: Object-Oriented Data Analysis, Power Metrics, and Graph Laplacians. J Am Stat Assoc 2019. [DOI: 10.1080/01621459.2019.1635477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
|
28
|
Connectal coding: discovering the structures linking cognitive phenotypes to individual histories. Curr Opin Neurobiol 2019; 55:199-212. [PMID: 31102987 DOI: 10.1016/j.conb.2019.04.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/17/2018] [Revised: 04/14/2019] [Accepted: 04/16/2019] [Indexed: 01/06/2023]
Abstract
Cognitive phenotypes characterize our memories, beliefs, skills, and preferences, and arise from our ancestral, developmental, and experiential histories. These histories are written into our brain structure through the building and modification of various brain circuits. Connectal coding, by way of analogy with neural coding, is the art, study, and practice of identifying the network structures that link cognitive phenomena to individual histories. We propose a formal statistical framework for connectal coding and demonstrate its utility in several applications spanning experimental modalities and phylogeny.
|
29
|
Multivariate Heteroscedasticity Models for Functional Brain Connectivity. Front Neurosci 2017; 11:696. [PMID: 29311777 PMCID: PMC5733000 DOI: 10.3389/fnins.2017.00696] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 12/27/2016] [Accepted: 11/27/2017] [Indexed: 01/21/2023] Open
Abstract
Functional brain connectivity is the co-occurrence of brain activity in different areas during resting and while doing tasks. The data of interest are multivariate timeseries measured simultaneously across brain parcels using resting-state fMRI (rfMRI). We analyze functional connectivity using two heteroscedasticity models. Our first model is low-dimensional and scales linearly in the number of brain parcels. Our second model scales quadratically. We apply both models to data from the Human Connectome Project (HCP) comparing connectivity between short and conventional sleepers. We find stronger functional connectivity in short than conventional sleepers in brain areas consistent with previous findings. This might be due to subjects falling asleep in the scanner. Consequently, we recommend the inclusion of average sleep duration as a covariate to remove unwanted variation in rfMRI studies. A power analysis using the HCP data shows that a sample size of 40 detects 50% of the connectivity at a false discovery rate of 20%. We provide implementations using R and the probabilistic programming language Stan.
|