1
Bass AJ, Bian S, Wingo AP, Wingo TS, Cutler DJ, Epstein MP. Identifying latent genetic interactions in genome-wide association studies using multiple traits. Genome Med 2024; 16:62. [PMID: 38664839 PMCID: PMC11044415 DOI: 10.1186/s13073-024-01329-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Accepted: 04/02/2024] [Indexed: 04/28/2024] Open
Abstract
The "missing" heritability of complex traits may be partly explained by genetic variants interacting with other genes or environments that are difficult to specify, observe, and detect. We propose a new kernel-based method called Latent Interaction Testing (LIT) to screen for genetic interactions that leverages pleiotropy from multiple related traits without requiring the interacting variable to be specified or observed. Using simulated data, we demonstrate that LIT increases power to detect latent genetic interactions compared to univariate methods. We then apply LIT to obesity-related traits in the UK Biobank and detect variants with interactive effects near known obesity-related genes (URL: https://CRAN.R-project.org/package=lit).
Affiliation(s)
- Andrew J Bass
- Department of Human Genetics, Emory University, Atlanta, GA, 30322, USA
- Shijia Bian
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, 30322, USA
- Aliza P Wingo
- Department of Psychiatry, Emory University, Atlanta, GA, 30322, USA
- Thomas S Wingo
- Department of Human Genetics, Emory University, Atlanta, GA, 30322, USA
- Department of Neurology, Emory University, Atlanta, GA, 30322, USA
- David J Cutler
- Department of Human Genetics, Emory University, Atlanta, GA, 30322, USA
- Michael P Epstein
- Department of Human Genetics, Emory University, Atlanta, GA, 30322, USA
2
Mbatchou J, McPeek MS. JASPER: fast, powerful, multitrait association testing in structured samples gives insight on pleiotropy in gene expression. bioRxiv 2023:2023.12.18.571948. [PMID: 38187553 PMCID: PMC10769254 DOI: 10.1101/2023.12.18.571948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Joint association analysis of multiple traits with multiple genetic variants can provide insight into genetic architecture and pleiotropy, improve trait prediction and increase power for detecting association. Furthermore, some traits are naturally high-dimensional, e.g., images, networks or longitudinally measured traits. Assessing significance for multitrait genetic association can be challenging, especially when the sample has population sub-structure and/or related individuals. Failure to adequately adjust for sample structure can lead to power loss and inflated type 1 error, and commonly used methods for assessing significance can work poorly with a large number of traits or be computationally slow. We developed JASPER, a fast, powerful, robust method for assessing significance of multitrait association with a set of genetic variants, in samples that have population sub-structure, admixture and/or relatedness. In simulations, JASPER has higher power, better type 1 error control, and faster computation than existing methods, with the power and speed advantage of JASPER increasing with the number of traits. JASPER is potentially applicable to a wide range of association testing applications, including for multiple disease traits, expression traits, image-derived traits and microbiome abundances. It allows for covariates, ascertainment and rare variants and is robust to phenotype model misspecification. We apply JASPER to analyze gene expression in the Framingham Heart Study, where, compared to alternative approaches, JASPER finds more significant associations, including several that indicate pleiotropic effects, some of which replicate previous results, while others have not previously been reported. Our results demonstrate the promise of JASPER for powerful multitrait analysis in structured samples.
Affiliation(s)
- Joelle Mbatchou
- Regeneron Genetics Center, Tarrytown, NY 10591, USA
- Department of Statistics, The University of Chicago, Chicago, IL 60637, USA
- Mary Sara McPeek
- Department of Statistics, The University of Chicago, Chicago, IL 60637, USA
- Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA
3
Wylie KP, Kluger BM, Medina LD, Holden SK, Kronberg E, Tregellas JR, Buard I. Hippocampal, basal ganglia and olfactory connectivity contribute to cognitive impairments in Parkinson's disease. Eur J Neurosci 2023; 57:511-526. [PMID: 36516060 PMCID: PMC9970048 DOI: 10.1111/ejn.15899] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/21/2022] [Revised: 12/06/2022] [Accepted: 12/12/2022] [Indexed: 12/15/2022]
Abstract
Cognitive impairment is increasingly recognized as a characteristic feature of Parkinson's disease (PD), yet relatively little is known about its underlying neurobiology. Previous investigations suggest that dementia in PD is associated with subcortical atrophy, but similar studies in PD with mild cognitive impairment have been mixed. Variability in cognitive phenotypes and diversity of PD symptoms suggest that a common neuropathological origin results in a multitude of impacts within the brain. These direct and indirect impacts of disease pathology can be investigated using network analysis. Functional connectivity, for instance, may be more sensitive than atrophy to decline in specific cognitive domains in the PD population. Fifty-eight participants with PD underwent a neuropsychological test battery and scanning with structural and resting state functional MRI in a comprehensive whole-brain association analysis. To investigate atrophy as a potential marker of impairment, structural gray matter atrophy was associated with cognitive scores in each cognitive domain using voxel-based morphometry. To investigate connectivity, large-scale networks were correlated with voxel time series and associated with cognitive scores using distance covariance. Structural atrophy was not associated with any cognitive domain, with the exception of visuospatial measures in primary sensory and motor cortices. In contrast, functional connectivity was associated with attention, executive function, language, learning and memory, visuospatial, and global cognition in the bilateral hippocampus, left putamen, olfactory cortex, and bilateral anterior temporal poles. These preliminary results suggest that cognitive domain-specific networks in PD are distinct from each other and could provide a network signature for different cognitive phenotypes.
Affiliation(s)
- Korey P. Wylie
- Department of Psychiatry, University of Colorado School of Medicine, Aurora, CO, USA
- Benzi M. Kluger
- Department of Neurology, University of Rochester Medical Center, Rochester, NY, USA
- Luis D. Medina
- Department of Psychology, University of Houston, Houston, TX, USA
- Samantha K. Holden
- Department of Neurology, University of Colorado School of Medicine, Aurora, CO, USA
- Eugene Kronberg
- Department of Psychiatry, University of Colorado School of Medicine, Aurora, CO, USA
- Department of Neurology, University of Colorado School of Medicine, Aurora, CO, USA
- Jason R. Tregellas
- Department of Psychiatry, University of Colorado School of Medicine, Aurora, CO, USA
- Research Service, Rocky Mountain Regional VA Medical Center, Aurora, CO, USA
- Isabelle Buard
- Department of Neurology, University of Colorado School of Medicine, Aurora, CO, USA
4
Edelmann D, Goeman J. A Regression Perspective on Generalized Distance Covariance and the Hilbert–Schmidt Independence Criterion. Stat Sci 2022. [DOI: 10.1214/21-sts841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Dominic Edelmann
- Dominic Edelmann is Postdoctoral Researcher, German Cancer Research Center, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
- Jelle Goeman
- Jelle Goeman is Professor, Leiden University Medical Center, Albinusdreef 2, 2333 ZA Leiden, Netherlands
5
Rudra P, Baxter R, Hsieh EWY, Ghosh D. Compositional Data Analysis using Kernels in mass cytometry data. Bioinformatics Advances 2022; 2:vbac003. [PMID: 35224501 PMCID: PMC8867823 DOI: 10.1093/bioadv/vbac003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/23/2021] [Revised: 12/06/2021] [Accepted: 01/12/2022] [Indexed: 01/27/2023]
Abstract
MOTIVATION Cell-type abundance data arising from mass cytometry experiments are compositional in nature. Classical association tests do not apply to the compositional data due to their non-Euclidean nature. Existing methods for analysis of cell type abundance data suffer from several limitations for high-dimensional mass cytometry data, especially when the sample size is small. RESULTS We proposed a new multivariate statistical learning methodology, Compositional Data Analysis using Kernels (CODAK), based on the kernel distance covariance (KDC) framework to test the association of the cell type compositions with important predictors (categorical or continuous) such as disease status. CODAK scales well for high-dimensional data and provides satisfactory performance for small sample sizes (n < 25). We conducted simulation studies to compare the performance of the method with existing methods of analyzing cell type abundance data from mass cytometry studies. The method is also applied to a high-dimensional dataset containing different subgroups of populations including Systemic Lupus Erythematosus (SLE) patients and healthy control subjects. AVAILABILITY AND IMPLEMENTATION CODAK is implemented using R. The codes and the data used in this manuscript are available on the web at http://github.com/GhoshLab/CODAK/. CONTACT prudra@okstate.edu. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Pratyaydipta Rudra
- Department of Statistics, Oklahoma State University, Stillwater, OK 74078, USA
- To whom correspondence should be addressed.
- Ryan Baxter
- Department of Immunology and Microbiology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Elena W Y Hsieh
- Department of Immunology and Microbiology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Department of Pediatrics, Section of Allergy and Immunology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Debashis Ghosh
- Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
6
Gong M, Liu P, Sciurba FC, Stojanov P, Tao D, Tseng GC, Zhang K, Batmanghelich K. Unpaired data empowers association tests. Bioinformatics 2021; 37:785-792. [PMID: 33070196 PMCID: PMC8098021 DOI: 10.1093/bioinformatics/btaa886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2020] [Revised: 09/07/2020] [Accepted: 10/05/2020] [Indexed: 11/25/2022] Open
Abstract
Motivation There is growing interest in the biomedical research community to incorporate retrospective data, available in healthcare systems, to shed light on associations between different biomarkers. Understanding the association between various types of biomedical data, such as genetic, blood biomarkers, imaging, etc. can provide a holistic understanding of human diseases. To formally test a hypothesized association between two types of data in Electronic Health Records (EHRs), one requires a substantial sample size with both data modalities to achieve a reasonable power. Current association test methods only allow using data from individuals who have both data modalities. Hence, researchers cannot take advantage of much larger EHR samples that includes individuals with at least one of the data types, which limits the power of the association test. Results We present a new method called the Semi-paired Association Test (SAT) that makes use of both paired and unpaired data. In contrast to classical approaches, incorporating unpaired data allows SAT to produce better control of false discovery and to improve the power of the association test. We study the properties of the new test theoretically and empirically, through a series of simulations and by applying our method on real studies in the context of Chronic Obstructive Pulmonary Disease. We are able to identify an association between the high-dimensional characterization of Computed Tomography chest images and several blood biomarkers as well as the expression of dozens of genes involved in the immune system. Availability and implementation Code is available on https://github.com/batmanlab/Semi-paired-Association-Test. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mingming Gong
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15206, USA; Department of Philosophy, Carnegie Mellon University, Pittsburgh, PA 15213, USA; School of Mathematics and Statistics, The University of Melbourne, Melbourne, VIC 3010, Australia
- Peng Liu
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15206, USA
- Frank C Sciurba
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15206, USA
- Petar Stojanov
- Department of Philosophy, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Dacheng Tao
- Australia School of Computer Science, The University of Sydney, Sydney, NSW 2006, Australia
- George C Tseng
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15206, USA
- Kun Zhang
- Department of Philosophy, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Kayhan Batmanghelich
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15206, USA
7
Zhu C, Zhang X, Yao S, Shao X. Distance-based and RKHS-based dependence metrics in high dimension. Ann Stat 2020. [DOI: 10.1214/19-aos1934] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/19/2022]
8
Solis-Lemus CR, Fischer ST, Todor A, Liu C, Leslie EJ, Cutler DJ, Ghosh D, Epstein MP. Leveraging Family History in Case-Control Analyses of Rare Variation. Genetics 2020; 214:295-303. [PMID: 31843756 PMCID: PMC7017020 DOI: 10.1534/genetics.119.302846] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2019] [Accepted: 12/10/2019] [Indexed: 11/18/2022] Open
Abstract
Standard methods for case-control association studies of rare variation often treat disease outcome as a dichotomous phenotype. However, both theoretical and experimental studies have demonstrated that subjects with a family history of disease can be enriched for risk variation relative to subjects without such history. Assuming family history information is available, this observation motivates the idea of replacing the standard dichotomous outcome variable used in case-control studies with a more informative ordinal outcome variable that distinguishes controls (0), sporadic cases (1), and cases with a family history (2), with the expectation that we should observe increasing number of risk variants with increasing category of the ordinal variable. To leverage this expectation, we propose a novel rare-variant association test that incorporates family history information based on our previous GAMuT framework for rare-variant association testing of multivariate phenotypes. We use simulated data to show that, when family history information is available, our new method outperforms standard rare-variant association methods, like burden and SKAT tests, that ignore family history. We further illustrate our method using a rare-variant study of cleft lip and palate.
Affiliation(s)
- S Taylor Fischer
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, 30329 Georgia
- Andrei Todor
- Department of Human Genetics, Emory University, Atlanta, 30030 Georgia
- Cuining Liu
- Department of Biostatistics and Informatics, University of Colorado, Aurora, 80045 Colorado
- David J Cutler
- Department of Human Genetics, Emory University, Atlanta, 30030 Georgia
- Debashis Ghosh
- Department of Biostatistics and Informatics, University of Colorado, Aurora, 80045 Colorado
- Michael P Epstein
- Department of Human Genetics, Emory University, Atlanta, 30030 Georgia
9
Holleman AM, Broadaway KA, Duncan R, Todor A, Almli LM, Bradley B, Ressler KJ, Ghosh D, Mulle JG, Epstein MP. Powerful and Efficient Strategies for Genetic Association Testing of Symptom and Questionnaire Data in Psychiatric Genetic Studies. Sci Rep 2019; 9:7523. [PMID: 31101869 PMCID: PMC6525248 DOI: 10.1038/s41598-019-44046-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 12/10/2018] [Accepted: 05/01/2019] [Indexed: 11/09/2022] Open
Abstract
Genetic studies of psychiatric disorders often deal with phenotypes that are not directly measurable. Instead, researchers rely on multivariate symptom data from questionnaires and surveys like the PTSD Symptom Scale (PSS) and Beck Depression Inventory (BDI) to indirectly assess a latent phenotype of interest. Researchers subsequently collapse such multivariate questionnaire data into a univariate outcome to represent a surrogate for the latent phenotype. However, when a causal variant is only associated with a subset of collapsed symptoms, the effect will be challenging to detect using the univariate outcome. We describe a more powerful strategy for genetic association testing in this situation that jointly analyzes the original multivariate symptom data collectively using a statistical framework that compares similarity in multivariate symptom-scale data from questionnaires to similarity in common genetic variants across a gene. We use simulated data to demonstrate this strategy provides substantially increased power over standard approaches that collapse questionnaire data into a single surrogate outcome. We also illustrate our approach using GWAS data from the Grady Trauma Project and identify genes associated with BDI not identified using standard univariate techniques. The approach is computationally efficient, scales to genome-wide studies, and is applicable to correlated symptom data of arbitrary dimension.
Affiliation(s)
- Aaron M Holleman
- Department of Epidemiology, Emory University, Atlanta, GA, USA; Center for Computational and Quantitative Genetics, Emory University, Atlanta, GA, USA
- Richard Duncan
- Department of Human Genetics, Emory University, Atlanta, GA, USA
- Andrei Todor
- Center for Computational and Quantitative Genetics, Emory University, Atlanta, GA, USA; Department of Human Genetics, Emory University, Atlanta, GA, USA
- Lynn M Almli
- Department of Psychiatry and Behavioral Sciences, Emory University, Atlanta, GA, USA
- Bekh Bradley
- Department of Psychiatry and Behavioral Sciences, Emory University, Atlanta, GA, USA; Clinical Psychologist, Mental Health Service Line, Department of Veterans Affairs Medical Center, Atlanta, GA, USA
- Kerry J Ressler
- Department of Psychiatry, McLean Hospital, Harvard Medical School, Belmont, MA, USA
- Debashis Ghosh
- Department of Biostatistics and Informatics, Colorado School of Public Health, Aurora, CO, USA
- Jennifer G Mulle
- Center for Computational and Quantitative Genetics, Emory University, Atlanta, GA, USA; Department of Human Genetics, Emory University, Atlanta, GA, USA
- Michael P Epstein
- Center for Computational and Quantitative Genetics, Emory University, Atlanta, GA, USA; Department of Human Genetics, Emory University, Atlanta, GA, USA
10
Rudra P, Broadaway KA, Ware EB, Jhun MA, Bielak LF, Zhao W, Smith JA, Peyser PA, Kardia SL, Epstein MP, Ghosh D. Testing cross-phenotype effects of rare variants in longitudinal studies of complex traits. Genet Epidemiol 2018; 42:320-332. [PMID: 29601641 PMCID: PMC5980726 DOI: 10.1002/gepi.22121] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 06/04/2017] [Revised: 01/19/2018] [Accepted: 02/19/2018] [Indexed: 01/09/2023]
Abstract
Many gene mapping studies of complex traits have identified genes or variants that influence multiple phenotypes. With the advent of next-generation sequencing technology, there has been substantial interest in identifying rare variants in genes that possess cross-phenotype effects. In the presence of such effects, modeling both the phenotypes and rare variants collectively using multivariate models can achieve higher statistical power compared to univariate methods that either model each phenotype separately or perform separate tests for each variant. Several studies collect phenotypic data over time and using such longitudinal data can further increase the power to detect genetic associations. Although rare-variant approaches exist for testing cross-phenotype effects at a single time point, there is no analogous method for performing such analyses using longitudinal outcomes. In order to fill this important gap, we propose an extension of Gene Association with Multiple Traits (GAMuT) test, a method for cross-phenotype analysis of rare variants using a framework based on the distance covariance. The approach allows for both binary and continuous phenotypes and can also adjust for covariates. Our simple adjustment to the GAMuT test allows it to handle longitudinal data and to gain power by exploiting temporal correlation. The approach is computationally efficient and applicable on a genome-wide scale due to the use of a closed-form test whose significance can be evaluated analytically. We use simulated data to demonstrate that our method has favorable power over competing approaches and also apply our approach to exome chip data from the Genetic Epidemiology Network of Arteriopathy.
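The abstract above builds on the distance covariance. As a rough illustration only, the following sketch computes the generic sample (V-statistic) distance covariance between two small data sets in plain Python. It is not the GAMuT test statistic or its analytic p-value calculation; the function names and the toy data are assumptions made for this example.

```python
import math


def distance_matrix(rows):
    """Pairwise Euclidean distances between the observations in `rows`."""
    return [
        [math.sqrt(sum((a - b) ** 2 for a, b in zip(r, s))) for s in rows]
        for r in rows
    ]


def double_center(D):
    """Subtract row and column means and add back the grand mean."""
    n = len(D)
    row_means = [sum(row) / n for row in D]
    col_means = [sum(D[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row_means) / n
    return [
        [D[i][j] - row_means[i] - col_means[j] + grand for j in range(n)]
        for i in range(n)
    ]


def distance_covariance_sq(X, Y):
    """Squared sample distance covariance (V-statistic form)."""
    A = double_center(distance_matrix(X))
    B = double_center(distance_matrix(Y))
    n = len(X)
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb)) / (n * n)


def distance_correlation_sq(X, Y):
    """Squared sample distance correlation; 0.0 if either variance is zero."""
    denom = math.sqrt(distance_covariance_sq(X, X) * distance_covariance_sq(Y, Y))
    return distance_covariance_sq(X, Y) / denom if denom > 0 else 0.0


# Hypothetical toy data: four subjects, one trait and one genotype score each.
traits = [[0.1], [1.2], [1.9], [3.2]]
genotypes = [[0.0], [1.0], [1.0], [2.0]]
dcor_sq = distance_correlation_sq(traits, genotypes)
```

Each observation is a list of numbers, so multivariate phenotypes or multi-variant genotype sets can be passed without changing the functions.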
Affiliation(s)
- Pratyaydipta Rudra
- Department of Biostatistics and Informatics, Colorado School of Public Health, Aurora, CO
- Erin B. Ware
- Department of Epidemiology, University of Michigan, Ann Arbor, MI
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI
- Min A. Jhun
- Department of Epidemiology, University of Michigan, Ann Arbor, MI
- Wei Zhao
- Department of Epidemiology, University of Michigan, Ann Arbor, MI
- Debashis Ghosh
- Department of Biostatistics and Informatics, Colorado School of Public Health, Aurora, CO
11
Xu Z, Xu G, Pan W. Adaptive testing for association between two random vectors in moderate to high dimensions. Genet Epidemiol 2017; 41:599-609. [PMID: 28714590 PMCID: PMC5643233 DOI: 10.1002/gepi.22059] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 02/27/2017] [Revised: 04/26/2017] [Accepted: 05/17/2017] [Indexed: 01/09/2023]
Abstract
Testing for association between two random vectors is a common and important task in many fields, however, existing tests, such as Escoufier's RV test, are suitable only for low-dimensional data, not for high-dimensional data. In moderate to high dimensions, it is necessary to consider sparse signals, which are often expected with only a few, but not many, variables associated with each other. We generalize the RV test to moderate-to-high dimensions. The key idea is to data adaptively weight each variable pair based on its empirical association. As the consequence, the proposed test is adaptive, alleviating the effects of noise accumulation in high-dimensional data, and thus maintaining the power for both dense and sparse alternative hypotheses. We show the connections between the proposed test with several existing tests, such as a generalized estimating equations-based adaptive test, multivariate kernel machine regression (KMR), and kernel distance methods. Furthermore, we modify the proposed adaptive test so that it can be powerful for nonlinear or nonmonotonic associations. We use both real data and simulated data to demonstrate the advantages and usefulness of the proposed new test. The new test is freely available in R package aSPC on CRAN at https://cran.r-project.org/web/packages/aSPC/index.html and https://github.com/jasonzyx/aSPC.
Affiliation(s)
- Zhiyuan Xu
- Division of Biostatistics, University of Minnesota
- Gongjun Xu
- Department of Statistics, University of Michigan
- Wei Pan
- Division of Biostatistics, University of Minnesota
12
Powerful Genetic Association Analysis for Common or Rare Variants with High-Dimensional Structured Traits. Genetics 2017. [PMID: 28642271 DOI: 10.1534/genetics.116.199646] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Indexed: 01/03/2023] Open
Abstract
Many genetic association studies collect a wide range of complex traits. As these traits may be correlated and share a common genetic mechanism, joint analysis can be statistically more powerful and biologically more meaningful. However, most existing tests for multiple traits cannot be used for high-dimensional and possibly structured traits, such as network-structured transcriptomic pathway expressions. To overcome potential limitations, in this article we propose the dual kernel-based association test (DKAT) for testing the association between multiple traits and multiple genetic variants, both common and rare. In DKAT, two individual kernels are used to describe the phenotypic and genotypic similarity, respectively, between pairwise subjects. Using kernels allows for capturing structure while accommodating dimensionality. Then, the association between traits and genetic variants is summarized by a coefficient which measures the association between two kernel matrices. Finally, DKAT evaluates the hypothesis of nonassociation with an analytical P-value calculation without any computationally expensive resampling procedures. By collapsing information in both traits and genetic variants using kernels, the proposed DKAT is shown to have a correct type-I error rate and higher power than other existing methods in both simulation studies and application to a study of genetic regulation of pathway gene expressions.
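The abstract above describes summarizing association by a coefficient between a phenotype kernel matrix and a genotype kernel matrix. The sketch below illustrates that general idea with an RV-style coefficient on centered linear kernels. It is an assumption-laden illustration, not the DKAT implementation: the kernel choice, the function names and the toy data are hypothetical, and a permutation p-value is swapped in for the analytical P-value calculation the abstract mentions.

```python
import random


def linear_kernel(rows):
    """Gram matrix with entries equal to inner products between subjects."""
    return [[sum(a * b for a, b in zip(r, s)) for s in rows] for r in rows]


def center(K):
    """Double-center a square matrix (equivalent to H K H, H = I - 11'/n)."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row_means) / n
    return [
        [K[i][j] - row_means[i] - col_means[j] + grand for j in range(n)]
        for i in range(n)
    ]


def frobenius_inner(A, B):
    """Sum of elementwise products, i.e. trace(A'B)."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def kernel_rv(K, L):
    """RV-style coefficient between two kernel matrices; 0.0 if degenerate."""
    Kc, Lc = center(K), center(L)
    denom = (frobenius_inner(Kc, Kc) * frobenius_inner(Lc, Lc)) ** 0.5
    return frobenius_inner(Kc, Lc) / denom if denom > 0 else 0.0


def permutation_pvalue(K, L, n_perm=999, seed=0):
    """Permutation p-value: shuffle subject labels of L and recompute."""
    rng = random.Random(seed)
    observed = kernel_rv(K, L)
    n = len(K)
    hits = 0
    for _ in range(n_perm):
        p = list(range(n))
        rng.shuffle(p)
        L_perm = [[L[p[i]][p[j]] for j in range(n)] for i in range(n)]
        if kernel_rv(K, L_perm) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: five subjects, two traits, two variants.
traits = [[0.2, 1.0], [1.1, 0.4], [1.9, 2.2], [3.0, 2.9], [0.5, 0.1]]
genotypes = [[0, 1], [1, 0], [1, 2], [2, 2], [0, 0]]
K_trait = linear_kernel(traits)
K_geno = linear_kernel(genotypes)
coefficient = kernel_rv(K_trait, K_geno)
p_value = permutation_pvalue(K_trait, K_geno, n_perm=199)
```

Permutation is used here only because it is simple to verify; it costs far more than a closed-form approximation when the number of tests is large.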
13
Weiner MW, Veitch DP, Aisen PS, Beckett LA, Cairns NJ, Green RC, Harvey D, Jack CR, Jagust W, Morris JC, Petersen RC, Saykin AJ, Shaw LM, Toga AW, Trojanowski JQ. Recent publications from the Alzheimer's Disease Neuroimaging Initiative: Reviewing progress toward improved AD clinical trials. Alzheimers Dement 2017; 13:e1-e85. [PMID: 28342697 DOI: 10.1016/j.jalz.2016.11.007] [Citation(s) in RCA: 165] [Impact Index Per Article: 23.6] [Received: 09/07/2016] [Revised: 11/21/2016] [Accepted: 11/28/2016] [Indexed: 01/31/2023]
Abstract
INTRODUCTION The Alzheimer's Disease Neuroimaging Initiative (ADNI) has continued development and standardization of methodologies for biomarkers and has provided an increased depth and breadth of data available to qualified researchers. This review summarizes the over 400 publications using ADNI data during 2014 and 2015. METHODS We used standard searches to find publications using ADNI data. RESULTS (1) Structural and functional changes, including subtle changes to hippocampal shape and texture, atrophy in areas outside of hippocampus, and disruption to functional networks, are detectable in presymptomatic subjects before hippocampal atrophy; (2) In subjects with abnormal β-amyloid deposition (Aβ+), biomarkers become abnormal in the order predicted by the amyloid cascade hypothesis; (3) Cognitive decline is more closely linked to tau than Aβ deposition; (4) Cerebrovascular risk factors may interact with Aβ to increase white-matter (WM) abnormalities which may accelerate Alzheimer's disease (AD) progression in conjunction with tau abnormalities; (5) Different patterns of atrophy are associated with impairment of memory and executive function and may underlie psychiatric symptoms; (6) Structural, functional, and metabolic network connectivities are disrupted as AD progresses. 
Models of prion-like spreading of Aβ pathology along WM tracts predict known patterns of cortical Aβ deposition and declines in glucose metabolism; (7) New AD risk and protective gene loci have been identified using biologically informed approaches; (8) Cognitively normal and mild cognitive impairment (MCI) subjects are heterogeneous and include groups typified not only by "classic" AD pathology but also by normal biomarkers, accelerated decline, and suspected non-Alzheimer's pathology; (9) Selection of subjects at risk of imminent decline on the basis of one or more pathologies improves the power of clinical trials; (10) Sensitivity of cognitive outcome measures to early changes in cognition has been improved and surrogate outcome measures using longitudinal structural magnetic resonance imaging may further reduce clinical trial cost and duration; (11) Advances in machine learning techniques such as neural networks have improved diagnostic and prognostic accuracy especially in challenges involving MCI subjects; and (12) Network connectivity measures and genetic variants show promise in multimodal classification and some classifiers using single modalities are rivaling multimodal classifiers. DISCUSSION Taken together, these studies fundamentally deepen our understanding of AD progression and its underlying genetic basis, which in turn informs and improves clinical trial design.
Affiliation(s)
- Michael W Weiner
- Department of Veterans Affairs Medical Center, Center for Imaging of Neurodegenerative Diseases, San Francisco, CA, USA; Department of Radiology, University of California, San Francisco, CA, USA; Department of Medicine, University of California, San Francisco, CA, USA; Department of Psychiatry, University of California, San Francisco, CA, USA; Department of Neurology, University of California, San Francisco, CA, USA
- Dallas P Veitch
- Department of Veterans Affairs Medical Center, Center for Imaging of Neurodegenerative Diseases, San Francisco, CA, USA
- Paul S Aisen
- Alzheimer's Therapeutic Research Institute, University of Southern California, San Diego, CA, USA
- Laurel A Beckett
- Division of Biostatistics, Department of Public Health Sciences, University of California, Davis, CA, USA
- Nigel J Cairns
- Knight Alzheimer's Disease Research Center, Washington University School of Medicine, Saint Louis, MO, USA; Department of Neurology, Washington University School of Medicine, Saint Louis, MO, USA
- Robert C Green
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Danielle Harvey
- Division of Biostatistics, Department of Public Health Sciences, University of California, Davis, CA, USA
- William Jagust
- Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA, USA
- John C Morris
- Alzheimer's Therapeutic Research Institute, University of Southern California, San Diego, CA, USA
- Andrew J Saykin
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, USA; Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
- Leslie M Shaw
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Arthur W Toga
- Laboratory of Neuroimaging, Institute of Neuroimaging and Informatics, Keck School of Medicine of University of Southern California, Los Angeles, CA, USA
- John Q Trojanowski
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Institute on Aging, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Alzheimer's Disease Core Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Udall Parkinson's Research Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | | |
|
14
|
Zhan X, Plantinga A, Zhao N, Wu MC. A fast small-sample kernel independence test for microbiome community-level association analysis. Biometrics 2017; 73:1453-1463. [PMID: 28295177 DOI: 10.1111/biom.12684] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 04/01/2016] [Revised: 02/01/2017] [Accepted: 02/01/2017] [Indexed: 12/13/2022]
Abstract
To fully understand the role of the microbiome in human health and diseases, researchers are increasingly interested in assessing the relationship between microbiome composition and host genomic data. The dimensionality of the data as well as complex relationships between microbiota and host genomics pose considerable challenges for analysis. In this article, we apply a kernel RV coefficient (KRV) test to evaluate the overall association between host gene expression and microbiome composition. The KRV statistic can capture nonlinear correlations and complex relationships among the individual data types and between gene expression and microbiome composition through measuring general dependency. Testing proceeds via a similar route as existing tests of the generalized RV coefficients and allows for rapid p-value calculation. Strategies to allow adjustment for confounding effects, which is crucial for avoiding misleading results, and to alleviate the problem of selecting the most favorable kernel are considered. Simulation studies show that KRV is useful in testing statistical independence with finite samples provided the kernels are appropriately chosen, and can powerfully identify existing associations between microbiome composition and host genomic data while protecting type I error. We apply the KRV to a microbiome study examining the relationship between host transcriptome and microbiome composition within the context of inflammatory bowel disease and are able to derive new biological insights and provide formal inference on prior qualitative observations.
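As a rough illustration of the kernel RV idea described in this abstract, the sketch below computes a normalized trace statistic between two centered kernel matrices. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the Gaussian kernel is only one possible choice, and the p-value is obtained by permutation rather than by the rapid analytic approximation the abstract mentions.

```python
import numpy as np

def center_kernel(K):
    """Double-center a kernel matrix: H K H with H = I - 11'/n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def gaussian_kernel(X, bandwidth=1.0):
    """Gaussian kernel on the rows of X (an illustrative kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth ** 2))

def krv_statistic(K, L):
    """tr(Kc Lc) / sqrt(tr(Kc Kc) tr(Lc Lc)) for symmetric kernels K and L."""
    Kc, Lc = center_kernel(K), center_kernel(L)
    num = np.sum(Kc * Lc)  # equals trace(Kc @ Lc) because both are symmetric
    den = np.sqrt(np.sum(Kc * Kc) * np.sum(Lc * Lc))
    return num / den

def krv_permutation_pvalue(K, L, n_perm=199, seed=0):
    """Permutation p-value (assumption: swapped in for the analytic one)."""
    rng = np.random.default_rng(seed)
    observed = krv_statistic(K, L)
    n = K.shape[0]
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(n)
        # Permute the samples of the second data type only.
        if krv_statistic(K, L[np.ix_(idx, idx)]) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)
```

In use, one kernel would be built from the microbiome profiles and the other from host gene expression, with samples in the same row order in both matrices.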
Affiliation(s)
- Xiang Zhan
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, U.S.A
| | - Anna Plantinga
- Department of Biostatistics, University of Washington, Seattle, Washington 98195, U.S.A
| | - Ni Zhao
- Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland 21205, U.S.A
| | - Michael C Wu
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, U.S.A
| |
|
15
|
Tao C, Nichols TE, Hua X, Ching CRK, Rolls ET, Thompson PM, Feng J. Generalized reduced rank latent factor regression for high dimensional tensor fields, and neuroimaging-genetic applications. Neuroimage 2016; 144:35-57. [PMID: 27666385 DOI: 10.1016/j.neuroimage.2016.08.027] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/26/2015] [Revised: 08/01/2016] [Accepted: 08/14/2016] [Indexed: 11/18/2022] Open
Abstract
We propose a generalized reduced rank latent factor regression model (GRRLF) for the analysis of tensor field responses and high dimensional covariates. The model is motivated by the need from imaging-genetic studies to identify genetic variants that are associated with brain imaging phenotypes, often in the form of high dimensional tensor fields. GRRLF identifies from the structure in the data the effective dimensionality of the data, and then jointly performs dimension reduction of the covariates, dynamic identification of latent factors, and nonparametric estimation of both covariate and latent response fields. After accounting for the latent and covariate effects, GRRLF performs a nonparametric test on the remaining factor of interest. GRRLF provides a better factorization of the signals compared with common solutions, and is less susceptible to overfitting because it exploits the effective dimensionality. The generality and the flexibility of GRRLF also allow various statistical models to be handled in a unified framework and solutions can be efficiently computed. Within the field of neuroimaging, it improves the sensitivity for weak signals and is a promising alternative to existing approaches. The operation of the framework is demonstrated with both synthetic datasets and a real-world neuroimaging example in which the effects of a set of genes on the structure of the brain at the voxel level were measured, and the results compared favorably with those from existing approaches.
Affiliation(s)
- Chenyang Tao
- Centre for Computational Systems Biology and School of Mathematical Sciences, Fudan University, Shanghai, PR China; Department of Computer Science, Warwick University, Coventry, UK
| | | | - Xue Hua
- Imaging Genetics Center, Institute for Neuroimaging & Informatics, University of Southern California, Los Angeles, CA, USA
| | - Christopher R K Ching
- Imaging Genetics Center, Institute for Neuroimaging & Informatics, University of Southern California, Los Angeles, CA, USA; Interdepartmental Neuroscience Graduate Program, UCLA School of Medicine, Los Angeles, CA, USA
| | - Edmund T Rolls
- Department of Computer Science, Warwick University, Coventry, UK; Oxford Centre for Computational Neuroscience, Oxford, UK
| | - Paul M Thompson
- Imaging Genetics Center, Institute for Neuroimaging & Informatics, University of Southern California, Los Angeles, CA, USA; Departments of Neurology, Psychiatry, Radiology, Engineering, Pediatrics, and Ophthalmology, USC, Los Angeles, CA, USA
| | - Jianfeng Feng
- Centre for Computational Systems Biology and School of Mathematical Sciences, Fudan University, Shanghai, PR China; Department of Computer Science, Warwick University, Coventry, UK; School of Life Science and the Collaborative Innovation Center for Brain Science, Fudan University, Shanghai 200433, PR China.
| |
|
16
|
Nonlinear association criterion, nonlinear Granger causality and related issues with applications to neuroimage studies. J Neurosci Methods 2016; 262:110-32. [PMID: 26791806 DOI: 10.1016/j.jneumeth.2016.01.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/03/2015] [Revised: 12/21/2015] [Accepted: 01/02/2016] [Indexed: 11/20/2022]
Abstract
BACKGROUND Quantifying associations in neuroscience (and many other scientific disciplines) is often challenged by high-dimensionality, nonlinearity and noisy observations. Many classic methods have either poor power or poor scalability on data sets of the same or different scales such as genetic, physiological and image data. NEW METHOD Based on the framework of reproducing kernel Hilbert spaces we proposed a new nonlinear association criterion (NAC) with an efficient numerical algorithm and p-value approximation scheme. We also presented mathematical justification that links the proposed method to related methods such as kernel generalized variance, kernel canonical correlation analysis and the Hilbert-Schmidt independence criterion. NAC allows the detection of association between arbitrary input domains as long as a characteristic kernel is defined. A MATLAB package was provided to facilitate applications. RESULTS Extensive simulation examples and four real world neuroscience examples including functional MRI causality, calcium imaging and imaging genetic studies on autism [Brain, 138(5):1382-1393 (2015)] and alcohol addiction [PNAS, 112(30):E4085-E4093 (2015)] are used to benchmark NAC. It demonstrates superior performance over the existing procedures we tested and also yields biologically significant results for the real world examples. COMPARISON WITH EXISTING METHOD(S) NAC beats its linear counterparts when nonlinearity is present in the data. It also shows more robustness against different experimental setups compared with its nonlinear counterparts. CONCLUSIONS In this work we presented a new and robust statistical approach NAC for measuring associations. It could serve as an interesting alternative to the existing methods for datasets where nonlinearity and other confounding factors are present.
|
17
|
A Statistical Approach for Testing Cross-Phenotype Effects of Rare Variants. Am J Hum Genet 2016; 98:525-540. [PMID: 26942286 DOI: 10.1016/j.ajhg.2016.01.017] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 07/09/2015] [Accepted: 01/29/2016] [Indexed: 11/20/2022] Open
Abstract
Increasing empirical evidence suggests that many genetic variants influence multiple distinct phenotypes. When cross-phenotype effects exist, multivariate association methods that consider pleiotropy are often more powerful than univariate methods that model each phenotype separately. Although several statistical approaches exist for testing cross-phenotype effects for common variants, there is a lack of similar tests for gene-based analysis of rare variants. In order to fill this important gap, we introduce a statistical method for cross-phenotype analysis of rare variants using a nonparametric distance-covariance approach that compares similarity in multivariate phenotypes to similarity in rare-variant genotypes across a gene. The approach can accommodate both binary and continuous phenotypes and further can adjust for covariates. Our approach yields a closed-form test whose significance can be evaluated analytically, thereby improving computational efficiency and permitting application on a genome-wide scale. We use simulated data to demonstrate that our method, which we refer to as the Gene Association with Multiple Traits (GAMuT) test, provides increased power over competing approaches. We also illustrate our approach using exome-chip data from the Genetic Epidemiology Network of Arteriopathy.
|
18
|
Wu B, Pankow JS. Sequence Kernel Association Test of Multiple Continuous Phenotypes. Genet Epidemiol 2016; 40:91-100. [PMID: 26782911 PMCID: PMC4724299 DOI: 10.1002/gepi.21945] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 06/16/2015] [Revised: 10/28/2015] [Accepted: 11/01/2015] [Indexed: 01/12/2023]
Abstract
Genetic studies often collect multiple correlated traits, which could be analyzed jointly to increase power by aggregating multiple weak effects and provide additional insights into the etiology of complex human diseases. Existing methods for multiple trait association tests have primarily focused on common variants. There is a surprising dearth of published methods for testing the association of rare variants with multiple correlated traits. In this paper, we extend the commonly used sequence kernel association test (SKAT) for single-trait analysis to test for the joint association of rare variant sets with multiple traits. We investigate the performance of the proposed method through extensive simulation studies. We further illustrate its usefulness with application to the analysis of diabetes-related traits in the Atherosclerosis Risk in Communities (ARIC) Study. We identified an exome-wide significant rare variant set in the gene YAP1 worthy of further investigations.
Affiliation(s)
- Baolin Wu
- Division of Biostatistics, University of Minnesota
| | - James S. Pankow
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota
| |
|
19
|
Kim J, Pan W. Highly adaptive tests for group differences in brain functional connectivity. NEUROIMAGE-CLINICAL 2015; 9:625-39. [PMID: 26740916 PMCID: PMC4644249 DOI: 10.1016/j.nicl.2015.10.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 06/26/2015] [Revised: 09/14/2015] [Accepted: 10/05/2015] [Indexed: 01/06/2023]
Abstract
Resting-state functional magnetic resonance imaging (rs-fMRI) and other technologies have been offering evidence and insights showing that altered brain functional networks are associated with neurological illnesses such as Alzheimer's disease. Exploring brain networks of clinical populations compared to those of controls would be a key inquiry to reveal underlying neurological processes related to such illnesses. For such a purpose, group-level inference is a necessary first step in order to establish whether there are any genuinely disrupted brain subnetworks. Such an analysis is also challenging due to the high dimensionality of the parameters in a network model and high noise levels in neuroimaging data. We are still in the early stages of method development, as highlighted by Varoquaux and Craddock (2013): “there is currently no unique solution, but a spectrum of related methods and analytical strategies” to learn and compare brain connectivity. In practice the important issue of how to choose several critical parameters in estimating a network, such as which association measure to use and how sparse the estimated network should be, has not been carefully addressed, largely because the answers are not yet known. For example, even though the choice of tuning parameters in model estimation has been extensively discussed in the literature, as will be shown here, an optimal choice of a parameter for network estimation may not be optimal in the current context of hypothesis testing. Arbitrarily choosing or mis-specifying such parameters may lead to extremely low-powered tests. Here we develop highly adaptive tests to detect group differences in brain connectivity while accounting for unknown optimal choices of some tuning parameters. The proposed tests combine statistical evidence against a null hypothesis from multiple sources across a range of plausible tuning parameter values reflecting uncertainty about the unknown truth. These highly adaptive tests are not only easy to use, but also robustly high-powered across various scenarios. The usage and advantages of these novel tests are demonstrated on an Alzheimer's disease dataset and simulated data.
Highlights: Rigorous testing for genuinely altered functional networks between two groups. The proposed tests are high powered and general across a wide range of scenarios. Data-driven penalized network estimation. Data-driven choice between correlations and partial correlations to describe association. Some key differences between network estimation and testing are highlighted.
Affiliation(s)
- Junghi Kim
- Division of Biostatistics, University of Minnesota, Minneapolis, MN 55455, USA
| | - Wei Pan
- Division of Biostatistics, University of Minnesota, Minneapolis, MN 55455, USA
| | | |
|
20
|
Ge T, Nichols TE, Ghosh D, Mormino EC, Smoller JW, Sabuncu MR. A kernel machine method for detecting effects of interaction between multidimensional variable sets: an imaging genetics application. Neuroimage 2015; 109:505-514. [PMID: 25600633 DOI: 10.1016/j.neuroimage.2015.01.029] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 06/11/2014] [Revised: 01/06/2015] [Accepted: 01/09/2015] [Indexed: 11/19/2022] Open
Abstract
Measurements derived from neuroimaging data can serve as markers of disease and/or healthy development, are largely heritable, and have been increasingly utilized as (intermediate) phenotypes in genetic association studies. To date, imaging genetic studies have mostly focused on discovering isolated genetic effects, typically ignoring potential interactions with non-genetic variables such as disease risk factors, environmental exposures, and epigenetic markers. However, identifying significant interaction effects is critical for revealing the true relationship between genetic and phenotypic variables, and shedding light on disease mechanisms. In this paper, we present a general kernel machine based method for detecting effects of the interaction between multidimensional variable sets. This method can model the joint and epistatic effect of a collection of single nucleotide polymorphisms (SNPs), accommodate multiple factors that potentially moderate genetic influences, and test for nonlinear interactions between sets of variables in a flexible framework. As a demonstration of application, we applied the method to the data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) to detect the effects of the interactions between candidate Alzheimer's disease (AD) risk genes and a collection of cardiovascular disease (CVD) risk factors, on hippocampal volume measurements derived from structural brain magnetic resonance imaging (MRI) scans. Our method identified that two genes, CR1 and EPHA1, demonstrate significant interactions with CVD risk factors on hippocampal volume, suggesting that CR1 and EPHA1 may play a role in influencing AD-related neurodegeneration in the presence of CVD risks.
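One common way to represent an interaction between two variable sets in a kernel framework is the element-wise product of their kernel matrices. The sketch below illustrates that construction with a simple quadratic-form statistic. It is a simplified illustration under stated assumptions and not the method of this paper: the function names are hypothetical, the main effects are removed by ordinary least squares rather than modeled with their own kernels, and no p-value is computed.

```python
import numpy as np

def linear_kernel(Z):
    """Linear kernel on column-centered features, scaled by feature count."""
    Zc = Z - Z.mean(axis=0)
    return Zc @ Zc.T / Z.shape[1]

def interaction_kernel(K_a, K_b):
    """Element-wise (Hadamard) product of two kernel matrices.

    By the Schur product theorem the result is positive semidefinite
    whenever both inputs are.
    """
    return K_a * K_b

def interaction_score(y, main_effects, K_int):
    """Normalized quadratic form r' K_int r / r' r.

    r holds the residuals of y after a least-squares fit on an intercept
    plus linear main effects (a simplification chosen for this sketch).
    """
    D = np.column_stack([np.ones(len(y)), main_effects])
    beta, *_ = np.linalg.lstsq(D, y, rcond=None)
    r = y - D @ beta
    return float(r @ K_int @ r) / float(r @ r)
```

Here the rows of the two feature matrices would correspond to the same subjects, for example SNP genotypes in one matrix and risk-factor measurements in the other, with `y` a phenotype such as hippocampal volume.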
Affiliation(s)
- Tian Ge
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital / Harvard Medical School, Charlestown, MA 02129, USA
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Thomas E Nichols
- Department of Statistics & Warwick Manufacturing Group, The University of Warwick, Coventry CV4 7AL, UK
| | - Debashis Ghosh
- Department of Statistics, The Pennsylvania State University, PA 16802, USA
| | - Elizabeth C Mormino
- Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
| | - Jordan W Smoller
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02138, USA
| | - Mert R Sabuncu
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital / Harvard Medical School, Charlestown, MA 02129, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| |
|
21
|
Kim J, Wozniak JR, Mueller BA, Shen X, Pan W. Comparison of statistical tests for group differences in brain functional networks. Neuroimage 2014; 101:681-94. [PMID: 25086298 PMCID: PMC4165845 DOI: 10.1016/j.neuroimage.2014.07.031] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 04/30/2014] [Revised: 06/30/2014] [Accepted: 07/21/2014] [Indexed: 01/13/2023] Open
Abstract
Brain functional connectivity has been studied by analyzing time series correlations in regional brain activities based on resting-state fMRI data. Brain functional connectivity can be depicted as a network or graph defined as a set of nodes linked by edges. Nodes represent brain regions and an edge measures the strength of functional correlation between two regions. Most existing work focuses on estimation of such a network. A key but inadequately addressed question is how to test for possible differences of the networks between two subject groups, say between healthy controls and patients. Here we illustrate and compare the performance of several state-of-the-art statistical tests drawn from the neuroimaging, genetics, ecology and high-dimensional data literatures. Both real and simulated data were used to evaluate the methods. We found that Network Based Statistic (NBS) performed well in many but not all situations, and its performance critically depends on the choice of its threshold parameter, which is unknown and difficult to choose in practice. Importantly, two adaptive statistical tests called adaptive sum of powered score (aSPU) and its weighted version (aSPUw) are easy to use and complementary to NBS, being higher powered than NBS in some situations. The aSPU and aSPUw tests can also be applied to adjust for covariates. The aSPU and aSPUw tests often, but not always, performed similarly, with neither one a uniform winner. On the other hand, Multivariate Distance Matrix Regression (MDMR) has been applied to detect group differences for brain connectivity; with the usual choice of the Euclidean distance, MDMR is a special case of the aSPU test. Consequently NBS, aSPU and aSPUw tests are recommended to test for group differences in functional connectivity.
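The adaptive sum of powered score idea named in this abstract can be sketched roughly as follows: compute a score per edge, sum the scores raised to several powers, and take the smallest p-value across powers as the adaptive statistic, calibrated with the same permutations. This is a minimal unweighted sketch under stated assumptions, not the authors' code: the function names, the set of powers, and the absence of covariate adjustment are all illustrative choices.

```python
import numpy as np

def spu_statistics(X, y, gammas):
    """SPU(gamma) = sum_j U_j**gamma, with U = X'(y - mean(y)).

    X is subjects-by-edges, y is a 0/1 group label.
    gamma = inf is taken as max_j |U_j|.
    """
    U = X.T @ (y - y.mean())
    stats = []
    for g in gammas:
        if np.isinf(g):
            stats.append(np.max(np.abs(U)))
        else:
            stats.append(np.sum(U ** g))
    return np.array(stats)

def aspu_test(X, y, gammas=(1, 2, 3, 4, np.inf), n_perm=200, seed=0):
    """Return (per-gamma p-values, aSPU p-value) from label permutations."""
    rng = np.random.default_rng(seed)
    obs = np.abs(spu_statistics(X, y, gammas))
    perm = np.empty((n_perm, len(gammas)))
    for b in range(n_perm):
        perm[b] = np.abs(spu_statistics(X, rng.permutation(y), gammas))
    # p-value of each observed SPU(gamma).
    p_obs = (1 + np.sum(perm >= obs, axis=0)) / (n_perm + 1)
    # p-value of each permuted statistic within the permutation set
    # (each count includes the statistic itself).
    p_perm = (perm[None, :, :] >= perm[:, None, :]).sum(axis=1) / n_perm
    # Adaptive step: smallest p-value over gamma, calibrated by reusing
    # the same permutations.
    t_obs = p_obs.min()
    t_perm = p_perm.min(axis=1)
    p_aspu = (1 + np.sum(t_perm <= t_obs)) / (n_perm + 1)
    return p_obs, p_aspu
```

Small powers favor dense, same-direction differences spread over many edges, while large powers favor a few strong edges, which is why no single power is best in every scenario.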
Affiliation(s)
- Junghi Kim
- Division of Biostatistics, University of Minnesota, USA
| | | | | | | | - Wei Pan
- Division of Biostatistics, University of Minnesota, USA.
| |
|