1
Jiao CN, Shang J, Li F, Cui X, Wang YL, Gao YL, Liu JX. Diagnosis-Guided Deep Subspace Clustering Association Study for Pathogenetic Markers Identification of Alzheimer's Disease Based on Comparative Atlases. IEEE J Biomed Health Inform 2024; 28:3029-3041. [PMID: 38427553 DOI: 10.1109/jbhi.2024.3372294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/03/2024]
Abstract
The roles of brain region activities and genotypic functions in the pathogenesis of Alzheimer's disease (AD) remain unclear. Meanwhile, current imaging genetics methods struggle to identify potential pathogenetic markers through correlation analysis between brain networks and genetic variation. To discover the disease-related brain connectome at the level of specific brain structures and at a fine-grained level, a functional brain network is first constructed for each subject based on the Automated Anatomical Labeling (AAL) and human Brainnetome atlases. Specifically, the upper-triangle elements of the functional connectivity matrix are extracted as connectivity features. The clustering coefficient and the average weighted node degree are developed to assess the significance of each brain area. Since the constructed brain network and genetic data are characterized by non-linearity, high dimensionality, and few subjects, a deep subspace clustering algorithm is proposed to reconstruct the original data. The multilayer neural network helps capture the non-linear manifolds, and subspace clustering learns pairwise affinities between samples. Moreover, most approaches in neuroimaging genetics are unsupervised and neglect disease-related diagnostic information. We present a label constraint based on diagnostic status to guide the imaging genetics correlation analysis. To this end, a diagnosis-guided deep subspace clustering association (DDSCA) method is developed to discover brain connectome and risk genetic factors by integrating genotypes with functional network phenotypes. Extensive experiments show that DDSCA outperforms most association methods and effectively selects disease-relevant genetic markers and brain connectome at the coarse-grained and fine-grained levels.
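The feature-extraction step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes Pearson correlation as the connectivity measure, takes the average weighted node degree to be the mean absolute connection strength of each region, and runs on synthetic time series. The function name `connectivity_features` and the array shapes are hypothetical.

```python
import numpy as np

def connectivity_features(ts):
    """ts: (timepoints, regions) array of regional time series (hypothetical input)."""
    fc = np.corrcoef(ts, rowvar=False)       # regions x regions Pearson matrix (assumed measure)
    iu = np.triu_indices_from(fc, k=1)       # strict upper triangle, diagonal excluded
    edges = fc[iu]                           # one feature per region pair
    w = np.abs(fc)                           # new array, so fc itself is left untouched
    np.fill_diagonal(w, 0.0)                 # ignore self-connections
    degree = w.sum(axis=1) / (w.shape[0] - 1)  # assumed definition of average weighted degree
    return fc, edges, degree

rng = np.random.default_rng(0)
ts = rng.standard_normal((120, 90))          # synthetic data; 90 regions chosen for illustration
fc, edges, degree = connectivity_features(ts)
```

With R regions the edge vector has R(R-1)/2 entries, which is why the number of features grows much faster than the number of regions and quickly exceeds the number of subjects.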
2
Chen J, Iraji A, Fu Z, Andrés-Camazón P, Thapaliya B, Liu J, Calhoun VD. Dynamic fusion of genomics and functional network connectivity in UK biobank reveals static and time-varying SNP manifolds. medRxiv 2024:2024.01.09.24301013. [PMID: 38260328 PMCID: PMC10802663 DOI: 10.1101/2024.01.09.24301013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Many psychiatric and neurological disorders show significant heritability, indicating strong genetic influence. In parallel, dynamic functional network connectivity (dFNC) measures functional temporal coupling between brain networks in a time-varying manner and has been shown to identify disease-related changes in the brain. However, it remains largely unclear how genetic risk contributes to brain dysconnectivity that further manifests as clinical symptoms. The current work aimed to address this gap by proposing a novel joint ICA (jICA)-based "dynamic fusion" framework to identify dynamically tuned SNP manifolds by linking static SNPs to dynamic functional information of the brain. The sliding window approach was used to estimate four dFNC states and compute subject-level state-specific dFNC features. The dFNC features of each state were then combined with 12,946 schizophrenia (SZ) risk SNPs for jICA decomposition, resulting in four parallel fusions in 32,861 European ancestry individuals within the UK Biobank cohort. The identified joint SNP-dFNC components were further validated for SZ relevance in an aggregated SZ cohort, and compared for across-state similarity to indicate the level of dynamism. The results supported that dynamic fusion yielded "static" and "dynamic" components (i.e., high and low across-state similarity, respectively) for SNP and dFNC modalities. As expected, the SNP components presented a mixture of static and dynamic manifolds, with the latter largely driven by fusion with dFNC. We also showed that some of the dynamic SNP manifolds uniquely elicited by fusion with state-specific dFNC features complemented each other in terms of biological interpretation. This dynamic fusion framework thus allows expanding the SNP modality to manifolds in the time dimension, which provides a unique lens to elicit SNP correlates of dFNC otherwise unseen, promising additional insights on how genetic risk links to disease-related dysconnectivity.
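The sliding-window step mentioned in this abstract can be illustrated with a short sketch. It assumes a rectangular window and Pearson correlation, and it uses synthetic time courses; the function name `sliding_window_fnc` and the `width` and `step` values are hypothetical. Grouping the windows into states and the jICA fusion itself are not shown.

```python
import numpy as np

def sliding_window_fnc(ts, width, step=1):
    """Return an (n_windows, n_pairs) array of windowed correlation features.

    ts: (timepoints, networks) array of network time courses (hypothetical input).
    """
    t, n = ts.shape
    iu = np.triu_indices(n, k=1)                  # unique network pairs
    starts = range(0, t - width + 1, step)        # window start indices
    return np.array([np.corrcoef(ts[s:s + width], rowvar=False)[iu] for s in starts])

rng = np.random.default_rng(0)
ts = rng.standard_normal((200, 10))               # synthetic time courses
dfnc = sliding_window_fnc(ts, width=40, step=5)   # one row per window
```

Each row is the connectivity pattern of one window, so a subsequent clustering of the rows would assign every window to a connectivity state.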
Affiliation(s)
- Jiayu Chen
  - Tri-Institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS): (Georgia State University, Georgia Institute of Technology, and Emory University), Atlanta, GA, USA
  - Department of Computer Science, Georgia State University, Atlanta, GA, USA
- Armin Iraji
  - Tri-Institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS): (Georgia State University, Georgia Institute of Technology, and Emory University), Atlanta, GA, USA
  - Department of Computer Science, Georgia State University, Atlanta, GA, USA
- Zening Fu
  - Tri-Institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS): (Georgia State University, Georgia Institute of Technology, and Emory University), Atlanta, GA, USA
- Pablo Andrés-Camazón
  - Department of Child and Adolescent Psychiatry, Institute of Psychiatry and Mental Health, Hospital General Universitario Gregorio Marañón, IiSGM, Madrid, Spain
- Bishal Thapaliya
  - Tri-Institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS): (Georgia State University, Georgia Institute of Technology, and Emory University), Atlanta, GA, USA
- Jingyu Liu
  - Tri-Institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS): (Georgia State University, Georgia Institute of Technology, and Emory University), Atlanta, GA, USA
  - Department of Computer Science, Georgia State University, Atlanta, GA, USA
- Vince D. Calhoun
  - Tri-Institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS): (Georgia State University, Georgia Institute of Technology, and Emory University), Atlanta, GA, USA
  - Department of Computer Science, Georgia State University, Atlanta, GA, USA
3
Jin Z, Kang J, Yu T. Bayesian nonparametric method for genetic dissection of brain activation region. Front Neurosci 2023; 17:1235321. [PMID: 37920300 PMCID: PMC10618557 DOI: 10.3389/fnins.2023.1235321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 09/26/2023] [Indexed: 11/04/2023] Open
Abstract
Biological evidence indicates that brain atrophy can be involved at the onset of neuropathological pathways of Alzheimer's disease. However, there is a lack of formal statistical methods to perform genetic dissection of brain activation phenotypes such as shape and intensity. To this end, we propose a Bayesian hierarchical model which consists of two levels of hierarchy. At level 1, we develop a Bayesian nonparametric level set (BNLS) model for studying the brain activation region shape. At level 2, we construct a regression model to select genetic variants that are strongly associated with the brain activation intensity, where a spike-and-slab prior and a Gaussian prior are chosen for feature selection. We develop efficient posterior computation algorithms based on the Markov chain Monte Carlo (MCMC) method. We demonstrate the advantages of the proposed method via extensive simulation studies and analyses of imaging genetics data in the Alzheimer's Disease Neuroimaging Initiative (ADNI) study.
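For readers unfamiliar with the term, a spike-and-slab prior of the kind mentioned at level 2 is commonly written in the generic form below. This is the textbook form only; the abstract does not give the authors' exact specification or hyperpriors. The symbols are assumptions introduced here: \beta_j is the effect of genetic variant j, \gamma_j its inclusion indicator, \tau^2 the slab variance, \pi the prior inclusion probability, and \delta_0 a point mass at zero.

```latex
\beta_j \mid \gamma_j \;\sim\; \gamma_j \, \mathcal{N}(0, \tau^2) + (1 - \gamma_j)\, \delta_0,
\qquad
\gamma_j \;\sim\; \mathrm{Bernoulli}(\pi), \qquad j = 1, \dots, p .
```

Variants whose posterior inclusion probability P(\gamma_j = 1 \mid \text{data}) is high are the ones selected.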
Affiliation(s)
- Zhuxuan Jin
  - Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, United States
- Jian Kang
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, United States
- Tianwei Yu
  - School of Data Science, Chinese University of Hong Kong - Shenzhen, Shenzhen, China
  - Guangdong Provincial Key Laboratory of Big Data Computing, Shenzhen, China
4
He B, Xie L, Varathan P, Nho K, Risacher SL, Saykin AJ, Yan J. Fused multi-modal similarity network as prior in guiding brain imaging genetic association. Front Big Data 2023; 6:1151893. [PMID: 37215688 PMCID: PMC10196480 DOI: 10.3389/fdata.2023.1151893] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 04/04/2023] [Indexed: 05/24/2023] Open
Abstract
Introduction: Brain imaging genetics aims to explore the genetic architecture underlying brain structure and functions. Recent studies showed that the incorporation of prior knowledge, such as subject diagnosis information and brain regional correlation, can help identify significantly stronger imaging genetic associations. However, sometimes such information may be incomplete or even unavailable.
Methods: In this study, we explore a new data-driven prior knowledge that captures the subject-level similarity by fusing multi-modal similarity networks. It was incorporated into the sparse canonical correlation analysis (SCCA) model, which aims to identify a small set of brain imaging and genetic markers that explain the similarity matrix supported by both modalities. It was applied to amyloid and tau imaging data of the ADNI cohort, respectively.
Results: The fused similarity matrix across imaging and genetic data was found to improve association performance as much as, or more than, diagnosis information, and therefore would be a potential substitute prior when the diagnosis information is not available (e.g., in studies focused on healthy controls).
Discussion: Our result confirmed the value of all types of prior knowledge in improving association identification. In addition, the fused network representing the subject relationship supported by multi-modal data consistently performed best or tied for best compared to the diagnosis network and the co-expression network.
Affiliation(s)
- Bing He
  - Luddy School of Informatics, Computing, and Engineering, Indiana University Purdue University Indianapolis (IUPUI), Indianapolis, IN, United States
- Linhui Xie
  - School of Engineering and Technology, Indiana University Purdue University Indianapolis (IUPUI), Indianapolis, IN, United States
- Pradeep Varathan
  - Indiana Alzheimer's Disease Research Center, Indianapolis, IN, United States
- Kwangsik Nho
  - Indiana Alzheimer's Disease Research Center, Indianapolis, IN, United States
- Shannon L. Risacher
  - Indiana Alzheimer's Disease Research Center, Indianapolis, IN, United States
- Andrew J. Saykin
  - Indiana Alzheimer's Disease Research Center, Indianapolis, IN, United States
- Jingwen Yan
  - Luddy School of Informatics, Computing, and Engineering, Indiana University Purdue University Indianapolis (IUPUI), Indianapolis, IN, United States
5
Identifying Biomarkers of Alzheimer's Disease via a Novel Structured Sparse Canonical Correlation Analysis Approach. J Mol Neurosci 2021; 72:323-335. [PMID: 34570360 DOI: 10.1007/s12031-021-01915-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/28/2021] [Accepted: 09/09/2021] [Indexed: 02/05/2023]
Abstract
Using correlation analysis to study the potential connection between brain genetics and imaging has become an effective method to understand neurodegenerative diseases. Sparse canonical correlation analysis (SCCA) makes it possible to study high-dimensional genetic information. Traditional SCCA methods can only process single-modal genetic and image data, which to some extent weakens the close connection of the brain's biological network. In some recently proposed multimodal SCCA methods, due to the limitations of the penalty terms, the pre-processed data needs to be further filtered to make the dimensions uniform, which may destroy the potential association of data within the same modality. In this research, in order to combine data from different modalities and to ensure that the chain relationship or graph network relationship within the same modality is not destroyed, the original generalized fused lasso penalty was replaced with the fused pairwise group lasso (FGL) and the graph-guided pairwise group lasso (GGL) based on the method of joint sparse canonical correlation analysis (JSCCA). We used prior knowledge to construct a supervised bivariate learning model and used linear regression to select quantitative traits (QTs) of images that are strongly correlated with the Mini-Mental State Examination (MMSE) scores. Compared with FGL-SCCA, the model we constructed obtained a higher gene-ROI correlation coefficient and identified more significant biomarkers, providing a theoretical basis for further understanding the complex pathology of neurodegenerative diseases.
6
Wang M, Shao W, Hao X, Shen L, Zhang D. Identify Consistent Cross-Modality Imaging Genetic Patterns via Discriminant Sparse Canonical Correlation Analysis. IEEE/ACM Trans Comput Biol Bioinform 2021; 18:1549-1561. [PMID: 31581090 DOI: 10.1109/tcbb.2019.2944825] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/10/2023]
Abstract
Sparse canonical correlation analysis (SCCA) is a bi-multivariate technique used in imaging genetics to identify complex multi-SNP-multi-QT associations. However, the traditional SCCA algorithm has been designed to seek a linear correlation between the SNP genotype and brain imaging phenotype, ignoring the discriminant similarity information between within-class subjects in brain imaging genetics association analysis. In addition, multi-modality brain imaging phenotypes are extracted from different perspectives and imaging markers from the same region consistently showing up in multimodalities may provide more insights for the mechanistic understanding of diseases. In this paper, a novel multi-modality discriminant SCCA algorithm (MD-SCCA) is proposed to overcome these limitations as well as to improve learning results by incorporating valuable discriminant similarity information into the SCCA algorithm. Specifically, we first extract the discriminant similarity information between within-class subjects by the sparse representation. Second, the discriminant similarity information is enforced within SCCA to construct a discriminant SCCA algorithm (D-SCCA). At last, the MD-SCCA algorithm is adopted to fully explore the relationships among different modalities of different subjects. In experiments, both synthetic dataset and real data from the Alzheimer's Disease Neuroimaging Initiative database are used to test the performance of our algorithm. The empirical results have demonstrated that the proposed algorithm not only produces improved cross-validation performances but also identifies consistent cross-modality imaging genetic biomarkers.
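For reference, the baseline SCCA problem that this and several neighboring abstracts build on is commonly written in the generic form below. The notation is an assumption introduced here, not taken from the paper: X is the n-by-p SNP matrix, Y the n-by-q imaging QT matrix (both column-centered), u and v the canonical weight vectors, and c_1, c_2 the sparsity budgets. The discriminant and multi-modality terms of D-SCCA and MD-SCCA are not shown.

```latex
\max_{u,\,v} \; u^{\top} X^{\top} Y v
\quad \text{subject to} \quad
\lVert X u \rVert_2^2 \le 1, \;\;
\lVert Y v \rVert_2^2 \le 1, \;\;
\lVert u \rVert_1 \le c_1, \;\;
\lVert v \rVert_1 \le c_2 .
```

The two L1 constraints are what make the weight vectors sparse, so that only a small subset of SNPs and QTs carries nonzero weight.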
7
Wang M, Shao W, Hao X, Zhang D. Identify Complex Imaging Genetic Patterns via Fusion Self-Expressive Network Analysis. IEEE Trans Med Imaging 2021; 40:1673-1686. [PMID: 33661732 DOI: 10.1109/tmi.2021.3063785] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/12/2023]
Abstract
In the brain imaging genetic studies, it is a challenging task to estimate the association between quantitative traits (QTs) extracted from neuroimaging data and genetic markers such as single-nucleotide polymorphisms (SNPs). Most of the existing association studies are based on the extensions of sparse canonical correlation analysis (SCCA) for the identification of complex bi-multivariate associations, which can take the specific structure and group information into consideration. However, they often take the original data as input without considering its underlying complex multi-subspace structure, which will deteriorate the performance of the following integrative analysis. Accordingly, in this paper, the self-expressive property is exploited for the reconstruction of the original data before the association analysis, which can well describe the similarity structure. Specifically, we first apply the within-class similarity information to construct self-expressive networks by sparse representation. Then, we use the fusion method to iteratively fuse the self-expressive networks from multi-modality brain phenotypes into one network. Finally, we calculate the imaging genetic association based on the fused self-expressive network. We conduct the experiments on both single-modality and multi-modality phenotype data. Related experimental results validate that our method can not only better estimate the potential association between genetic markers and quantitative traits but also identify consistent multi-modality imaging genetic biomarkers to guide the interpretation of Alzheimer's disease.
8
Huang M, Chen X, Yu Y, Lai H, Feng Q. Imaging Genetics Study Based on a Temporal Group Sparse Regression and Additive Model for Biomarker Detection of Alzheimer's Disease. IEEE Trans Med Imaging 2021; 40:1461-1473. [PMID: 33556003 DOI: 10.1109/tmi.2021.3057660] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 06/12/2023]
Abstract
Imaging genetics is an effective tool used to detect potential biomarkers of Alzheimer's disease (AD) in imaging and genetic data. Most existing imaging genetics methods analyze the association between brain imaging quantitative traits (QTs) and genetic data [e.g., single nucleotide polymorphism (SNP)] by using a linear model, ignoring correlations between a set of QTs and SNP groups, and disregarding the varied associations between longitudinal imaging QTs and SNPs. To solve these problems, we propose a novel temporal group sparsity regression and additive model (T-GSRAM) to identify associations between longitudinal imaging QTs and SNPs for detection of potential AD biomarkers. We first construct a nonparametric regression model to analyze the nonlinear association between QTs and SNPs, which can accurately model the complex influence of SNPs on QTs. We then use longitudinal QTs to identify the trajectory of imaging genetic patterns over time. Moreover, the SNP information of group and individual levels are incorporated into the proposed method to boost the power of biomarker detection. Finally, we propose an efficient algorithm to solve the whole T-GSRAM model. We evaluated our method using simulation data and real data obtained from AD neuroimaging initiative. Experimental results show that our proposed method outperforms several state-of-the-art methods in terms of the receiver operating characteristic curves and area under the curve. Moreover, the detection of AD-related genes and QTs has been confirmed in previous studies, thereby further verifying the effectiveness of our approach and helping understand the genetic basis over time during disease progression.
9
Yoon G, Carroll RJ, Gaynanova I. Sparse semiparametric canonical correlation analysis for data of mixed types. Biometrika 2020; 107:609-625. [PMID: 34621080 PMCID: PMC8494134 DOI: 10.1093/biomet/asaa007] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 11/13/2022] Open
Abstract
Canonical correlation analysis investigates linear relationships between two sets of variables, but often works poorly on modern datasets due to high-dimensionality and mixed data types (continuous/binary/zero-inflated). We propose a new approach for sparse canonical correlation analysis of mixed data types that does not require explicit parametric assumptions. Our main contribution is the use of truncated latent Gaussian copula to model the data with excess zeroes, which allows us to derive a rank-based estimator of latent correlation matrix without the estimation of marginal transformation functions. The resulting semiparametric sparse canonical correlation analysis method works well in high-dimensional settings as demonstrated via numerical studies, and application to the analysis of association between gene expression and micro RNA data of breast cancer patients.
Affiliation(s)
- Grace Yoon
  - Department of Statistics, Texas A&M University, College Station, Texas 77843, U.S.A
- Raymond J Carroll
  - Department of Statistics, Texas A&M University, College Station, Texas 77843, U.S.A
- Irina Gaynanova
  - Department of Statistics, Texas A&M University, College Station, Texas 77843, U.S.A
10
Detecting genetic associations with brain imaging phenotypes in Alzheimer's disease via a novel structured SCCA approach. Med Image Anal 2020; 61:101656. [PMID: 32062154 DOI: 10.1016/j.media.2020.101656] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 11/23/2018] [Revised: 11/27/2019] [Accepted: 01/22/2020] [Indexed: 01/15/2023]
Abstract
Brain imaging genetics becomes an important research topic since it can reveal complex associations between genetic factors and the structures or functions of the human brain. Sparse canonical correlation analysis (SCCA) is a popular bi-multivariate association identification method. To mine the complex genetic basis of brain imaging phenotypes, there arise many SCCA methods with a variety of norms for incorporating different structures of interest. They often use the group lasso penalty, the fused lasso or the graph/network guided fused lasso ones. However, the group lasso methods have limited capability because of the incomplete or unavailable prior knowledge in real applications. The fused lasso and graph/network guided methods are sensitive to the sign of the sample correlation which may be incorrectly estimated. In this paper, we introduce two new penalties to improve the fused lasso and the graph/network guided lasso penalties in structured sparse learning. We impose both penalties to the SCCA model and propose an optimization algorithm to solve it. The proposed SCCA method has a strong upper bound of grouping effects for both positively and negatively highly correlated variables. We show that, on both synthetic and real neuroimaging genetics data, the proposed SCCA method performs better than or equally to the conventional methods using fused lasso or graph/network guided fused lasso. In particular, the proposed method identifies higher canonical correlation coefficients and captures clearer canonical weight patterns, demonstrating its promising capability in revealing biologically meaningful imaging genetic associations.
11
Chen J, Liu J, Calhoun VD. The Translational Potential of Neuroimaging Genomic Analyses To Diagnosis And Treatment In The Mental Disorders. Proc IEEE 2019; 107:912-927. [PMID: 32051642 PMCID: PMC7015534 DOI: 10.1109/jproc.2019.2913145] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 05/15/2023]
Abstract
Imaging genomics focuses on characterizing genomic influence on the variation of neurobiological traits, holding promise for illuminating the pathogenesis, reforming the diagnostic system, and precision medicine of mental disorders. This paper aims to provide an overall picture of the current status of neuroimaging-genomic analyses in mental disorders, and how we can increase their translational potential into clinical practice. The review is organized around three perspectives. (a) Towards reliability, generalizability and interpretability, where we summarize the multivariate models and discuss the considerations and trade-offs of using these methods and how reliable findings may be reached, to serve as ground for further delineation. (b) Towards improved diagnosis, where we outline the advantages and challenges of constructing a dimensional transdiagnostic model and how imaging genomic analyses map into this framework to aid in deconstructing heterogeneity and achieving an optimal stratification of patients that better inform treatment planning. (c) Towards improved treatment. Here we highlight recent efforts and progress in elucidating the functional annotations that bridge between genomic risk and neurobiological abnormalities, in detecting genomic predisposition and prodromal neurodevelopmental changes, as well as in identifying imaging genomic biomarkers for predicting treatment response. Providing an overview of the challenges and promises, this review hopefully motivates imaging genomic studies with multivariate, dimensional and transdiagnostic designs for generalizable and interpretable findings that facilitate development of personalized treatment.
Affiliation(s)
- Jiayu Chen
  - The Mind Research Network, Albuquerque, NM 87106 USA
- Jingyu Liu
  - The Mind Research Network, Albuquerque, NM 87106 USA, and also with the Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM 87131 USA
- Vince D Calhoun
  - The Mind Research Network, Albuquerque, NM 87106 USA, and also with the Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM 87131 USA
12
Hao X, Li C, Yan J, Yao X, Risacher SL, Saykin AJ, Shen L, Zhang D. Identification of associations between genotypes and longitudinal phenotypes via temporally-constrained group sparse canonical correlation analysis. Bioinformatics 2018; 33:i341-i349. [PMID: 28881979 PMCID: PMC5870577 DOI: 10.1093/bioinformatics/btx245] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Indexed: 01/05/2023] Open
Abstract
Motivation: Neuroimaging genetics identifies the relationships between genetic variants (i.e., the single nucleotide polymorphisms) and brain imaging data to reveal the associations from genotypes to phenotypes. So far, most existing machine-learning approaches detect associations between genetic variants and brain imaging data at a single time-point. However, those associations are based on static phenotypes and ignore the temporal dynamics of the phenotypical changes. The phenotypes across multiple time-points may exhibit temporal patterns that can be used to facilitate the understanding of the degenerative process. In this article, we propose a novel temporally constrained group sparse canonical correlation analysis (TGSCCA) framework to identify genetic associations with longitudinal phenotypic markers.
Results: The proposed TGSCCA method is able to capture the temporal changes in the brain from longitudinal phenotypes by incorporating the fused penalty, which requires that the differences between two consecutive canonical weight vectors from adjacent time-points should be small. A new efficient optimization algorithm is designed to solve the objective function. Furthermore, we demonstrate the effectiveness of our algorithm on both synthetic and real data (i.e., the Alzheimer’s Disease Neuroimaging Initiative cohort, including progressive mild cognitive impairment (MCI), stable MCI and Normal Control participants). In comparison with conventional SCCA, our proposed method can achieve strong associations and discover phenotypic biomarkers across multiple time-points to guide disease-progressive interpretation.
Availability and implementation: The Matlab code is available at https://sourceforge.net/projects/ibrain-cn/files/.
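The idea behind the fused penalty described in this abstract, keeping canonical weight vectors at adjacent time-points close to each other, can be sketched numerically. The sketch assumes an L1 norm on the differences between consecutive columns; the abstract does not state which norm the authors use, and the name `temporal_fused_penalty` and the toy matrix are hypothetical.

```python
import numpy as np

def temporal_fused_penalty(V):
    """Sum of absolute differences between weight vectors at adjacent time-points.

    V: (features, timepoints) matrix; column t holds the weights at time-point t.
    The L1 form is an assumption made for illustration.
    """
    return np.abs(np.diff(V, axis=1)).sum()

# Toy weights: feature 0 changes between time-points 2 and 3, feature 1 between 1 and 2.
V = np.array([[1.0, 1.0, 2.0],
              [0.0, 3.0, 3.0]])
penalty = temporal_fused_penalty(V)            # |0| + |1| + |3| + |0| = 4.0
constant = temporal_fused_penalty(np.ones((2, 3)))  # no change over time gives 0.0
```

Adding such a term to the objective discourages abrupt changes in which features are weighted from one visit to the next, which is the smoothness property the abstract describes.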
Affiliation(s)
- Xiaoke Hao
  - School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
- Chanxiu Li
  - School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
- Jingwen Yan
  - Department of Radiology and Imaging Sciences, School of Medicine, Indiana University, Indianapolis, IN, USA
  - School of Informatics and Computing, Indiana University, Indianapolis, IN, USA
- Xiaohui Yao
  - Department of Radiology and Imaging Sciences, School of Medicine, Indiana University, Indianapolis, IN, USA
  - School of Informatics and Computing, Indiana University, Indianapolis, IN, USA
- Shannon L Risacher
  - Department of Radiology and Imaging Sciences, School of Medicine, Indiana University, Indianapolis, IN, USA
- Andrew J Saykin
  - Department of Radiology and Imaging Sciences, School of Medicine, Indiana University, Indianapolis, IN, USA
- Li Shen
  - Department of Radiology and Imaging Sciences, School of Medicine, Indiana University, Indianapolis, IN, USA
  - School of Informatics and Computing, Indiana University, Indianapolis, IN, USA
- Daoqiang Zhang
  - School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
13
Seiler C, Green T, Hong D, Chromik L, Huffman L, Holmes S, Reiss AL. Multi-Table Differential Correlation Analysis of Neuroanatomical and Cognitive Interactions in Turner Syndrome. Neuroinformatics 2017; 16:81-93. [PMID: 29270892 DOI: 10.1007/s12021-017-9351-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/29/2022]
Abstract
Girls and women with Turner syndrome (TS) have a completely or partially missing X chromosome. Extensive studies on the impact of TS on neuroanatomy and cognition have been conducted. The integration of neuroanatomical and cognitive information into one consistent analysis through multi-table methods is difficult and most standard tests are underpowered. We propose a new two-sample testing procedure that compares associations between two tables in two groups. The procedure combines multi-table methods with permutation tests. In particular, we construct cluster size test statistics that incorporate spatial dependencies. We apply our new procedure to a newly collected dataset comprising structural brain scans and cognitive test scores from girls with TS and healthy control participants (age and sex matched). We measure neuroanatomy with Tensor-Based Morphometry (TBM) and cognitive function with Wechsler IQ and NEuroPSYchological tests (NEPSY-II). We compare our multi-table testing procedure to a single-table analysis. Our new procedure reports differential correlations between two voxel clusters and a wide range of cognitive tests whereas the single-table analysis reports no differences. Our findings are consistent with the hypothesis that girls with TS have a different brain-cognition association structure than healthy controls.
Affiliation(s)
- Christof Seiler
- Department of Statistics, Stanford University, Stanford, CA, USA.
| | - Tamar Green
- Center for Interdisciplinary Brain Sciences Research, Stanford University School of Medicine, Stanford, CA, USA; Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
| | - David Hong
- Center for Interdisciplinary Brain Sciences Research, Stanford University School of Medicine, Stanford, CA, USA
| | - Lindsay Chromik
- Center for Interdisciplinary Brain Sciences Research, Stanford University School of Medicine, Stanford, CA, USA
| | - Lynne Huffman
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, USA
| | - Susan Holmes
- Department of Statistics, Stanford University, Stanford, CA, USA
| | - Allan L Reiss
- Center for Interdisciplinary Brain Sciences Research, Stanford University School of Medicine, Stanford, CA, USA; Departments of Radiology, Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA, USA
|
14
|
Bogdan R, Salmeron BJ, Carey CE, Agrawal A, Calhoun VD, Garavan H, Hariri AR, Heinz A, Hill MN, Holmes A, Kalin NH, Goldman D. Imaging Genetics and Genomics in Psychiatry: A Critical Review of Progress and Potential. Biol Psychiatry 2017; 82:165-175. [PMID: 28283186 PMCID: PMC5505787 DOI: 10.1016/j.biopsych.2016.12.030] [Citation(s) in RCA: 123] [Impact Index Per Article: 17.6] [Received: 06/09/2016] [Revised: 12/21/2016] [Accepted: 12/28/2016] [Indexed: 12/17/2022]
Abstract
Imaging genetics and genomics research has begun to provide insight into the molecular and genetic architecture of neural phenotypes and the neural mechanisms through which genetic risk for psychopathology may emerge. As it approaches its third decade, imaging genetics is confronted by many challenges, including the proliferation of studies using small sample sizes and diverse designs, limited replication, problems with harmonization of neural phenotypes for meta-analysis, unclear mechanisms, and evidence that effect sizes may be more modest than originally posited, with increasing evidence of polygenicity. These concerns have encouraged the field to grow in many new directions, including the development of consortia and large-scale data collection projects and the use of novel methods (e.g., polygenic approaches, machine learning) that enhance the quality of imaging genetic studies but also introduce new challenges. We critically review progress in imaging genetics and offer suggestions and highlight potential pitfalls of novel approaches. Ultimately, the strength of imaging genetics and genomics lies in their translational and integrative potential with other research approaches (e.g., nonhuman animal models, psychiatric genetics, pharmacologic challenge) to elucidate brain-based pathways that give rise to the vast individual differences in behavior as well as risk for psychopathology.
Affiliation(s)
- Ryan Bogdan
- BRAIN Lab, Department of Psychological and Brain Sciences, St. Louis, Missouri.
| | - Betty Jo Salmeron
- Neuroimaging Research Branch, Intramural Research Program, National Institute on Drug Abuse, Baltimore, Maryland
| | - Caitlin E Carey
- BRAIN Lab, Department of Psychological and Brain Sciences, St. Louis, Missouri
| | - Arpana Agrawal
- Department of Psychiatry, Washington University in St. Louis, St. Louis, Missouri
| | - Vince D Calhoun
- Mind Research Network and Lovelace Biomedical and Environmental Research Institute, University of New Mexico, Albuquerque, New Mexico; Departments of Psychiatry and Neuroscience, University of New Mexico, Albuquerque, New Mexico; Electronic and Computer Engineering, University of New Mexico, Albuquerque, New Mexico
| | - Hugh Garavan
- Department of Psychiatry, University of Vermont, Burlington, Vermont
| | - Ahmad R Hariri
- Laboratory of NeuroGenetics, Department of Psychology & Neuroscience, Duke University, Durham, North Carolina
| | - Andreas Heinz
- Department of Child and Adolescent Psychiatry, Psychosomatics, and Psychotherapy, Charité-Universitätsmedizin Berlin, Berlin, Germany
| | - Matthew N Hill
- Hotchkiss Brain Institute, Departments of Cell Biology and Anatomy and Psychiatry, University of Calgary, Calgary, Alberta, Canada
| | - Andrew Holmes
- Laboratory of Behavioral and Genomic Neuroscience, National Institute on Alcohol Abuse and Alcoholism, Bethesda, Maryland
| | - Ned H Kalin
- Department of Psychiatry, University of Wisconsin, Madison, Wisconsin; Neuroscience Training Program (NHK, RK, PHR, DPMT, MEE), University of Wisconsin, Madison, Wisconsin; Wisconsin National Primate Research Center (NHK, MEE), Madison, Wisconsin
| | - David Goldman
- Laboratory of Neurogenetics, Intramural Research Program, National Institute on Alcohol Abuse and Alcoholism, Bethesda, Maryland
|
15
|
Yan J, Risacher SL, Nho K, Saykin AJ, Shen L. Identification of discriminative imaging proteomics associations in Alzheimer's disease via a novel sparse correlation model. Pacific Symposium on Biocomputing 2017; 22:94-104. [PMID: 27896965 DOI: 10.1142/9789813207813_0010] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/18/2022]
Abstract
Brain imaging and protein expression, from both cerebrospinal fluid and blood plasma, have been found to provide complementary information in predicting the clinical outcomes of Alzheimer's disease (AD). But the underlying associations that contribute to such a complementary relationship have not previously been studied. In this work, we perform an imaging proteomics association analysis to explore how they are related to each other. While traditional association models, such as Sparse Canonical Correlation Analysis (SCCA), cannot guarantee the selection of only disease-relevant biomarkers and associations, we propose a novel discriminative SCCA (denoted as DSCCA) model with new penalty terms to account for the disease status information. Given brain imaging, proteomic and diagnostic data, the proposed model can perform a joint association and multi-class discrimination analysis, such that we can not only identify disease-relevant multimodal biomarkers, but also reveal strong associations between them. Based on a real imaging proteomic data set, the empirical results show that DSCCA and traditional SCCA have comparable association performance. But in a further classification analysis, canonical variables of imaging and proteomic data obtained in DSCCA demonstrate much more discrimination power toward multiple pairs of diagnosis groups than those obtained in SCCA.
Affiliation(s)
- Jingwen Yan
- Department of BioHealth Informatics, Indiana University, Indianapolis, 46202, USA; Center for Computational Biology and Bioinformatics, School of Medicine, Indiana University, Indianapolis, 46202, USA
|
16
|
Mining Outcome-relevant Brain Imaging Genetic Associations via Three-way Sparse Canonical Correlation Analysis in Alzheimer's Disease. Sci Rep 2017; 7:44272. [PMID: 28291242 PMCID: PMC5349597 DOI: 10.1038/srep44272] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 08/16/2016] [Accepted: 02/07/2017] [Indexed: 11/24/2022]
Abstract
Neuroimaging genetics is an emerging field that aims to identify the associations between genetic variants (e.g., single nucleotide polymorphisms (SNPs)) and quantitative traits (QTs) such as brain imaging phenotypes. In recent studies, in order to detect complex multi-SNP-multi-QT associations, bi-multivariate techniques such as various structured sparse canonical correlation analysis (SCCA) algorithms have been proposed and used in imaging genetics studies. However, associations between genetic markers and imaging QTs identified by existing bi-multivariate methods may not be all disease specific. To bridge this gap, we propose an analytical framework, based on three-way sparse canonical correlation analysis (T-SCCA), to explore the intrinsic associations among genetic markers, imaging QTs, and clinical scores of interest. We perform an empirical study using the Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohort to discover the relationships among SNPs from AD risk gene APOE, imaging QTs extracted from structural magnetic resonance imaging scans, and cognitive and diagnostic outcomes. The proposed T-SCCA model not only outperforms the traditional SCCA method in terms of identifying strong associations, but also discovers robust outcome-relevant imaging genetic patterns, demonstrating its promise for improving disease-related mechanistic understanding.
|
17
|
Chekouo T, Stingo FC, Guindani M, Do KA. A Bayesian predictive model for imaging genetics with application to schizophrenia. Ann Appl Stat 2016. [DOI: 10.1214/16-aoas948] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 11/19/2022]
|
18
|
Du L, Huang H, Yan J, Kim S, Risacher S, Inlow M, Moore J, Saykin A, Shen L. Structured sparse CCA for brain imaging genetics via graph OSCAR. BMC Systems Biology 2016; 10 Suppl 3:68. [PMID: 27585988 PMCID: PMC5009827 DOI: 10.1186/s12918-016-0312-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 12/20/2022]
Abstract
BACKGROUND Recently, structured sparse canonical correlation analysis (SCCA) has received increased attention in brain imaging genetics studies. It can identify bi-multivariate imaging genetic associations as well as select relevant features with desired structure information. These SCCA methods either use the fused lasso regularizer to induce smoothness between ordered features, or use the signed pairwise difference, which depends on the estimated sign of the sample correlation. Besides, several other structured SCCA models use the group lasso or graph fused lasso to encourage group structure, but they require the structure/group information to be provided in advance, which is sometimes not available. RESULTS We propose a new structured SCCA model, which employs the graph OSCAR (GOSCAR) regularizer to encourage highly correlated features to have similar or equal canonical weights. Our GOSCAR-based SCCA has two advantages: 1) It does not require pre-defining the sign of the sample correlation, and thus could reduce the estimation bias. 2) It could pull highly correlated features together regardless of whether they are positively or negatively correlated. We evaluate our method using both synthetic data and real data. Using the 191 ROI measurements of amyloid imaging data, and 58 genetic markers within the APOE gene, our method identifies a strong association between APOE SNP rs429358 and the amyloid burden measure in the frontal region. In addition, the estimated canonical weights present a clear pattern which is preferable for further investigation. CONCLUSIONS Our proposed method shows better or comparable performance on the synthetic data in terms of the estimated correlations and canonical loadings. It has successfully identified an important association between an Alzheimer's disease risk SNP rs429358 and the amyloid burden measure in the frontal region.
Affiliation(s)
- Lei Du
- School of Medicine, Indiana University, Indianapolis, USA
| | - Heng Huang
- Computer Science & Engineering, University of Texas at Arlington, Arlington, USA
| | - Jingwen Yan
- School of Medicine, Indiana University, Indianapolis, USA
| | - Sungeun Kim
- School of Medicine, Indiana University, Indianapolis, USA
| | | | | | - Jason Moore
- School of Medicine, University of Pennsylvania, Philadelphia, USA
| | - Andrew Saykin
- School of Medicine, Indiana University, Indianapolis, USA
| | - Li Shen
- School of Medicine, Indiana University, Indianapolis, USA
| | - for the Alzheimer’s Disease Neuroimaging Initiative
- School of Medicine, Indiana University, Indianapolis, USA
- Computer Science & Engineering, University of Texas at Arlington, Arlington, USA
- Terre Haute, USA
- School of Medicine, University of Pennsylvania, Philadelphia, USA
|
19
|
Abstract
Genome-wide studies have been successful in identifying stable associations of single genes, albeit at the price of a high false negative rate. The promise of endophenotypes to increase power of genome-wide association studies has only been partially fulfilled. To optimize the investigation of genetic influences on behavioral (endo-)phenotypes, the development of novel phenotypical characterizations and methods to describe the relation between genotype and phenotype are needed. This will require the development of innovative analytical strategies, as well as corroborative approaches linking association studies with functional characterizations. The sole reliance on canonical genome-wide significance thresholds is not sufficient to describe the complex relation of genotype and phenotype.
Affiliation(s)
- Gunter Schumann
- MRC-SGDP Centre, Institute of Psychiatry, King's College, London, London, UK
|
20
|
Abstract
Integrative analysis of genomic and anatomical imaging data, an emerging approach that has not yet been well developed, provides invaluable information for the holistic discovery of the genomic structure of disease and has the potential to open a new avenue for discovering novel disease susceptibility genes which cannot be identified if the two data types are analyzed separately. A key issue for the success of imaging and genomic data analysis is how to reduce their dimensions. Most previous methods for imaging information extraction and RNA-seq data reduction do not explore imaging spatial information and often ignore gene expression variation at the genomic positional level. To overcome these limitations, we extend functional principal component analysis from one dimension to two dimensions (2DFPCA) for representing imaging data and develop a multiple functional linear model (MFLM) in which functional principal scores of images are taken as multiple quantitative traits and the RNA-seq profile across a gene is taken as a function predictor for assessing the association of gene expression with images. The developed method has been applied to image and RNA-seq data of ovarian cancer and kidney renal clear cell carcinoma (KIRC) studies. We identified 24 and 84 genes whose expressions were associated with imaging variations in ovarian cancer and KIRC studies, respectively. Our results showed that many genes significantly associated with images were not differentially expressed, but revealed their morphological and metabolic functions. The results also demonstrated that the peaks of the estimated regression coefficient function in the MFLM often allowed the discovery of splicing sites and multiple isoforms of gene expressions.
|
21
|
Yan J, Du L, Kim S, Risacher SL, Huang H, Moore JH, Saykin AJ, Shen L. Transcriptome-guided amyloid imaging genetic analysis via a novel structured sparse learning algorithm. Bioinformatics 2015; 30:i564-71. [PMID: 25161248 PMCID: PMC4147918 DOI: 10.1093/bioinformatics/btu465] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Indexed: 11/14/2022]
Abstract
MOTIVATION Imaging genetics is an emerging field that studies the influence of genetic variation on brain structure and function. The major task is to examine the association between genetic markers such as single-nucleotide polymorphisms (SNPs) and quantitative traits (QTs) extracted from neuroimaging data. The complexity of these datasets has presented critical bioinformatics challenges that require new enabling tools. Sparse canonical correlation analysis (SCCA) is a bi-multivariate technique used in imaging genetics to identify complex multi-SNP-multi-QT associations. However, most of the existing SCCA algorithms are designed using the soft thresholding method, which assumes that the input features are independent from one another. This assumption clearly does not hold for the imaging genetic data. In this article, we propose a new knowledge-guided SCCA algorithm (KG-SCCA) to overcome this limitation as well as improve learning results by incorporating valuable prior knowledge. RESULTS The proposed KG-SCCA method is able to model two types of prior knowledge: one as a group structure (e.g. linkage disequilibrium blocks among SNPs) and the other as a network structure (e.g. gene co-expression network among brain regions). The new model incorporates these prior structures by introducing new regularization terms to encourage weight similarity between grouped or connected features. A new algorithm is designed to solve the KG-SCCA model without imposing the independence constraint on the input features. We demonstrate the effectiveness of our algorithm with both synthetic and real data. For real data, using an Alzheimer's disease (AD) cohort, we examine the imaging genetic associations between all SNPs in the APOE gene (i.e. top AD gene) and amyloid deposition measures among cortical regions (i.e. a major AD hallmark). 
In comparison with a widely used SCCA implementation, our KG-SCCA algorithm produces not only improved cross-validation performances but also biologically meaningful results. AVAILABILITY Software is freely available on request.
Affiliation(s)
- Jingwen Yan
- BioHealth, Indiana University School of Informatics & Computing, Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA, Computer Science & Engineering, The University of Texas at Arlington, TX 76019, USA and Genetics, Community & Family Medicine, Dartmouth Medical School, Lebanon, NH 03756, USA
| | - Lei Du
- BioHealth, Indiana University School of Informatics & Computing, Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA, Computer Science & Engineering, The University of Texas at Arlington, TX 76019, USA and Genetics, Community & Family Medicine, Dartmouth Medical School, Lebanon, NH 03756, USA
| | - Sungeun Kim
- BioHealth, Indiana University School of Informatics & Computing, Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA, Computer Science & Engineering, The University of Texas at Arlington, TX 76019, USA and Genetics, Community & Family Medicine, Dartmouth Medical School, Lebanon, NH 03756, USA
| | - Shannon L Risacher
- BioHealth, Indiana University School of Informatics & Computing, Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA, Computer Science & Engineering, The University of Texas at Arlington, TX 76019, USA and Genetics, Community & Family Medicine, Dartmouth Medical School, Lebanon, NH 03756, USA
| | - Heng Huang
- BioHealth, Indiana University School of Informatics & Computing, Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA, Computer Science & Engineering, The University of Texas at Arlington, TX 76019, USA and Genetics, Community & Family Medicine, Dartmouth Medical School, Lebanon, NH 03756, USA
| | - Jason H Moore
- BioHealth, Indiana University School of Informatics & Computing, Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA, Computer Science & Engineering, The University of Texas at Arlington, TX 76019, USA and Genetics, Community & Family Medicine, Dartmouth Medical School, Lebanon, NH 03756, USA
| | - Andrew J Saykin
- BioHealth, Indiana University School of Informatics & Computing, Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA, Computer Science & Engineering, The University of Texas at Arlington, TX 76019, USA and Genetics, Community & Family Medicine, Dartmouth Medical School, Lebanon, NH 03756, USA
| | - Li Shen
- BioHealth, Indiana University School of Informatics & Computing, Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA, Computer Science & Engineering, The University of Texas at Arlington, TX 76019, USA and Genetics, Community & Family Medicine, Dartmouth Medical School, Lebanon, NH 03756, USA
|
22
|
GN-SCCA: GraphNet based Sparse Canonical Correlation Analysis for Brain Imaging Genetics. Brain Informatics and Health: 8th International Conference, BIH 2015, London, UK, August 30-September 2, 2015: Proceedings 2015; 9250:275-284. [PMID: 26636135 DOI: 10.1007/978-3-319-23344-4_27] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 12/03/2022]
Abstract
Identifying associations between genetic variants and neuroimaging quantitative traits (QTs) is a popular research topic in brain imaging genetics. Sparse canonical correlation analysis (SCCA) has been widely used to reveal complex multi-SNP-multi-QT associations. Several SCCA methods explicitly incorporate prior knowledge into the model and intend to uncover the hidden structure informed by the prior knowledge. We propose a novel structured SCCA method using the Graph-constrained Elastic-Net (GraphNet) regularizer to not only discover important associations, but also induce smoothness between coefficients that are adjacent in the graph. In addition, the proposed method incorporates the covariance structure information usually ignored by most SCCA methods. Experiments on simulated and real imaging genetic data show that the proposed method not only outperforms a widely used SCCA method but also yields easy-to-interpret biological findings.
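To make the "smoothness between coefficients that are adjacent in the graph" concrete, the sketch below evaluates a GraphNet-style penalty on a weight vector. It assumes the commonly used form, an L1 term plus a graph-Laplacian quadratic term written as a sum of squared differences over edges, with an unweighted graph. The function name, the `lam1`/`lam2` parameters, and the toy chain graph are illustrative and are not taken from the paper.

```python
def graphnet_penalty(w, edges, lam1, lam2):
    """GraphNet-style penalty: lam1 * ||w||_1 + lam2 * sum_{(i,j)} (w_i - w_j)^2.

    For an unweighted graph, the edge sum equals w^T L w, where L is the
    graph Laplacian. `edges` is a list of (i, j) index pairs.
    """
    l1_term = sum(abs(v) for v in w)
    smooth_term = sum((w[i] - w[j]) ** 2 for i, j in edges)
    return lam1 * l1_term + lam2 * smooth_term


# Toy chain graph 0-1-2-3: both weight vectors have the same L1 norm,
# but the sign-alternating one is penalised more by the smoothness term.
chain = [(0, 1), (1, 2), (2, 3)]
smooth_w = [1.0, 1.0, 1.0, 0.0]
rough_w = [1.0, -1.0, 1.0, 0.0]
penalty_smooth = graphnet_penalty(smooth_w, chain, lam1=0.5, lam2=2.0)
penalty_rough = graphnet_penalty(rough_w, chain, lam1=0.5, lam2=2.0)
```

Setting `lam2` to zero reduces the penalty to a plain lasso term, which is one way to see what the graph term adds.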
|
23
|
Sheng J, Kim S, Yan J, Moore J, Saykin A, Shen L. Data synthesis and method evaluation for brain imaging genetics. Proceedings. IEEE International Symposium on Biomedical Imaging 2014; 2014:1202-1205. [PMID: 25408823 DOI: 10.1109/isbi.2014.6868091] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022]
Abstract
Brain imaging genetics is an emergent research field where the association between genetic variations such as single nucleotide polymorphisms (SNPs) and neuroimaging quantitative traits (QTs) is evaluated. Sparse canonical correlation analysis (SCCA) is a bi-multivariate analysis method that has the potential to reveal complex multi-SNP-multi-QT associations. We present initial efforts on evaluating a few SCCA methods for brain imaging genetics. This includes a data synthesis method to create realistic imaging genetics data with known SNP-QT associations, application of three SCCA algorithms to the synthetic data, and a comparative study of their performance. Our empirical results suggest that approximating the covariance structure using an identity or diagonal matrix, an approach used in these SCCA algorithms, could limit the capability of SCCA to identify the underlying imaging genetics associations. An interesting future direction is to develop enhanced SCCA methods that effectively take into account the covariance structures in the imaging genetics data.
Affiliation(s)
- Jinhua Sheng
- Radiology and Imaging Sciences, BioHealth Informatics, Indiana University, IN, USA
| | - Sungeun Kim
- Radiology and Imaging Sciences, BioHealth Informatics, Indiana University, IN, USA
| | - Jingwen Yan
- Radiology and Imaging Sciences, BioHealth Informatics, Indiana University, IN, USA
| | - Jason Moore
- Genetics, Community and Family Medicine, School of Medicine at Dartmouth College, NH, USA
| | - Andrew Saykin
- Radiology and Imaging Sciences, BioHealth Informatics, Indiana University, IN, USA
| | - Li Shen
- Radiology and Imaging Sciences, BioHealth Informatics, Indiana University, IN, USA
|
24
|
Liu J, Calhoun VD. A review of multivariate analyses in imaging genetics. Front Neuroinform 2014; 8:29. [PMID: 24723883 PMCID: PMC3972473 DOI: 10.3389/fninf.2014.00029] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.9] [Received: 10/31/2013] [Accepted: 03/04/2014] [Indexed: 12/13/2022]
Abstract
Recent advances in neuroimaging technology and molecular genetics provide the unique opportunity to investigate genetic influence on the variation of brain attributes. Since the year 2000, when the initial publication on brain imaging and genetics was released, imaging genetics has been a rapidly growing research approach with increasing publications every year. Several reviews have been offered to the research community focusing on various study designs. In addition to study design, analytic tools and their proper implementation are also critical to the success of a study. In this review, we survey recent publications using data from neuroimaging and genetics, focusing on methods capturing multivariate effects accommodating the large number of variables from both imaging data and genetic data. We group the analyses of genetic or genomic data into either a priori driven or data driven approach, including gene-set enrichment analysis, multifactor dimensionality reduction, principal component analysis, independent component analysis (ICA), and clustering. For the analyses of imaging data, ICA and extensions of ICA are the most widely used multivariate methods. Given detailed reviews of multivariate analyses of imaging data available elsewhere, we provide a brief summary here that includes a recently proposed method known as independent vector analysis. Finally, we review methods focused on bridging the imaging and genetic data by establishing multivariate and multiple genotype-phenotype-associations, including sparse partial least squares, sparse canonical correlation analysis, sparse reduced rank regression and parallel ICA. These methods are designed to extract latent variables from both genetic and imaging data, which become new genotypes and phenotypes, and the links between the new genotype-phenotype pairs are maximized using different cost functions. The relationship between these methods along with their assumptions, advantages, and limitations are discussed.
Affiliation(s)
- Jingyu Liu
- The Mind Research Network and Lovelace Biomedical and Environmental Research Institute, Albuquerque, NM, USA
- Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, USA
| | - Vince D. Calhoun
- The Mind Research Network and Lovelace Biomedical and Environmental Research Institute, Albuquerque, NM, USA
- Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, USA
|
25
|
|
26
|
A novel structure-aware sparse learning algorithm for brain imaging genetics. Medical Image Computing and Computer-Assisted Intervention: MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention 2014; 17:329-36. [PMID: 25320816 DOI: 10.1007/978-3-319-10443-0_42] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Indexed: 12/24/2022]
Abstract
Brain imaging genetics is an emergent research field where the association between genetic variations such as single nucleotide polymorphisms (SNPs) and neuroimaging quantitative traits (QTs) is evaluated. Sparse canonical correlation analysis (SCCA) is a bi-multivariate analysis method that has the potential to reveal complex multi-SNP-multi-QT associations. Most existing SCCA algorithms are designed using the soft threshold strategy, which assumes that the features in the data are independent from each other. This independence assumption usually does not hold in imaging genetic data, and thus inevitably limits the capability of yielding optimal solutions. We propose a novel structure-aware SCCA (denoted as S2CCA) algorithm to not only eliminate the independence assumption for the input data, but also incorporate group-like structure in the model. Empirical comparison with a widely used SCCA implementation, on both simulated and real imaging genetic data, demonstrated that S2CCA could yield improved prediction performance and biologically meaningful findings.
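The soft threshold strategy mentioned in this abstract can be sketched as follows. This is a minimal illustration of a generic soft-thresholded SCCA weight update, not the S2CCA algorithm itself: it assumes the feature covariance is treated as the identity (the independence assumption the abstract criticises), and the function names, the cross-covariance matrix `cross_cov`, and the threshold `lam` are hypothetical.

```python
import math


def soft_threshold(a, lam):
    """Shrink each entry toward zero by lam; entries within lam of zero become 0."""
    return [math.copysign(max(abs(v) - lam, 0.0), v) for v in a]


def scca_update(cross_cov, v, lam):
    """One weight update u <- normalize(S(C v, lam)), with C standing for X^T Y.

    The closed-form step relies on treating X^T X as the identity,
    i.e. on assuming the features in X are independent.
    """
    cv = [sum(c * vj for c, vj in zip(row, v)) for row in cross_cov]
    u = soft_threshold(cv, lam)
    norm = math.sqrt(sum(x * x for x in u))
    return [x / norm for x in u] if norm > 0 else u


# Toy cross-covariance between 3 SNP-like features and 2 imaging-like traits.
cross_cov = [[0.9, 0.1], [0.05, 0.02], [-0.6, 0.0]]
u = scca_update(cross_cov, v=[1.0, 0.0], lam=0.2)
```

The weakly associated middle feature is set exactly to zero, which is how the threshold produces sparse canonical weights. Correlated features are thresholded independently of one another, which is the limitation structure-aware variants aim to address.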
|
27
|
Thompson PM, Ge T, Glahn DC, Jahanshad N, Nichols TE. Genetics of the connectome. Neuroimage 2013; 80:475-88. [PMID: 23707675 PMCID: PMC3905600 DOI: 10.1016/j.neuroimage.2013.05.013] [Citation(s) in RCA: 132] [Impact Index Per Article: 12.0] [Received: 03/01/2013] [Revised: 05/05/2013] [Accepted: 05/08/2013] [Indexed: 11/24/2022]
Abstract
Connectome genetics attempts to discover how genetic factors affect brain connectivity. Here we review a variety of genetic analysis methods--such as genome-wide association studies (GWAS), linkage and candidate gene studies--that have been fruitfully adapted to imaging data to implicate specific variants in the genome for brain-related traits. We focus on studies that emphasized the genetic influences on brain connectivity. Some of these analyzed brain integrity and connectivity using diffusion MRI, and others have mapped genetic effects on functional networks using resting state functional MRI. Connectome-wide genome-wide scans have also been conducted, and we review the multivariate methods required to handle the extremely high dimension of the genomic and network data. We also review some consortium efforts, such as ENIGMA, that offer the power to detect robust common genetic associations using phenotypic harmonization procedures and meta-analysis. Current work on connectome genetics is advancing on many fronts and promises to shed light on how disease risk genes affect the brain. It is already discovering new genetic loci and even entire genetic networks that affect brain organization and connectivity.
Affiliation(s)
- Paul M Thompson
- Imaging Genetics Center, Laboratory of NeuroImaging, Dept. of Neurology, UCLA School of Medicine, Los Angeles, CA 90095, USA.
|