1
Das P, Peterson CB, Ni Y, Reuben A, Zhang J, Zhang J, Do KA, Baladandayuthapani V. Bayesian hierarchical quantile regression with application to characterizing the immune architecture of lung cancer. Biometrics 2023; 79:2474-2488. [PMID: 36239535 PMCID: PMC10102253 DOI: 10.1111/biom.13774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2021] [Accepted: 08/18/2022] [Indexed: 12/14/2022]
Abstract
The successful development and implementation of precision immuno-oncology therapies requires a deeper understanding of the immune architecture at a patient level. T-cell receptor (TCR) repertoire sequencing is a relatively new technology that enables monitoring of T-cells, a subset of immune cells that play a central role in modulating immune response. These immunologic relationships are complex and are governed by various distributional aspects of an individual patient's tumor profile. We propose Bayesian QUANTIle regression for hierarchical COvariates (QUANTICO), which allows simultaneous modeling of hierarchical relationships between multilevel covariates, conducts explicit variable selection, and estimates quantile- and patient-specific coefficient effects to induce individualized inference. We show that QUANTICO outperforms existing approaches in multiple simulation scenarios. We demonstrate the utility of QUANTICO to investigate the effect of TCR variables on immune response in a cohort of lung cancer patients. At the population level, our analyses reveal the mechanistic role of T-cell proportion on the immune cell abundance, with tumor mutation burden as an important factor modulating this relationship. At the patient level, we find several outlier patients based on their quantile-specific coefficient functions, who have higher mutational rates and different smoking history.
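As background for the quantile regression mentioned in this abstract, the sketch below illustrates the standard check (pinball) loss whose minimizer is a quantile. It is not the QUANTICO model; the function names and the toy data are hypothetical and chosen only for illustration.

```python
def check_loss(u, tau):
    """Check (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def empirical_quantile(ys, tau):
    """Return the observed value that minimizes the total check loss."""
    return min(sorted(ys), key=lambda q: sum(check_loss(y - q, tau) for y in ys))


# Toy data with one extreme value (hypothetical, for illustration only).
ys = [1.0, 2.0, 3.0, 4.0, 100.0]
median = empirical_quantile(ys, 0.5)  # tau = 0.5 recovers the median
upper = empirical_quantile(ys, 0.9)   # tau = 0.9 targets the upper tail
print(median, upper)
```

Varying tau moves the fitted value across the distribution, which is why quantile-specific coefficients can describe more than a mean effect.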
Affiliation(s)
- Priyam Das
  - Dept. of Biomedical Informatics, Harvard Medical School
- Yang Ni
  - Dept. of Statistics, Texas A&M University
- Alexandre Reuben
  - Dept. of Thoracic Head and Neck Medical Oncology, University of Texas MD Anderson Cancer Center
- Jiexin Zhang
  - Dept. of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center
- Jianjun Zhang
  - Dept. of Thoracic Head and Neck Medical Oncology, University of Texas MD Anderson Cancer Center
- Kim-Anh Do
  - Dept. of Biostatistics, University of Texas MD Anderson Cancer Center
2
Beaulac C, Wu S, Gibson E, Miranda MF, Cao J, Rocha L, Beg MF, Nathoo FS. Neuroimaging feature extraction using a neural network classifier for imaging genetics. BMC Bioinformatics 2023; 24:271. [PMID: 37391692 DOI: 10.1186/s12859-023-05394-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2022] [Accepted: 06/21/2023] [Indexed: 07/02/2023] Open
Abstract
BACKGROUND Dealing with the high dimension of both neuroimaging data and genetic data is a difficult problem in the association of genetic data to neuroimaging. In this article, we tackle the latter problem with an eye toward developing solutions that are relevant for disease prediction. Supported by a vast literature on the predictive power of neural networks, our proposed solution uses neural networks to extract from neuroimaging data features that are relevant for predicting Alzheimer's Disease (AD) for subsequent relation to genetics. The neuroimaging-genetic pipeline we propose is composed of image processing, neuroimaging feature extraction and genetic association steps. We present a neural network classifier for extracting neuroimaging features that are related to the disease. The proposed method is data-driven and requires no expert advice or a priori selection of regions of interest. We further propose a multivariate regression with priors specified in the Bayesian framework that allows for group sparsity at multiple levels including SNPs and genes. RESULTS We find the features extracted with our proposed method are better predictors of AD than features used previously in the literature, suggesting that single nucleotide polymorphisms (SNPs) related to the features extracted by our proposed method are also more relevant for AD. Our neuroimaging-genetic pipeline led to the identification of some overlapping and, more importantly, some different SNPs when compared to those identified with previously used features. CONCLUSIONS The pipeline we propose combines machine learning and statistical methods to benefit from the strong predictive performance of blackbox models to extract relevant features while preserving the interpretation provided by Bayesian models for genetic association. Finally, we argue in favour of using automatic feature extraction, such as the method we propose, in addition to ROI or voxelwise analysis to find potentially novel disease-relevant SNPs that may not be detected when using ROIs or voxels alone.
Affiliation(s)
- Cédric Beaulac
  - School of Engineering Science, Simon Fraser University, Burnaby, Canada
  - Department of Mathematics and Statistics, University of Victoria, Victoria, Canada
- Sidi Wu
  - Department of Statistics and Actuarial Sciences, Simon Fraser University, Burnaby, Canada
- Erin Gibson
  - School of Engineering Science, Simon Fraser University, Burnaby, Canada
- Michelle F Miranda
  - Department of Mathematics and Statistics, University of Victoria, Victoria, Canada
- Jiguo Cao
  - Department of Statistics and Actuarial Sciences, Simon Fraser University, Burnaby, Canada
- Leno Rocha
  - Department of Mathematics and Statistics, University of Victoria, Victoria, Canada
- Mirza Faisal Beg
  - School of Engineering Science, Simon Fraser University, Burnaby, Canada
- Farouk S Nathoo
  - Department of Mathematics and Statistics, University of Victoria, Victoria, Canada
3
Shoemaker K, Ger R, Court LE, Aerts H, Vannucci M, Peterson CB. Bayesian feature selection for radiomics using reliability metrics. Front Genet 2023; 14:1112914. [PMID: 36968604 PMCID: PMC10030957 DOI: 10.3389/fgene.2023.1112914] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Accepted: 02/23/2023] [Indexed: 03/10/2023] Open
Abstract
Introduction: Imaging of tumors is a standard step in diagnosing cancer and making subsequent treatment decisions. The field of radiomics aims to develop imaging-based biomarkers using methods rooted in artificial intelligence applied to medical imaging. However, a challenging aspect of developing predictive models for clinical use is that many quantitative features derived from image data exhibit instability or lack of reproducibility across different imaging systems or image-processing pipelines. Methods: To address this challenge, we propose a Bayesian sparse modeling approach for image classification based on radiomic features, where the inclusion of more reliable features is favored via a probit prior formulation. Results: We verify through simulation studies that this approach can improve feature selection and prediction given correct prior information. Finally, we illustrate the method with an application to the classification of head and neck cancer patients by human papillomavirus status, using as our prior information a reliability metric quantifying feature stability across different imaging systems.
Affiliation(s)
- Katherine Shoemaker
  - Department of Mathematics and Statistics, University of Houston-Downtown, Houston, TX, United States
  - *Correspondence: Katherine Shoemaker,
- Rachel Ger
  - Department of Radiation Oncology and Molecular Radiation Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, United States
- Laurence E. Court
  - Department of Radiation Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
- Hugo Aerts
  - Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, United States
  - Department of Radiation Oncology, Brigham and Women’s Hospital, Harvard Medical School, Dana-Farber Cancer Institute, Boston, MA, United States
  - Radiology and Nuclear Medicine, CARIM & GROW, Maastricht University, Maastricht, Netherlands
- Marina Vannucci
  - Department of Statistics, Rice University, Houston, TX, United States
- Christine B. Peterson
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
4
Huang W, Tan K, Zhang Z, Hu J, Dong S. A Review of Fusion Methods for Omics and Imaging Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:74-93. [PMID: 35044920 DOI: 10.1109/tcbb.2022.3143900] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]
Abstract
The development of omics data and biomedical images has greatly advanced the progress of precision medicine in diagnosis, treatment, and prognosis. The fusion of omics and imaging data, i.e., omics-imaging fusion, offers a new strategy for understanding complex diseases. However, due to a variety of issues such as the limited number of samples, high dimensionality of features, and heterogeneity of different data types, efficiently learning complementary or associated discriminative fusion information from omics and imaging data remains a challenge. Recently, numerous machine learning methods have been proposed to alleviate these problems. In this review, from the perspective of fusion levels and fusion methods, we first provide an overview of preprocessing and feature extraction methods for omics and imaging data, and comprehensively analyze and summarize the basic forms and variations of commonly used and newly emerging fusion methods, along with their advantages, disadvantages and the applicable scope. We then describe public datasets and compare experimental results of various fusion methods on the ADNI and TCGA datasets. Finally, we discuss future prospects and highlight remaining challenges in the field.
5
Song Y, Ge S, Cao J, Wang L, Nathoo FS. A Bayesian spatial model for imaging genetics. Biometrics 2021; 78:742-753. [PMID: 33765325 DOI: 10.1111/biom.13460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2019] [Revised: 02/08/2021] [Accepted: 02/24/2021] [Indexed: 11/29/2022]
Abstract
We develop a Bayesian bivariate spatial model for multivariate regression analysis applicable to studies examining the influence of genetic variation on brain structure. Our model is motivated by an imaging genetics study of the Alzheimer's Disease Neuroimaging Initiative (ADNI), where the objective is to examine the association between images of volumetric and cortical thickness values summarizing the structure of the brain as measured by magnetic resonance imaging (MRI) and a set of 486 single nucleotide polymorphisms (SNPs) from 33 Alzheimer's disease (AD) candidate genes obtained from 632 subjects. A bivariate spatial process model is developed to accommodate the correlation structures typically seen in structural brain imaging data. First, we allow for spatial correlation on a graph structure in the imaging phenotypes obtained from a neighborhood matrix for measures on the same hemisphere of the brain. Second, we allow for correlation in the same measures obtained from different hemispheres (left/right) of the brain. We develop a mean-field variational Bayes algorithm and a Gibbs sampling algorithm to fit the model. We also incorporate Bayesian false discovery rate (FDR) procedures to select SNPs. We implement the methodology in a new release of the R package bgsmtr. We show that the new spatial model demonstrates superior performance over a standard model in our application. Data used in the preparation of this article were obtained from the ADNI database (https://adni.loni.usc.edu).
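The Bayesian FDR selection step mentioned in this abstract can be illustrated with one common generic rule: rank features by posterior probability of association and keep the largest top-ranked set whose average posterior probability of being a false discovery stays below a target level. This is a sketch of that generic rule only; it is not taken from the paper or from the bgsmtr package, and the function name and probabilities are hypothetical.

```python
def select_by_bayesian_fdr(pip, alpha=0.05):
    """Indices of the largest top-ranked set with expected FDR <= alpha.

    pip[i] is the posterior probability that feature i is truly associated,
    so 1 - pip[i] is its posterior probability of being a false discovery.
    """
    order = sorted(range(len(pip)), key=lambda i: pip[i], reverse=True)
    selected, cum_error = [], 0.0
    for rank, i in enumerate(order, start=1):
        cum_error += 1.0 - pip[i]
        # The running mean is non-decreasing, so stop at the first violation.
        if cum_error / rank > alpha:
            break
        selected.append(i)
    return selected


# Hypothetical posterior inclusion probabilities for five SNPs.
pip = [0.99, 0.40, 0.97, 0.10, 0.95]
chosen = select_by_bayesian_fdr(pip, alpha=0.05)
print(chosen)
```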
Affiliation(s)
- Yin Song
  - Department of Mathematics and Statistics, University of Victoria, British Columbia, Canada
- Shufei Ge
  - Institute of Mathematical Sciences, ShanghaiTech University, Shanghai, China
- Jiguo Cao
  - Statistics and Actuarial Science, Simon Fraser University, British Columbia, Canada
- Liangliang Wang
  - Statistics and Actuarial Science, Simon Fraser University, British Columbia, Canada
- Farouk S Nathoo
  - Department of Mathematics and Statistics, University of Victoria, British Columbia, Canada
6
Nie Y, Opoku E, Yasmin L, Song Y, Wang J, Wu S, Scarapicchia V, Gawryluk J, Wang L, Cao J, Nathoo FS. Spectral dynamic causal modelling of resting-state fMRI: an exploratory study relating effective brain connectivity in the default mode network to genetics. Stat Appl Genet Mol Biol 2020; 19:/j/sagmb.ahead-of-print/sagmb-2019-0058/sagmb-2019-0058.xml. [PMID: 32866136 DOI: 10.1515/sagmb-2019-0058] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/24/2019] [Accepted: 07/27/2020] [Indexed: 11/15/2022]
Abstract
We conduct an imaging genetics study to explore how effective brain connectivity in the default mode network (DMN) may be related to genetics within the context of Alzheimer's disease and mild cognitive impairment. We develop an analysis of longitudinal resting-state functional magnetic resonance imaging (rs-fMRI) and genetic data obtained from a sample of 111 subjects with a total of 319 rs-fMRI scans from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. A Dynamic Causal Model (DCM) is fit to the rs-fMRI scans to estimate effective brain connectivity within the DMN and related to a set of single nucleotide polymorphisms (SNPs) contained in an empirical disease-constrained set which is obtained out-of-sample from 663 ADNI subjects having only genome-wide data. We relate longitudinal effective brain connectivity estimated using spectral DCM to SNPs using both linear mixed effect (LME) models as well as function-on-scalar regression (FSR). In both cases we implement a parametric bootstrap for testing SNP coefficients and make comparisons with p-values obtained from asymptotic null distributions. In both networks at an initial q-value threshold of 0.1 no effects are found. We report on exploratory patterns of associations with relatively high ranks that exhibit stability to the differing assumptions made by both FSR and LME.
Affiliation(s)
- Yunlong Nie
  - Department of Statistics and Actuarial Science, Simon Fraser University, Room SC K10545, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
- Eugene Opoku
  - Department of Mathematics and Statistics, University of Victoria, Victoria, Canada
- Laila Yasmin
  - Department of Mathematics and Statistics, University of Victoria, Victoria, Canada
- Yin Song
  - Department of Mathematics and Statistics, University of Victoria, Victoria, Canada
- Jie Wang
  - Department of Statistics and Actuarial Science, Simon Fraser University, Room SC K10545, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
- Sidi Wu
  - Department of Statistics and Actuarial Science, Simon Fraser University, Room SC K10545, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
- Vanessa Scarapicchia
  - Department of Psychology, University of Victoria, P. O. Box 1700 STN CSC, Victoria, British Columbia, V8W 2Y2, Canada
- Jodie Gawryluk
  - Department of Psychology, University of Victoria, P. O. Box 1700 STN CSC, Victoria, British Columbia, V8W 2Y2, Canada
- Liangliang Wang
  - Department of Statistics and Actuarial Science, Simon Fraser University, Room SC K10545, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
- Jiguo Cao
  - Department of Statistics and Actuarial Science, Simon Fraser University, Room SC K10545, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
- Farouk S Nathoo
  - Department of Mathematics and Statistics, University of Victoria, Victoria, Canada
7
Li Q, Wang X, Liang F, Yi F, Xie Y, Gazdar A, Xiao G. A Bayesian hidden Potts mixture model for analyzing lung cancer pathology images. Biostatistics 2020; 20:565-581. [PMID: 29788035 DOI: 10.1093/biostatistics/kxy019] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/25/2017] [Accepted: 03/18/2018] [Indexed: 01/27/2023] Open
Abstract
Digital pathology imaging of tumor tissues, which captures histological details in high resolution, is fast becoming a routine clinical procedure. Recent developments in deep-learning methods have enabled the identification, characterization, and classification of individual cells from pathology image analysis at a large scale. This creates new opportunities to study the spatial patterns of and interactions among different types of cells. Reliable statistical approaches to modeling such spatial patterns and interactions can provide insight into tumor progression and shed light on the biological mechanisms of cancer. In this article, we consider the problem of modeling a pathology image with irregular locations of three different types of cells: lymphocyte, stromal, and tumor cells. We propose a novel Bayesian hierarchical model, which incorporates a hidden Potts model to project the irregularly distributed cells to a square lattice and a Markov random field prior model to identify regions in a heterogeneous pathology image. The model allows us to quantify the interactions between different types of cells, some of which are clinically meaningful. We use Markov chain Monte Carlo sampling techniques, combined with a double Metropolis-Hastings algorithm, in order to simulate samples approximately from a distribution with an intractable normalizing constant. The proposed model was applied to the pathology images of 205 lung cancer patients from the National Lung Screening Trial, and the results show that the interaction strength between tumor and stromal cells predicts patient prognosis (P = 0.005). This statistical methodology provides a new perspective for understanding the role of cell-cell interactions in cancer progression.
Affiliation(s)
- Qiwei Li
  - Department of Clinical Sciences, UT Southwestern Medical Center, Dallas, TX, USA
- Xinlei Wang
  - Department of Statistics, Southern Methodist University, Dallas, TX, USA
- Faming Liang
  - Department of Statistics, Purdue University, West Lafayette, IN, USA
- Faliu Yi
  - Department of Clinical Sciences, UT Southwestern Medical Center, Dallas, TX, USA
- Yang Xie
  - Department of Clinical Sciences, UT Southwestern Medical Center, Dallas, TX, USA
- Adi Gazdar
  - Department of Pathology, UT Southwestern Medical Center, Dallas, TX, USA
  - Hamon Center for Therapeutic Oncology Research, UT Southwestern Medical Center, Dallas, TX, USA
- Guanghua Xiao
  - Department of Clinical Sciences, UT Southwestern Medical Center, Dallas, TX, USA
8
Jiang S, Xiao G, Koh AY, Kim J, Li Q, Zhan X. A Bayesian zero-inflated negative binomial regression model for the integrative analysis of microbiome data. Biostatistics 2019; 22:522-540. [PMID: 31844880 DOI: 10.1093/biostatistics/kxz050] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 12/18/2018] [Revised: 10/07/2019] [Accepted: 10/09/2019] [Indexed: 12/13/2022] Open
Abstract
Microbiome omics approaches can reveal intriguing relationships between the human microbiome and certain disease states. Along with identification of specific bacterial taxa associated with diseases, recent scientific advancements provide mounting evidence that metabolism, genetics, and environmental factors can all modulate these microbial effects. However, the current methods for integrating microbiome data and other covariates are severely lacking. Hence, we present an integrative Bayesian zero-inflated negative binomial regression model that can both distinguish differentially abundant taxa with distinct phenotypes and quantify covariate-taxa effects. Our model demonstrates good performance using simulated data. Furthermore, we successfully integrated microbiome taxonomies and metabolomics in two real microbiome datasets to provide biologically interpretable findings. In all, we proposed a novel integrative Bayesian regression model that features bacterial differential abundance analysis and quantification of microbiome-covariate effects, which makes it suitable for general microbiome studies.
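For readers unfamiliar with the zero-inflated negative binomial (ZINB) distribution named in this abstract, the following sketch evaluates its probability mass function. The mean/dispersion parameterization (variance mu + mu^2/phi) and the parameter values are assumptions made for illustration; the paper's own parameterization and regression structure are not reproduced here.

```python
import math


def nb_pmf(y, mu, phi):
    """Negative binomial pmf with mean mu and dispersion phi (assumed form)."""
    log_p = (
        math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
        + phi * math.log(phi / (phi + mu))
        + y * math.log(mu / (phi + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, pi, mu, phi):
    """ZINB pmf: a structural zero with probability pi, otherwise NB(mu, phi)."""
    base = (1.0 - pi) * nb_pmf(y, mu, phi)
    return pi + base if y == 0 else base


# Hypothetical parameter values, for illustration only.
total = sum(zinb_pmf(y, 0.3, 4.0, 2.0) for y in range(2000))
p_zero = zinb_pmf(0, 0.3, 4.0, 2.0)
print(total, p_zero)
```

The extra mass at zero is what lets such a model separate taxa that are absent from taxa that are present but unobserved at low sequencing depth.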
Affiliation(s)
- Shuang Jiang
  - Department of Statistical Science, Southern Methodist University, Dallas, TX 75275, USA
- Guanghua Xiao
  - Quantitative Biomedical Research Center, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Andrew Y Koh
  - Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
  - Department of Microbiology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Jiwoong Kim
  - Quantitative Biomedical Research Center, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Qiwei Li
  - Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, TX 75080, USA
- Xiaowei Zhan
  - Quantitative Biomedical Research Center, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
9
Cremaschi A, Argiento R, Shoemaker K, Peterson C, Vannucci M. Hierarchical Normalized Completely Random Measures for Robust Graphical Modeling. BAYESIAN ANALYSIS 2019; 14:1271-1301. [PMID: 32431780 PMCID: PMC7237071 DOI: 10.1214/19-ba1153] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 05/09/2023]
Abstract
Gaussian graphical models are useful tools for exploring network structures in multivariate normal data. In this paper we are interested in situations where data show departures from Gaussianity, therefore requiring alternative modeling distributions. The multivariate t-distribution, obtained by dividing each component of the data vector by a gamma random variable, is a straightforward generalization to accommodate deviations from normality such as heavy tails. Since different groups of variables may be contaminated to a different extent, Finegold and Drton (2014) introduced the Dirichlet t-distribution, where the divisors are clustered using a Dirichlet process. In this work, we consider a more general class of nonparametric distributions as the prior on the divisor terms, namely the class of normalized completely random measures (NormCRMs). To improve the effectiveness of the clustering, we propose modeling the dependence among the divisors through a nonparametric hierarchical structure, which allows for the sharing of parameters across the samples in the data set. This desirable feature enables us to cluster together different components of multivariate data in a parsimonious way. We demonstrate through simulations that this approach provides accurate graphical model inference, and apply it to a case study examining the dependence structure in radiomics data derived from The Cancer Imaging Atlas.
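The scale-mixture construction this abstract alludes to can be written out as follows. This is the textbook representation of the multivariate t-distribution, in which the divisor is the square root of a gamma variable; the symbols are chosen here for illustration and are not taken from the paper. The second line is one way to write the component-wise variant in which each coordinate has its own divisor, which is the setting where clustering the divisors becomes meaningful.

```latex
% Classical multivariate t: one shared divisor for the whole vector
% (Gamma parameterized by shape and rate).
X = \mu + \frac{Y}{\sqrt{\tau}}, \qquad
Y \sim N_p(0, \Sigma), \qquad
\tau \sim \mathrm{Gamma}\!\left(\tfrac{\nu}{2}, \tfrac{\nu}{2}\right)
\;\Longrightarrow\; X \sim t_{\nu}(\mu, \Sigma).

% Component-wise variant: a separate divisor for each coordinate j.
X_j = \mu_j + \frac{Y_j}{\sqrt{\tau_j}}, \qquad j = 1, \dots, p.
```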
Affiliation(s)
- Andrea Cremaschi
  - Department of Cancer Immunology, Institute of Cancer Research, Oslo University Hospital, Oslo, Norway
  - Oslo Centre for Biostatistics and Epidemiology (OCBE), University of Oslo, Oslo, Norway
- Raffaele Argiento
  - ESOMAS Department, University of Torino, Torino, Italy
  - Collegio Carlo Alberto, Torino, Italy
- Katherine Shoemaker
  - Department of Statistics, Rice University, Houston, TX, USA
  - Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Christine Peterson
  - Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, Texas, USA
10
Zhang Y, Morris JS, Aerry SN, Rao AU, Baladandayuthapani V. RADIO-IBAG: RADIOMICS-BASED INTEGRATIVE BAYESIAN ANALYSIS OF MULTIPLATFORM GENOMIC DATA. Ann Appl Stat 2019; 13:1957-1988. [PMID: 33224404 PMCID: PMC7678720 DOI: 10.1214/19-aoas1238] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/19/2022]
Abstract
Technological innovations have produced large multi-modal datasets that include imaging and multi-platform genomics data. Integrative analyses of such data have the potential to reveal important biological and clinical insights into complex diseases like cancer. In this paper, we present Bayesian approaches for integrative analysis of radiological imaging and multi-platform genomic data, wherein our goals are to simultaneously identify genomic and radiomic, i.e., radiology-based imaging markers, along with the latent associations between these two modalities, and to detect the overall prognostic relevance of the combined markers. For this task, we propose Radio-iBAG: Radiomics-based Integrative Bayesian Analysis of Multiplatform Genomic Data, a multi-scale Bayesian hierarchical model that involves several innovative strategies: it incorporates integrative analysis of multi-platform genomic data sets to capture fundamental biological relationships; explores the associations between radiomic markers accompanying genomic information with clinical outcomes; and detects genomic and radiomic markers associated with clinical prognosis. We also introduce the use of sparse Principal Component Analysis (sPCA) to extract a sparse set of approximately orthogonal meta-features each containing information from a set of related individual radiomic features, reducing dimensionality and combining like features. Our methods are motivated by and applied to The Cancer Genome Atlas glioblastoma multiforme data set, wherein we integrate magnetic resonance imaging-based biomarkers along with genomic, epigenomic and transcriptomic data. Our model identifies important magnetic resonance imaging features and the associated genomic platforms that are related to patient survival times.
Affiliation(s)
- Youyi Zhang
  - The University of Texas MD Anderson Cancer Center
11
Nathoo FS, Kong L, Zhu H. A Review of Statistical Methods in Imaging Genetics. CAN J STAT 2019; 47:108-131. [PMID: 31274952 PMCID: PMC6605768 DOI: 10.1002/cjs.11487] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 07/24/2017] [Accepted: 10/08/2018] [Indexed: 12/24/2022]
Abstract
With the rapid growth of modern technology, many biomedical studies are being conducted to collect massive datasets with volumes of multi-modality imaging, genetic, neurocognitive, and clinical information from increasingly large cohorts. Simultaneously extracting and integrating rich and diverse heterogeneous information in neuroimaging and/or genomics from these big datasets could transform our understanding of how genetic variants impact brain structure and function, cognitive function, and brain-related disease risk across the lifespan. Such understanding is critical for diagnosis, prevention, and treatment of numerous complex brain-related disorders (e.g., schizophrenia and Alzheimer's disease). However, the development of analytical methods for the joint analysis of both high-dimensional imaging phenotypes and high-dimensional genetic data, a big data squared (BD²) problem, presents major computational and theoretical challenges for existing analytical methods. Besides the high-dimensional nature of BD², various neuroimaging measures often exhibit strong spatial smoothness and dependence and genetic markers may have a natural dependence structure arising from linkage disequilibrium. We review some recent developments of various statistical techniques for imaging genetics, including massive univariate and voxel-wise approaches, reduced rank regression, mixture models, and group sparse multi-task regression. By doing so, we hope that this review may encourage others in the statistical community to enter into this new and exciting field of research.
Affiliation(s)
- Farouk S Nathoo
  - Department of Mathematics and Statistics, University of Victoria
- Linglong Kong
  - Department of Mathematical and Statistical Sciences, University of Alberta
- Hongtu Zhu
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center
12
Huisman SMH, Mahfouz A, Batmanghelich NK, Lelieveldt BPF, Reinders MJT. A structural equation model for imaging genetics using spatial transcriptomics. Brain Inform 2018; 5:13. [PMID: 30390165 PMCID: PMC6429169 DOI: 10.1186/s40708-018-0091-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/24/2018] [Accepted: 10/21/2018] [Indexed: 11/10/2022] Open
Abstract
Imaging genetics deals with relationships between genetic variation and imaging variables, often in a disease context. The complex relationships between brain volumes and genetic variants have been explored with both dimension reduction methods and model-based approaches. However, these models usually do not make use of the extensive knowledge of the spatio-anatomical patterns of gene activity. We present a method for integrating genetic markers (single nucleotide polymorphisms) and imaging features, which is based on a causal model and, at the same time, uses the power of dimension reduction. We use structural equation models to find latent variables that explain brain volume changes in a disease context, and which are in turn affected by genetic variants. We make use of publicly available spatial transcriptome data from the Allen Human Brain Atlas to specify the model structure, which reduces noise and improves interpretability. The model is tested in a simulation setting and applied on a case study of the Alzheimer’s Disease Neuroimaging Initiative.
Affiliation(s)
- Sjoerd M H Huisman
  - Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
  - Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
- Ahmed Mahfouz
  - Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
  - Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
- Boudewijn P F Lelieveldt
  - Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
  - Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
  - Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands
- Marcel J T Reinders
  - Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
  - Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
13
Vilor-Tejedor N, Alemany S, Cáceres A, Bustamante M, Pujol J, Sunyer J, González JR. Strategies for integrated analysis in imaging genetics studies. Neurosci Biobehav Rev 2018; 93:57-70. [PMID: 29944960 DOI: 10.1016/j.neubiorev.2018.06.013] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 11/29/2017] [Revised: 04/30/2018] [Accepted: 06/15/2018] [Indexed: 02/06/2023]
Abstract
Imaging Genetics (IG) integrates neuroimaging and genomic data from the same individual, deepening our knowledge of the biological mechanisms behind neurodevelopmental domains and neurological disorders. Although the literature on IG has grown exponentially over the past years, the majority of studies have mainly analyzed associations between candidate brain regions and individual genetic variants. However, this strategy is not designed to deal with the complexity of neurobiological mechanisms underlying behavioral and neurodevelopmental domains. Moreover, larger sample sizes and increased multidimensionality of this type of data represent a challenge for standardizing modeling procedures in IG research. This review provides a systematic update of the methods and strategies currently used in IG studies, and serves as an analytical framework for researchers working in this field. To complement the functionalities of the Neuroconductor framework, we also describe existing R packages that implement these methodologies. In addition, we present an overview of how these methodological approaches are applied in integrating neuroimaging and genetic data.
Affiliation(s)
- Natàlia Vilor-Tejedor
- Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain; Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Barcelona Beta Brain Research Center (BBRC) - Pasqual Maragall Foundation, Barcelona, Spain.
- Silvia Alemany
- Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
- Alejandro Cáceres
- Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
- Mariona Bustamante
- Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain; Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Jesús Pujol
- MRI Research Unit, Hospital del Mar, Centro de Investigación Biomédica en Red de Salud Mental, CIBERSAM G21, Barcelona, Spain
- Jordi Sunyer
- Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain; IMIM (Hospital del Mar Medical Research Institute), Barcelona, Spain
- Juan R González
- Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain.
|
14
|
Warnick R, Guindani M, Erhardt E, Allen E, Calhoun V, Vannucci M. A Bayesian Approach for Estimating Dynamic Functional Network Connectivity in fMRI Data. J Am Stat Assoc 2018; 113:134-151. [PMID: 30853734 PMCID: PMC6405235 DOI: 10.1080/01621459.2017.1379404] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 07/01/2016] [Revised: 08/01/2017] [Indexed: 01/22/2023]
Abstract
Dynamic functional connectivity, i.e., the study of how interactions among brain regions change dynamically over the course of an fMRI experiment, has recently received wide interest in the neuroimaging literature. Current approaches for studying dynamic connectivity often rely on ad-hoc approaches for inference, with the fMRI time courses segmented by a sequence of sliding windows. We propose a principled Bayesian approach to dynamic functional connectivity, which is based on the estimation of time varying networks. Our method utilizes a hidden Markov model for classification of latent cognitive states, achieving estimation of the networks in an integrated framework that borrows strength over the entire time course of the experiment. Furthermore, we assume that the graph structures, which define the connectivity states at each time point, are related within a super-graph, to encourage the selection of the same edges among related graphs. We apply our method to simulated task-based fMRI data, where we show how our approach allows the decoupling of the task-related activations and the functional connectivity states. We also analyze data from an fMRI sensorimotor task experiment on an individual healthy subject and obtain results that support the role of particular anatomical regions in modulating interaction between executive control and attention networks.
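For orientation, the sliding-window procedure that this abstract contrasts itself with can be sketched in a few lines. This is only an illustration of the ad hoc baseline, not the authors' hidden Markov model; the function name `sliding_window_correlation`, the window width, the step size, and the random input data are all assumptions made for the example.

```python
import numpy as np

def sliding_window_correlation(X, width, step=1):
    """Return one correlation matrix per window.

    X     : array of shape (T, R), T time points by R regions (hypothetical data)
    width : number of time points per window (assumed value, not from the paper)
    step  : shift between consecutive windows
    """
    T, _ = X.shape
    starts = range(0, T - width + 1, step)
    # np.corrcoef treats rows as variables, so each window is transposed.
    return np.stack([np.corrcoef(X[s:s + width].T) for s in starts])

# Toy input: 100 time points, 3 regions of pure noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
C = sliding_window_correlation(X, width=30, step=10)
print(C.shape)  # (8, 3, 3): eight windows, one 3x3 correlation matrix each
```

Each window is estimated in isolation, which is the limitation the abstract points to: nothing is shared across windows, whereas the proposed model borrows strength over the whole time course.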
Affiliation(s)
- Ryan Warnick
- Department of Statistics, Rice University, Houston, TX
- Michele Guindani
- Department of Statistics, University of California at Irvine, Irvine, CA
- Erik Erhardt
- Department of Mathematics and Statistics, University of New Mexico, Albuquerque, NM
- Elena Allen
- Research Scientist, Medici Technologies, Albuquerque, NM
- Vince Calhoun
- Distinguished Professor, Departments of Electrical and Computer Engineering, University of New Mexico
- Marina Vannucci
- Noah Harding Professor and Chair, Department of Statistics, Rice University
|
15
|
Chiang S, Guindani M, Yeh HJ, Dewar S, Haneef Z, Stern JM, Vannucci M. A Hierarchical Bayesian Model for the Identification of PET Markers Associated to the Prediction of Surgical Outcome after Anterior Temporal Lobe Resection. Front Neurosci 2017; 11:669. [PMID: 29259537 PMCID: PMC5723403 DOI: 10.3389/fnins.2017.00669] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 04/26/2017] [Accepted: 11/17/2017] [Indexed: 01/19/2023] Open
Abstract
We develop an integrative Bayesian predictive modeling framework that identifies individual pathological brain states based on the selection of fluoro-deoxyglucose positron emission tomography (PET) imaging biomarkers and evaluates the association of those states with a clinical outcome. We consider data from a study on temporal lobe epilepsy (TLE) patients who subsequently underwent anterior temporal lobe resection. Our modeling framework looks at the observed profiles of regional glucose metabolism in PET as the phenotypic manifestation of a latent individual pathologic state, which is assumed to vary across the population. The modeling strategy we adopt allows the identification of patient subgroups characterized by latent pathologies differentially associated to the clinical outcome of interest. It also identifies imaging biomarkers characterizing the pathological states of the subjects. In the data application, we identify a subgroup of TLE patients at high risk for post-surgical seizure recurrence after anterior temporal lobe resection, together with a set of discriminatory brain regions that can be used to distinguish the latent subgroups. We show that the proposed method achieves high cross-validated accuracy in predicting post-surgical seizure recurrence.
Affiliation(s)
- Sharon Chiang
- Department of Statistics, Rice University, Houston, TX, United States; School of Medicine, Baylor College of Medicine, Houston, TX, United States
- Michele Guindani
- Department of Statistics, University of California, Irvine, Irvine, CA, United States
- Hsiang J Yeh
- Department of Neurology, University of California, Los Angeles, Los Angeles, CA, United States
- Sandra Dewar
- Department of Neurology, University of California, Los Angeles, Los Angeles, CA, United States
- Zulfi Haneef
- Department of Neurology, Baylor College of Medicine, Houston, TX, United States
- John M Stern
- Department of Neurology, University of California, Los Angeles, Los Angeles, CA, United States
- Marina Vannucci
- Department of Statistics, Rice University, Houston, TX, United States
|
16
|
Greenlaw K, Szefer E, Graham J, Lesperance M, Nathoo FS. A Bayesian group sparse multi-task regression model for imaging genetics. Bioinformatics 2017; 33:2513-2522. [PMID: 28419235 PMCID: PMC5870710 DOI: 10.1093/bioinformatics/btx215] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 04/28/2016] [Revised: 02/20/2017] [Accepted: 04/12/2017] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Recent advances in technology for brain imaging and high-throughput genotyping have motivated studies examining the influence of genetic variation on brain structure. Wang et al. have developed an approach for the analysis of imaging genomic studies using penalized multi-task regression with regularization based on a novel group l2,1-norm penalty which encourages structured sparsity at both the gene level and SNP level. While incorporating a number of useful features, the proposed method only furnishes a point estimate of the regression coefficients; techniques for conducting statistical inference are not provided. A new Bayesian method is proposed here to overcome this limitation.
RESULTS: We develop a Bayesian hierarchical modeling formulation where the posterior mode corresponds to the estimator proposed by Wang et al. and an approach that allows for full posterior inference including the construction of interval estimates for the regression parameters. We show that the proposed hierarchical model can be expressed as a three-level Gaussian scale mixture and this representation facilitates the use of a Gibbs sampling algorithm for posterior simulation. Simulation studies demonstrate that the interval estimates obtained using our approach achieve adequate coverage probabilities that outperform those obtained from the nonparametric bootstrap. Our proposed methodology is applied to the analysis of neuroimaging and genetic data collected as part of the Alzheimer's Disease Neuroimaging Initiative (ADNI), and this analysis of the ADNI cohort demonstrates clearly the value added of incorporating interval estimation beyond only point estimation when relating SNPs to brain imaging endophenotypes.
AVAILABILITY AND IMPLEMENTATION: Software and sample data are available as an R package 'bgsmtr' that can be downloaded from The Comprehensive R Archive Network (CRAN).
CONTACT: nathoo@uvic.ca.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
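As a rough illustration of the penalty mentioned under MOTIVATION, the sketch below computes one common form of a row-wise l2,1 norm and a group l2,1 norm for a coefficient matrix. It is an assumption-laden toy, not the `bgsmtr` implementation: the function names, the SNP-to-gene grouping, and the absence of group weights are all choices made for the example, and the exact penalty used by Wang et al. may differ in weighting.

```python
import numpy as np

def l21_norm(W):
    """Sum of row-wise Euclidean norms (SNP-level sparsity)."""
    return float(np.sqrt((W ** 2).sum(axis=1)).sum())

def group_l21_norm(W, groups):
    """Sum over groups of the Frobenius norm of each group's rows (gene-level sparsity).

    groups : list of row-index lists, one list per gene (hypothetical grouping)
    """
    return float(sum(np.linalg.norm(W[idx, :]) for idx in groups))

# Toy coefficient matrix: 4 SNPs (rows) by 2 imaging phenotypes (columns).
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0],
              [0.0, 2.0]])
groups = [[0, 1], [2, 3]]  # assumed mapping: SNPs 0-1 in gene A, SNPs 2-3 in gene B

print(l21_norm(W))                # row norms 5 + 0 + 1 + 2 = 8.0
print(group_l21_norm(W, groups))  # 5 + sqrt(5)
```

Penalizing the group term pushes whole genes to zero, while the row term pushes individual SNPs to zero across all phenotypes, which is the two-level sparsity the abstract describes.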
Affiliation(s)
- Keelin Greenlaw
- Mathematics and Statistics, University of Victoria, Victoria, BC, Canada
- Elena Szefer
- Statistics and Actuarial Science, Simon Fraser University, Burnaby, BC, Canada
- Jinko Graham
- Statistics and Actuarial Science, Simon Fraser University, Burnaby, BC, Canada
- Mary Lesperance
- Mathematics and Statistics, University of Victoria, Victoria, BC, Canada
- Farouk S Nathoo
- Mathematics and Statistics, University of Victoria, Victoria, BC, Canada
|
17
|
Kundu S, Kang J. Semiparametric Bayes conditional graphical models for imaging genetics applications. Stat (Int Stat Inst) 2016; 5:322-337. [PMID: 28616224 DOI: 10.1002/sta4.119] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/26/2022]
Abstract
Motivated by the need for understanding neurological disorders, large-scale imaging genetic studies are being increasingly conducted. A salient objective in such studies is to identify important neuroimaging biomarkers such as the brain functional connectivity, as well as genetic biomarkers, which are predictive of disorders. However, typical approaches for estimating the group level brain functional connectivity do not account for potential variation resulting from demographic and genetic factors, while usual methods for discovering genetic biomarkers do not factor in the influence of the brain network on the imaging phenotype. We propose a novel semiparametric Bayesian conditional graphical model for joint variable selection and graph estimation, which simultaneously estimates the brain network after accounting for heterogeneity, and infers significant genetic biomarkers. The proposed approach specifies priors on the regression coefficients that cluster brain regions having similar activation patterns depending on covariates, leading to dimension reduction. A novel graphical prior is proposed, which encourages modularity in brain organization by specifying denser connections within clusters and sparser connections across clusters. The posterior computation proceeds via Markov chain Monte Carlo. We apply the approach to data obtained from the Alzheimer's Disease Neuroimaging Initiative and demonstrate numerical advantages via simulation studies.
Affiliation(s)
- Suprateek Kundu
- Department of Biostatistics, Emory University, 1518 Clifton Road NE, Atlanta, GA 30322, USA
- Jian Kang
- Department of Biostatistics, University of Michigan, 3651 Tower, 1415 Washington Heights, Ann Arbor, MI 48019, USA
|
18
|
Tao C, Nichols TE, Hua X, Ching CRK, Rolls ET, Thompson PM, Feng J. Generalized reduced rank latent factor regression for high dimensional tensor fields, and neuroimaging-genetic applications. Neuroimage 2016; 144:35-57. [PMID: 27666385 DOI: 10.1016/j.neuroimage.2016.08.027] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/26/2015] [Revised: 08/01/2016] [Accepted: 08/14/2016] [Indexed: 11/18/2022] Open
Abstract
We propose a generalized reduced rank latent factor regression model (GRRLF) for the analysis of tensor field responses and high dimensional covariates. The model is motivated by the need from imaging-genetic studies to identify genetic variants that are associated with brain imaging phenotypes, often in the form of high dimensional tensor fields. GRRLF identifies from the structure in the data the effective dimensionality of the data, and then jointly performs dimension reduction of the covariates, dynamic identification of latent factors, and nonparametric estimation of both covariate and latent response fields. After accounting for the latent and covariate effects, GRRLF performs a nonparametric test on the remaining factor of interest. GRRLF provides a better factorization of the signals compared with common solutions, and is less susceptible to overfitting because it exploits the effective dimensionality. The generality and the flexibility of GRRLF also allow various statistical models to be handled in a unified framework and solutions can be efficiently computed. Within the field of neuroimaging, it improves the sensitivity for weak signals and is a promising alternative to existing approaches. The operation of the framework is demonstrated with both synthetic datasets and a real-world neuroimaging example in which the effects of a set of genes on the structure of the brain at the voxel level were measured, and the results compared favorably with those from existing approaches.
Affiliation(s)
- Chenyang Tao
- Centre for Computational Systems Biology and School of Mathematical Sciences, Fudan University, Shanghai, PR China; Department of Computer Science, Warwick University, Coventry, UK
- Xue Hua
- Imaging Genetics Center, Institute for Neuroimaging & Informatics, University of Southern California, Los Angeles, CA, USA
- Christopher R K Ching
- Imaging Genetics Center, Institute for Neuroimaging & Informatics, University of Southern California, Los Angeles, CA, USA; Interdepartmental Neuroscience Graduate Program, UCLA School of Medicine, Los Angeles, CA, USA
- Edmund T Rolls
- Department of Computer Science, Warwick University, Coventry, UK; Oxford Centre for Computational Neuroscience, Oxford, UK
- Paul M Thompson
- Imaging Genetics Center, Institute for Neuroimaging & Informatics, University of Southern California, Los Angeles, CA, USA; Departments of Neurology, Psychiatry, Radiology, Engineering, Pediatrics, and Ophthalmology, USC, Los Angeles, CA, USA
- Jianfeng Feng
- Centre for Computational Systems Biology and School of Mathematical Sciences, Fudan University, Shanghai, PR China; Department of Computer Science, Warwick University, Coventry, UK; School of Life Science and the Collaborative Innovation Center for Brain Science, Fudan University, Shanghai 200433, PR China.
|
19
|
Chekouo T, Stingo FC, Guindani M, Do KA. A Bayesian predictive model for imaging genetics with application to schizophrenia. Ann Appl Stat 2016. [DOI: 10.1214/16-aoas948] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 11/19/2022]
|
20
|
Raman S, Deserno L, Schlagenhauf F, Stephan KE. A hierarchical model for integrating unsupervised generative embedding and empirical Bayes. J Neurosci Methods 2016; 269:6-20. [DOI: 10.1016/j.jneumeth.2016.04.022] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 11/09/2015] [Revised: 04/23/2016] [Accepted: 04/24/2016] [Indexed: 11/25/2022]
|
21
|
Abstract
Integrative analysis of genomic and anatomical imaging data, which is still emerging and not yet well developed, provides invaluable information for the holistic discovery of the genomic structure of disease and has the potential to open a new avenue for discovering novel disease susceptibility genes that cannot be identified when the two data types are analyzed separately. A key issue for the success of imaging and genomic data analysis is how to reduce their dimensions. Most previous methods for imaging information extraction and RNA-seq data reduction do not explore imaging spatial information and often ignore gene expression variation at the genomic positional level. To overcome these limitations, we extend functional principal component analysis from one dimension to two dimensions (2DFPCA) for representing imaging data and develop a multiple functional linear model (MFLM) in which functional principal scores of images are taken as multiple quantitative traits and the RNA-seq profile across a gene is taken as a function predictor for assessing the association of gene expression with images. The developed method has been applied to image and RNA-seq data of ovarian cancer and kidney renal clear cell carcinoma (KIRC) studies. We identified 24 and 84 genes whose expressions were associated with imaging variations in ovarian cancer and KIRC studies, respectively. Our results showed that many genes significantly associated with images were not differentially expressed, but revealed their morphological and metabolic functions. The results also demonstrated that the peaks of the estimated regression coefficient function in the MFLM often allowed the discovery of splicing sites and multiple isoforms of gene expressions.
|
22
|
Zhang L, Guindani M, Vannucci M. Bayesian Models for fMRI Data Analysis. Wiley Interdiscip Rev Comput Stat 2015; 7:21-41. [PMID: 25750690 PMCID: PMC4346370 DOI: 10.1002/wics.1339] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Indexed: 11/07/2022]
Abstract
Functional magnetic resonance imaging (fMRI), a noninvasive neuroimaging method that provides an indirect measure of neuronal activity by detecting blood flow changes, has experienced an explosive growth in the past years. Statistical methods play a crucial role in understanding and analyzing fMRI data. Bayesian approaches, in particular, have shown great promise in applications. A remarkable feature of fully Bayesian approaches is that they allow a flexible modeling of spatial and temporal correlations in the data. This paper provides a review of the most relevant models developed in recent years. We divide methods according to the objective of the analysis. We start from spatio-temporal models for fMRI data that detect task-related activation patterns. We then address the very important problem of estimating brain connectivity. We also touch upon methods that focus on making predictions of an individual's brain activity or a clinical or behavioral response. We conclude with a discussion of recent integrative models that aim at combining fMRI data with other imaging modalities, such as EEG/MEG and DTI data, measured on the same subjects. We also briefly discuss the emerging field of imaging genetics.
Affiliation(s)
- Linlin Zhang
- Department of Statistics, Rice University, Houston, TX 77005, USA
- Michele Guindani
- Department of Biostatistics, UT M.D. Anderson Cancer Center, Houston, TX 77230, USA
- Marina Vannucci
- Department of Statistics, Rice University, Houston, TX 77005, USA
|
23
|
Lin D, Cao H, Calhoun VD, Wang YP. Sparse models for correlative and integrative analysis of imaging and genetic data. J Neurosci Methods 2014; 237:69-78. [PMID: 25218561 DOI: 10.1016/j.jneumeth.2014.09.001] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 05/05/2014] [Revised: 08/27/2014] [Accepted: 09/01/2014] [Indexed: 11/29/2022]
Abstract
The development of advanced medical imaging technologies and high-throughput genomic measurements has enhanced our ability to understand their interplay as well as their relationship with human behavior by integrating these two types of datasets. However, the high dimensionality and heterogeneity of these datasets present a challenge to conventional statistical methods; there is a high demand for the development of both correlative and integrative analysis approaches. Here, we review our recent work on developing sparse representation based approaches to address this challenge. We show how sparse models are applied to the correlation and integration of imaging and genetic data for biomarker identification. We present examples of how these approaches are used for the detection of risk genes and classification of complex diseases such as schizophrenia. Finally, we discuss future directions on the integration of multiple imaging and genomic datasets including their interactions such as epistasis.
Affiliation(s)
- Dongdong Lin
- Department of Biomedical Engineering, Tulane University, New Orleans, LA, 70118, USA; Center of Genomics and Bioinformatics, Tulane University, New Orleans, LA, 70112, USA.
- Hongbao Cao
- Unit on Statistical Genomics, Intramural Program of Research, National Institute of Mental Health, NIH, Bethesda 20852, USA.
- Vince D Calhoun
- The Mind Research Network & LBERI, Albuquerque, NM 87106, USA; Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM 87131, USA.
- Yu-Ping Wang
- Department of Biomedical Engineering, Tulane University, New Orleans, LA, 70118, USA; Center of Genomics and Bioinformatics, Tulane University, New Orleans, LA, 70112, USA.
|
24
|
|