1
Kawaguchi A, Yamashita F. Multivariate Analyses with Two-Step Dimension Reduction for an Association Study Between 11C-Pittsburgh Compound B and Magnetic Resonance Imaging in Alzheimer's Disease. Bioengineering (Basel) 2025; 12:48. [PMID: 39851322 PMCID: PMC11759775 DOI: 10.3390/bioengineering12010048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2024] [Revised: 12/05/2024] [Accepted: 12/19/2024] [Indexed: 01/26/2025] [Open Access]
Abstract
The neuropathological diagnosis of Alzheimer's disease (AD) relies on amyloid beta (Aβ) deposition in brain tissues. To study the relationship between Aβ deposition and brain structure, as determined using 11C-Pittsburgh compound B (PiB) and magnetic resonance imaging (MRI), respectively, we developed a regression model with PiB and MRI data as the predictor and response variables, respectively. We proposed a regression method for studying the association between them, based on supervised sparse multivariate analysis with dimension reduction using a composite paired basis function. When this method was applied to imaging data from 61 patients with AD (age: 55-85), the first component showed the strongest correlation with the composite score, owing to the supervised feature. The spatial pattern included the hippocampal and parahippocampal regions for MRI. The peak value was observed in the posterior cingulate and precuneus for PiB. The differences in PiB scores among the diagnosis groups 12 months after PiB imaging were significant between the normal and AD groups (p = 0.0284), but not between the normal and mild cognitive impairment (MCI) groups or the MCI and AD groups (p = 0.3508). Our method may facilitate the development of a dementia biomarker from brain imaging data. Scoring imaging data allows for visualization and the application of traditional analysis, facilitating clinical analysis for better interpretation of results.
Affiliation(s)
- Atsushi Kawaguchi
- Faculty of Medicine, Saga University, 5-1-1 Nabeshima, Saga 849-8501, Japan
- Fumio Yamashita
- Division of Ultrahigh Field MRI, Iwate Medical University, 1-1-1 Idaidori, Yahaba 028-3694, Japan
2
Kawaguchi A. Network-based diagnostic probability estimation from resting-state functional magnetic resonance imaging. Math Biosci Eng 2023; 20:17702-17725. [PMID: 38052533 DOI: 10.3934/mbe.2023787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]
Abstract
Brain functional connectivity is a useful biomarker for diagnosing brain disorders. Connectivity is measured using resting-state functional magnetic resonance imaging (rs-fMRI). Previous studies have used a sequential application of the graphical model for network estimation and machine learning to construct predictive formulas for determining outcomes (e.g., disease or health) from the estimated network. However, the resulting network had limited utility for diagnosis because it was estimated independent of the outcome. In this study, we proposed a regression method with scores from rs-fMRI based on supervised sparse hierarchical components analysis (SSHCA). SSHCA has a hierarchical structure that consists of a network model (block scores at the individual level) and a scoring model (super scores at the population level). A regression model, such as the multiple logistic regression model with super scores as the predictor, was used to estimate diagnostic probabilities. An advantage of the proposed method was that the outcome-related (supervised) network connections and multiple scores corresponding to the sub-network estimation were helpful for interpreting the results. Our results in the simulation study and application to real data show that it is possible to predict diseases with high accuracy using the constructed model.
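The final step described in the abstract above, a multiple logistic regression with super scores as predictors, can be sketched as follows. This is only an illustration of how a diagnostic probability is read off a fitted logistic model; it does not implement SSHCA, and the function name, the super score values, and the coefficients are all hypothetical.

```python
import math


def diagnostic_probability(super_scores, coefs, intercept):
    """Return P(disease) under a multiple logistic regression on super scores."""
    # Linear predictor: intercept plus the weighted sum of the super scores.
    linear = intercept + sum(b * s for b, s in zip(coefs, super_scores))
    # The logistic function maps the linear predictor to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical super scores for one subject and hypothetical fitted coefficients.
subject_scores = [0.8, -1.2]
probability = diagnostic_probability(subject_scores, coefs=[1.5, 0.5], intercept=-0.2)
print(f"estimated diagnostic probability: {probability:.3f}")
```

In practice the coefficients would be estimated from labelled training data, and the super scores would come from the supervised decomposition rather than being typed in by hand.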
3
Zhang X, Hao Y, Zhang J, Ji Y, Zou S, Zhao S, Xie S, Du L. A multi-task SCCA method for brain imaging genetics and its application in neurodegenerative diseases. Comput Methods Programs Biomed 2023; 232:107450. [PMID: 36905750 DOI: 10.1016/j.cmpb.2023.107450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Revised: 02/24/2023] [Accepted: 02/24/2023] [Indexed: 06/18/2023]
Abstract
BACKGROUND AND OBJECTIVES: In brain imaging genetics, multi-task sparse canonical correlation analysis (MTSCCA) is effective for studying the bi-multivariate associations between genetic variations such as single nucleotide polymorphisms (SNPs) and multi-modal imaging quantitative traits (QTs). However, most existing MTSCCA methods are neither supervised nor capable of distinguishing the shared patterns of multi-modal imaging QTs from the specific patterns.
METHODS: A new diagnosis-guided MTSCCA (DDG-MTSCCA) with parameter decomposition and graph-guided pairwise group lasso penalty was proposed. Specifically, the multi-tasking modeling paradigm enables us to comprehensively identify risk genetic loci by jointly incorporating multi-modal imaging QTs. The regression sub-task was introduced to guide the selection of diagnosis-related imaging QTs. To reveal the diverse genetic mechanisms, the parameter decomposition and different constraints were utilized to facilitate the identification of modality-consistent and -specific genotypic variations. In addition, a network constraint was added to identify meaningful brain networks. The proposed method was applied to synthetic data and to two real neuroimaging data sets, from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and Parkinson's Progression Markers Initiative (PPMI) databases, respectively.
RESULTS: Compared with the competing methods, the proposed method exhibited higher or comparable canonical correlation coefficients (CCCs) and better feature selection results. In particular, in the simulation study, DDG-MTSCCA showed the best anti-noise ability and achieved the highest average hit rate, about 25% higher than MTSCCA. On the real data of Alzheimer's disease (AD) and Parkinson's disease (PD), our method obtained the highest average testing CCCs, about 40% to 50% higher than MTSCCA. Notably, our method could select more comprehensive feature subsets, and the top five SNPs and imaging QTs were all disease-related. The ablation experimental results also demonstrated the significance of each component in the model, i.e., the diagnosis guidance, parameter decomposition, and network constraint.
CONCLUSIONS: These results on simulated data and the ADNI and PPMI cohorts suggested the effectiveness and generalizability of our method in identifying meaningful disease-related markers. DDG-MTSCCA could be a powerful tool in brain imaging genetics, worthy of in-depth study.
Affiliation(s)
- Xin Zhang
- Institute of Medical Research, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Yipeng Hao
- Institute of Medical Research, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Jin Zhang
- School of Automation, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Yanuo Ji
- Institute of Medical Research, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Shihong Zou
- Institute of Medical Research, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Shijie Zhao
- School of Automation, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Songyun Xie
- School of Electronics and Information, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Lei Du
- School of Automation, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
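The sparse canonical correlation analysis that the Zhang et al. abstract above builds on can be illustrated with a generic single-pair sketch. This is not DDG-MTSCCA: it has no multi-task structure, diagnosis guidance, parameter decomposition, or graph penalty. It only shows the common idea of alternating L1 soft-thresholded updates of two weight vectors, under the simplifying assumptions that the columns of both data blocks are centered and that each within-block covariance is treated as the identity. All function names, data values, and penalty levels are hypothetical.

```python
import math


def soft_threshold(vec, lam):
    """Shrink each entry toward zero by lam; entries within lam of zero become 0."""
    out = []
    for a in vec:
        shrunk = abs(a) - lam
        out.append((shrunk if a > 0 else -shrunk) if shrunk > 0 else 0.0)
    return out


def normalize(vec):
    """Scale to unit Euclidean length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(a * a for a in vec))
    return [a / norm for a in vec] if norm > 0 else vec


def sparse_cca(X, Y, lam_u, lam_v, n_iter=50):
    """Alternating soft-thresholded updates for one pair of sparse weights (u, v)."""
    n, p, q = len(X), len(X[0]), len(Y[0])
    # Cross-product matrix C = X^T Y (columns of X and Y are assumed centered).
    C = [[sum(X[i][j] * Y[i][k] for i in range(n)) for k in range(q)]
         for j in range(p)]
    u = [0.0] * p
    v = normalize([1.0] * q)
    for _ in range(n_iter):
        # Update u from C v, then v from C^T u, thresholding and rescaling each.
        u = normalize(soft_threshold(
            [sum(C[j][k] * v[k] for k in range(q)) for j in range(p)], lam_u))
        v = normalize(soft_threshold(
            [sum(C[j][k] * u[j] for j in range(p)) for k in range(q)], lam_v))
    return u, v


# Toy centered data: the first column of X tracks the first column of Y;
# the remaining columns are small perturbations.
X = [[-2.0, 0.1, 0.0], [-1.0, -0.1, 0.1], [0.0, 0.0, -0.2],
     [1.0, 0.1, 0.1], [2.0, -0.1, 0.0]]
Y = [[-2.0, 0.1], [-1.0, 0.0], [0.0, -0.2], [1.0, 0.0], [2.0, 0.1]]
u, v = sparse_cca(X, Y, lam_u=0.5, lam_v=0.5)
print("u =", u)
print("v =", v)
```

The penalties here act on the unnormalized cross-product, so suitable values depend on the scale of the data; in applied work they would typically be chosen by cross-validation.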
4
Xin Y, Sheng J, Miao M, Wang L, Yang Z, Huang H. A review of imaging genetics in Alzheimer's disease. J Clin Neurosci 2022; 100:155-163. [PMID: 35487021 DOI: 10.1016/j.jocn.2022.04.017] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/01/2022] [Revised: 03/01/2022] [Accepted: 04/15/2022] [Indexed: 01/18/2023]
Abstract
Determining the association between genetic variation and phenotype is a key step in studying the mechanism of Alzheimer's disease (AD), laying the foundation for studying drug therapies and biomarkers. AD is the most common type of dementia in the aged population. At present, three early-onset AD genes (APP, PSEN1, PSEN2) and one late-onset AD susceptibility gene, apolipoprotein E (APOE), have been identified. However, the pathogenesis of AD remains unknown. Imaging genetics, an emerging interdisciplinary field, is able to reveal the complex mechanisms from the genetic level to human cognition and mental disorders via macroscopic intermediates. This paper reviews methods for establishing genotype-phenotype correlations, including sparse canonical correlation analysis, sparse reduced rank regression, and sparse partial least squares. We found that most research work did poorly in supervised learning and in exploring the nonlinear relationship between single nucleotide polymorphisms (SNPs) and quantitative traits (QTs).
Affiliation(s)
- Yu Xin
- College of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China
- Jinhua Sheng
- College of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China
- Miao Miao
- Beijing Hospital, Beijing 100730, China; National Center of Gerontology, Beijing 100730, China; Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing 100730, China
- Luyun Wang
- College of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China; Hangzhou Vocational & Technical College, Hangzhou, Zhejiang 310018, China
- Ze Yang
- College of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China
- He Huang
- College of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China
5
Sheng J, Wang L, Cheng H, Zhang Q, Zhou R, Shi Y. Strategies for multivariate analyses of imaging genetics study in Alzheimer's disease. Neurosci Lett 2021; 762:136147. [PMID: 34332030 DOI: 10.1016/j.neulet.2021.136147] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/29/2020] [Revised: 03/27/2021] [Accepted: 07/26/2021] [Indexed: 11/16/2022]
Abstract
Alzheimer's disease (AD) is an incurable neurodegenerative disease primarily affecting the elderly population. Early diagnosis of AD is critical for the management of this disease. Imaging genetics examines the influence of genetic variants (i.e., single nucleotide polymorphisms (SNPs)) on brain structure and function, and many novel imaging genetics approaches have been proposed for studying AD. We review and synthesize the Alzheimer's Disease Neuroimaging Initiative (ADNI) genetic associations with quantitative disease endophenotypes including structural and functional neuroimaging, diffusion tensor imaging (DTI), positron emission tomography (PET), and fluid biomarker assays. In this review, we survey recent publications using neuroimaging and genetic data of AD, with a focus on methods that capture multivariate effects and accommodate the large number of variables from both imaging data and genetic data. We review methods focused on bridging the imaging and genetic data by establishing genotype-phenotype association, including sparse canonical correlation analysis, parallel independent component analysis, sparse reduced rank regression, sparse partial least squares, and genome-wide association study. The broad availability and wide scope of ADNI genetic and phenotypic data has advanced our understanding of the genetic basis of AD and has nominated novel targets for future pharmaceutical therapy and biomarker development.
Affiliation(s)
- Jinhua Sheng
- School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China
- Luyun Wang
- School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China; College of Information Engineering, Hangzhou Vocational & Technical College, Hangzhou, Zhejiang 310018, China
- Hu Cheng
- Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405, USA
- Rougang Zhou
- Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China; School of Mechanical Engineering, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Mstar Technologies Inc., Hangzhou, Zhejiang 310018, China
- Yuchen Shi
- School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China
6
Vilor-Tejedor N, Ikram MA, Roshchupkin GV, Cáceres A, Alemany S, Vernooij MW, Niessen WJ, van Duijn CM, Sunyer J, Adams HH, González JR. Independent Multiple Factor Association Analysis for Multiblock Data in Imaging Genetics. Neuroinformatics 2020; 17:583-592. [PMID: 30903541 DOI: 10.1007/s12021-019-09416-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 01/02/2023]
Abstract
Multivariate methods have the potential to better capture complex relationships that may exist between different biological levels. Multiple Factor Analysis (MFA) is one of the most popular methods to obtain factor scores and measures of discrepancy between data sets. However, singular value decomposition in MFA is based on PCA, which is adequate only if the data is normally distributed, linear or stationary. In addition, including strongly correlated variables can overemphasize the contribution of the estimated components. In this work, we introduced a novel method referred to as Independent Multifactorial Analysis (ICA-MFA) to derive relevant features from multiscale data. This method is an extended implementation of MFA, where the component value decomposition is based on Independent Component Analysis. In addition, ICA-MFA incorporates a predictive step based on an Independent Component Regression. We evaluated and compared the performance of ICA-MFA with both the MFA method and traditional univariate analyses in a simulation study. We showed that ICA-MFA explained up to 10-fold more variance than MFA and univariate methods. We applied the proposed algorithm in a study of 4057 individuals belonging to the population-based Rotterdam Study with available genetic and neuroimaging data, as well as information about executive cognitive functioning. Specifically, we used ICA-MFA to detect relevant genetic features related to structural brain regions, which in turn were involved in the mechanisms of executive cognitive function. The proposed strategy makes it possible to determine the degree to which the whole set of genetic and/or neuroimaging markers contribute to the variability of the symptomatology jointly, rather than individually. While univariate results and MFA combinations only explained a limited proportion of variance (less than 2%), our method increased the explained variance (10%) and allowed the identification of significant components that maximize the variance explained in the model. The potential application of the ICA-MFA algorithm constitutes an important aspect of integrating multivariate multiscale data, specifically in the field of Neurogenetics.
Affiliation(s)
- Natalia Vilor-Tejedor
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, C. Doctor Aiguader 88, Edif. PRBB, 08003, Barcelona, Spain; BarcelonaBeta Brain Research Center (BBRC), Pasqual Maragall Foundation, Barcelona, Spain; Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
- Gennady V Roshchupkin
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, the Netherlands; Department of Medical Informatics, Erasmus MC, Rotterdam, the Netherlands
- Alejandro Cáceres
- Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
- Silvia Alemany
- Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Meike W Vernooij
- Department of Epidemiology, Erasmus MC, Rotterdam, the Netherlands; Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, the Netherlands
- Wiro J Niessen
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, the Netherlands; Department of Medical Informatics, Erasmus MC, Rotterdam, the Netherlands; Faculty of Applied Sciences, Delft University of Technology, Delft, The Netherlands
- Jordi Sunyer
- Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain; IMIM (Hospital del Mar Medical Research Institute), Barcelona, Spain
- Hieab H Adams
- Department of Epidemiology, Erasmus MC, Rotterdam, the Netherlands; Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, the Netherlands; Department of Medical Informatics, Erasmus MC, Rotterdam, the Netherlands
- Juan R González
- Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
7
Csala A, Hof MH, Zwinderman AH. Multiset sparse redundancy analysis for high-dimensional omics data. Biom J 2018; 61:406-423. [PMID: 30506971 PMCID: PMC6587877 DOI: 10.1002/bimj.201700248] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 11/14/2017] [Revised: 09/28/2018] [Accepted: 10/02/2018] [Indexed: 11/23/2022]
Abstract
Redundancy Analysis (RDA) is a well‐known method used to describe the directional relationship between related data sets. Recently, we proposed sparse Redundancy Analysis (sRDA) for high‐dimensional genomic data analysis to find explanatory variables that explain the most variance of the response variables. As more and more biomolecular data become available from different biological levels, such as genotypic and phenotypic data from different omics domains, a natural research direction is to apply an integrated analysis approach in order to explore the underlying biological mechanism of certain phenotypes of the given organism. We show that the multiset sparse Redundancy Analysis (multi‐sRDA) framework is a prominent candidate for high‐dimensional omics data analysis since it accounts for the directional information transfer between omics sets, and, through its sparse solutions, the interpretability of the result is improved. In this paper, we also describe a software implementation for multi‐sRDA, based on the Partial Least Squares Path Modeling algorithm. We test our method through simulation and real omics data analysis with data sets of 364,134 methylation markers, 18,424 gene expression markers, and 47 cytokine markers measured on 37 patients with Marfan syndrome.
Affiliation(s)
- Attila Csala
- Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, Amsterdam, The Netherlands
- Michel H Hof
- Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, Amsterdam, The Netherlands
- Aeilko H Zwinderman
- Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, Amsterdam, The Netherlands
8
Sakai K, Yamada K. Machine learning studies on major brain diseases: 5-year trends of 2014–2018. Jpn J Radiol 2018; 37:34-72. [DOI: 10.1007/s11604-018-0794-4] [Citation(s) in RCA: 73] [Impact Index Per Article: 10.4] [Received: 10/21/2018] [Accepted: 11/14/2018] [Indexed: 12/17/2022]
9
Rohart F, Gautier B, Singh A, Lê Cao KA. mixOmics: An R package for 'omics feature selection and multiple data integration. PLoS Comput Biol 2017; 13:e1005752. [PMID: 29099853 PMCID: PMC5687754 DOI: 10.1371/journal.pcbi.1005752] [Citation(s) in RCA: 2002] [Impact Index Per Article: 250.3] [Received: 05/05/2017] [Revised: 11/15/2017] [Accepted: 08/31/2017] [Indexed: 02/07/2023] [Open Access]
Abstract
The advent of high throughput technologies has led to a wealth of publicly available 'omics data coming from different sources, such as transcriptomics, proteomics, metabolomics. Combining such large-scale biological data sets can lead to the discovery of important biological insights, provided that relevant information can be extracted in a holistic manner. Current statistical approaches have been focusing on identifying small subsets of molecules (a 'molecular signature') to explain or predict biological conditions, but mainly for a single type of 'omics. In addition, commonly used methods are univariate and consider each biological feature independently. We introduce mixOmics, an R package dedicated to the multivariate analysis of biological data sets with a specific focus on data exploration, dimension reduction and visualisation. By adopting a systems biology approach, the toolkit provides a wide range of methods that statistically integrate several data sets at once to probe relationships between heterogeneous 'omics data sets. Our recent methods extend Projection to Latent Structure (PLS) models for discriminant analysis, for data integration across multiple 'omics data or across independent studies, and for the identification of molecular signatures. We illustrate our latest mixOmics integrative frameworks for the multivariate analyses of 'omics data available from the package.
Affiliation(s)
- Florian Rohart
- The University of Queensland Diamantina Institute, Translational Research Institute, Brisbane, Queensland, Australia
- Benoît Gautier
- The University of Queensland Diamantina Institute, Translational Research Institute, Brisbane, Queensland, Australia
- Amrit Singh
- Prevention of Organ Failure (PROOF) Centre of Excellence, Vancouver, British Columbia, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Kim-Anh Lê Cao
- The University of Queensland Diamantina Institute, Translational Research Institute, Brisbane, Queensland, Australia
- Melbourne Integrative Genomics and School of Mathematics and Statistics, University of Melbourne, Melbourne, Victoria, Australia