1
Gao M, Kong W, Liu K, Wen G, Yu Y, Zhu Y, Jiang Z, Wei K. Exploring Brain Imaging and Genetic Risk Factors in Different Progression States of Alzheimer's Disease Through OSnetNMF-Based Methods. J Mol Neurosci 2025; 75:7. [PMID: 39815147 DOI: 10.1007/s12031-024-02274-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2024] [Accepted: 09/29/2024] [Indexed: 01/18/2025]
Abstract
Alzheimer's disease (AD) is a neurodegenerative disease with no effective treatment, often preceded by mild cognitive impairment (MCI). Multimodal imaging genetics integrates imaging and genetic data to gain a deeper understanding of disease progression and individual variations. This study focuses on exploring the mechanisms that drive the transition from normal cognition to MCI and ultimately to AD. As an effective joint feature extraction and dimensionality reduction method, non-negative matrix factorization (NMF) and its improved variants, particularly the network-based non-negative matrix factorization (netNMF), have been widely used in multimodal analysis to mine brain imaging and genetic data by considering the interactions between different features. However, many of these methods overlook the importance of the coefficient matrix and do not address issues related to data accuracy and feature redundancy. To address these limitations, we propose an orthogonal sparse network non-negative matrix factorization (OSnetNMF) algorithm, which introduces orthogonal and sparse constraints based on netNMF. By establishing linear relationships between structural magnetic resonance imaging (sMRI) and corresponding gene expression data, OSnetNMF reduces feature redundancy and decreases correlation between data, resulting in more accurate and reliable biomarker extraction. Experiments demonstrate that the OSnetNMF algorithm can accurately identify risk regions of interest (ROIs) and key genes that characterize AD progression, revealing significant trends in ROI pairs such as l4thVen-HIF1A, rBst-MPO, and rBst-PTK2B. Comparative experiments show that the improved algorithm outperforms traditional methods, identifying more disease-related biomarkers and achieving better reconstruction performance.
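The factorization idea behind the abstract above can be illustrated with a minimal sketch. This is not the OSnetNMF algorithm: it is plain NMF with multiplicative updates and an L1 sparsity penalty on the coefficient matrix, and it omits the network and orthogonality terms entirely. The function name `sparse_nmf` and all parameters (`rank`, `l1`, `n_iter`) are hypothetical choices made for illustration only.

```python
import numpy as np

def sparse_nmf(X, rank, n_iter=200, l1=0.1, seed=0, eps=1e-10):
    """Illustrative sparse NMF (NOT OSnetNMF): X ~= W @ H with W, H >= 0.

    Minimizes 0.5 * ||X - W H||_F^2 + l1 * sum(H) with multiplicative
    updates. X must be a non-negative (n_samples, n_features) array.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Coefficient update; the l1 term in the denominator shrinks H
        # toward zero, which encourages sparse coefficients.
        H *= (W.T @ X) / (W.T @ W @ H + l1 + eps)
        # Basis update (standard Lee-Seung rule for the Frobenius loss).
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

In an imaging-genetics setting one would typically stack the imaging and gene-expression features column-wise into `X` so that both modalities share the basis `W`; that stacking is likewise an assumption of this sketch rather than a detail taken from the abstract.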
Affiliation(s)
- Min Gao
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave., Shanghai, 201306, P. R. China
- Wei Kong
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave., Shanghai, 201306, P. R. China
- Kun Liu
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave., Shanghai, 201306, P. R. China
- Gen Wen
  - Department of Orthopedic Surgery, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200233, China
- Yaling Yu
  - Department of Orthopedic Surgery, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200233, China
  - Institute of Microsurgery on Extremities, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200233, China
- Yuemin Zhu
  - CREATIS, University of Lyon, INSA Lyon, CNRS UMR 5220, Inserm U1294, Lyon, 69621, France
- Zhihan Jiang
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave., Shanghai, 201306, P. R. China
- Kai Wei
  - Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
2
Khalid MU, Nauman MM. A novel subject-wise dictionary learning approach using multi-subject fMRI spatial and temporal components. Sci Rep 2023; 13:20201. [PMID: 37980391 PMCID: PMC10657419 DOI: 10.1038/s41598-023-47420-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 11/13/2023] [Indexed: 11/20/2023] Open
Abstract
Conventional dictionary learning (DL) algorithms aim to adapt the dictionary/sparse code to individual functional magnetic resonance imaging (fMRI) data and thus lack the capability to consolidate the spatiotemporal diversities offered by other subjects. Considering that a subject-wise (sw) data matrix can be decomposed into a sparse linear combination of multi-subject (MS) time courses and MS spatial maps, two new algorithms, sw sequential DL (swsDL) and sw block DL (swbDL), are proposed. They are based on a novel framework, defined by the mixing model, in which base matrices prepared by operating a computationally fast sparse spatiotemporal blind source separation method over multiple subjects are employed to adapt the mixing matrices to sw training data. They solve the optimization models formulated using [Formula: see text]/[Formula: see text]-norm penalization/constraints through dictionary/sparse code pair updates and an alternating minimization approach. They are unique because no existing sparse DL method can incorporate MS spatiotemporal components while updating sw atoms/sparse codes, which can eventually be assembled using neuroscience knowledge to extract group-level dynamics. Various fMRI datasets are used to evaluate and compare the performance of the proposed algorithms with existing state-of-the-art algorithms. Overall, a [Formula: see text] increase in the mean correlation value and a [Formula: see text] reduction in the mean computation time were exhibited by swsDL and swbDL, respectively, over the adaptive consistent sequential dictionary algorithm.
Affiliation(s)
- Muhammad Usman Khalid
  - College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University, 11564, Riyadh, Saudi Arabia
- Malik Muhammad Nauman
  - Faculty of Integrated Technologies, Universiti Brunei Darussalam, Bandar Seri Begawan, BE1410, Brunei
3
Li N, Xu L, Li H, Liu Z, Mo H, Wu Y. UPLC-Q-Exactive Orbitrap-MS-Based Untargeted Lipidomic Analysis of Lipid Molecular Species in Spinal Cords from Different Domesticated Animals. Foods 2023; 12:3634. [PMID: 37835287 PMCID: PMC10572684 DOI: 10.3390/foods12193634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 09/25/2023] [Accepted: 09/28/2023] [Indexed: 10/15/2023] Open
Abstract
Lipids are crucial components for the maintenance of normal structure and function in the nervous system. Elucidating the diversity of lipids in spinal cords may contribute to our understanding of neurodevelopment. This study comprehensively analyzed the fatty acid (FA) compositions and lipidomes of the spinal cords of eight domesticated animal species: pig, cattle, yak, goat, horse, donkey, camel, and sika deer. Gas chromatography-mass spectrometry (GC-MS) analysis revealed that saturated fatty acids (SFAs) and monounsaturated fatty acids (MUFAs) were the primary FAs in the spinal cords of these domesticated animals, accounting for 72.54-94.23% of total FAs. Notably, oleic acid, stearic acid and palmitic acid emerged as the most abundant FA species. Moreover, untargeted lipidomics by UPLC-Q-Exactive Orbitrap-MS demonstrated that five lipid classes, including glycerophospholipids (GPs), sphingolipids (SPs), glycerolipids (GLs), FAs and saccharolipids (SLs), were identified in the investigated spinal cords, with phosphatidylcholine (PC) being the most abundant among all identified lipid classes. Furthermore, canonical correlation analysis showed that PC, PE, TAG, HexCer-NS and SM were significantly associated with genome sequence data. These informative data provide insight into the structure and function of mammalian nervous tissues and represent a novel contribution to lipidomics.
Affiliation(s)
- Na Li
  - College of Food Science and Engineering, Central South University of Forestry and Technology, Changsha 410004, China
  - School of Food Science and Engineering, Shaanxi University of Science and Technology, Xi’an 710021, China
- Long Xu
  - College of Food Science and Technology, Henan Agricultural University, Zhengzhou 450002, China
- Hongbo Li
  - School of Food Science and Engineering, Shaanxi University of Science and Technology, Xi’an 710021, China
- Zhenbin Liu
  - School of Food Science and Engineering, Shaanxi University of Science and Technology, Xi’an 710021, China
- Haizhen Mo
  - School of Food Science and Engineering, Shaanxi University of Science and Technology, Xi’an 710021, China
- Yue Wu
  - College of Food Science and Engineering, Central South University of Forestry and Technology, Changsha 410004, China
4
Mandal A, Maji P. Multiview Regularized Discriminant Canonical Correlation Analysis: Sequential Extraction of Relevant Features From Multiblock Data. IEEE Transactions on Cybernetics 2023; 53:5497-5509. [PMID: 35417362 DOI: 10.1109/tcyb.2022.3155875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
One of the important issues associated with real-life high-dimensional data analysis is how to extract significant and relevant features from multiview data. Multiset canonical correlation analysis (MCCA) is a well-known statistical method for multiview data integration. It finds a linear subspace that maximizes the correlations among different views. However, the existing methods for finding the multiset canonical variables are computationally very expensive, which restricts the application of MCCA in real-life big data analysis. The covariance matrix of each high-dimensional view may also suffer from the singularity problem due to the limited number of samples. Moreover, the existing MCCA-based feature extraction algorithms are, in general, unsupervised in nature. In this regard, a new supervised feature extraction algorithm is proposed, which integrates multimodal multidimensional data sets by solving the maximal correlation problem of MCCA. A new block matrix representation is introduced to reduce the computational complexity of computing the canonical variables of MCCA. The analytical formulation enables efficient computation of the multiset canonical variables under a supervised ridge regression optimization technique. It deals with the "curse of dimensionality" problem associated with high-dimensional data and facilitates the sequential generation of relevant features with significantly lower computational cost. The effectiveness of the proposed multiblock data integration algorithm, along with a comparison with other existing methods, is demonstrated on several benchmark and real-life cancer data sets.
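The role of ridge regularization in avoiding singular covariance matrices, mentioned in the abstract above, can be sketched for the simplest case of two views. This is a hedged illustration only: it is ordinary two-view regularized CCA solved through an SVD of the whitened cross-covariance, not the supervised multiset algorithm of the cited paper. The function name `ridge_cca` and the parameter `reg` are hypothetical.

```python
import numpy as np

def ridge_cca(X, Y, reg=1e-3, n_components=1):
    """Illustrative two-view ridge-regularized CCA (not the paper's method).

    X: (n_samples, p), Y: (n_samples, q). Returns weight matrices A (p, k),
    B (q, k) and the k leading regularized canonical correlations.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Adding reg * I keeps each covariance matrix positive definite even
    # when the number of features exceeds the number of samples.
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)
    # Cholesky factors: Cxx = Lx Lx^T, Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    # Whitened cross-covariance M = Lx^{-1} Cxy Ly^{-T}.
    M = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Map singular vectors back to the original feature spaces.
    A = np.linalg.solve(Lx.T, U[:, :n_components])
    B = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return A, B, s[:n_components]
```

The projections `X @ A` and `Y @ B` are the canonical variates; larger `reg` values trade some correlation for numerical stability.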
5
McKeague IW, Zhang X. Significance testing for canonical correlation analysis in high dimensions. Biometrika 2022; 109:1067-1083. [PMID: 36685139 PMCID: PMC9857302 DOI: 10.1093/biomet/asab059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023] Open
Abstract
We consider the problem of testing for the presence of linear relationships between large sets of random variables based on a post-selection inference approach to canonical correlation analysis. The challenge is to adjust for the selection of subsets of variables having linear combinations with maximal sample correlation. To this end, we construct a stabilized one-step estimator of the Euclidean norm of the canonical correlations maximized over subsets of variables of pre-specified cardinality. This estimator is shown to be consistent for its target parameter and asymptotically normal, provided the dimensions of the variables do not grow too quickly with sample size. We also develop a greedy search algorithm to accurately compute the estimator, leading to a computationally tractable omnibus test for the global null hypothesis that there are no linear relationships between any subsets of variables having the pre-specified cardinality. We further develop a confidence interval that takes the variable selection into account.
Affiliation(s)
- Ian W McKeague
  - Department of Biostatistics, Columbia University, Room R639, 722 West 168th Street, New York, NY 10032, USA
- Xin Zhang
  - Department of Statistics, Florida State University, 214 OSB, 117 N. Woodward Ave., Tallahassee, FL 32306, USA
6
Rekavandi AM, Seghouane AK, Evans RJ. Adaptive Brain Activity Detection in Structured Interference and Partially Homogeneous Locally Correlated Disturbance. IEEE Trans Biomed Eng 2022; 69:3064-3073. [PMID: 35320080 DOI: 10.1109/tbme.2022.3161292] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/10/2022]
Abstract
OBJECTIVE: In this paper, we aim to address the problem of subspace detection in the presence of locally correlated complex Gaussian noise and interference. For applications like brain activity detection using functional magnetic resonance imaging (fMRI) data where the noise is possibly locally correlated, using the sample covariance estimator is not a suitable choice due to significant dependency of its accuracy on the number of observations. METHODS: In this study, we take advantage of an assumed banded structure in the covariance matrix to model the local dependence in the noise and propose a new covariance estimation approach. In particular, we use the idea of factorizing the joint likelihood function into a few conditional likelihood terms and maximizing each term independently of the others. This process leads to an explicit estimator for banded covariance matrices which requires fewer observations to achieve the same accuracy as the sample covariance. This estimate is then fed into an adaptive matched filter, two-step Rao and two-step Wald tests for detection. RESULTS: Simulation results reveal the superiority of the proposed methods over well known classical detectors. Finally, the proposed methods are applied to functional magnetic resonance imaging (fMRI) data to localize neural activities in the brain. CONCLUSION: The proposed method can offer better activation maps in terms of accuracy and spatial smoothness. SIGNIFICANCE: The proposed methods can be seen as alternatives for standard detection approaches which are not perfectly aligned with the properties of fMRI data.
7
Wang W, Kong W, Wang S, Wei K. Detecting Biomarkers of Alzheimer's Disease Based on Multi-constrained Uncertainty-Aware Adaptive Sparse Multi-view Canonical Correlation Analysis. J Mol Neurosci 2022; 72:841-865. [PMID: 35080765 DOI: 10.1007/s12031-021-01963-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2021] [Accepted: 12/29/2021] [Indexed: 12/01/2022]
Abstract
Imaging genetics mainly explores the pathogenesis of Alzheimer's disease (AD) by studying the relationship between genetic data (such as SNPs, gene expression data, and DNA methylation) and imaging data (such as structural MRI (sMRI), fMRI, and PET). Most of the existing research on brain imaging genomics uses two-way or three-way bi-multivariate methods to explore the correlation between genes and brain imaging. However, many of these methods are still affected by gradient domination or cannot take into account the effect of feature redundancy on the results, so that the canonical correlation coefficient and program running speed are not significantly improved. To solve these problems, this paper proposes a multi-constrained uncertainty-aware adaptive sparse multi-view canonical correlation analysis method (MC-unAdaSMCCA) to explore associations among SNPs, gene expression data, and sMRI. Based on traditional unAdaSMCCA, orthogonal constraints are imposed on the weights of the three data features through linear programming, which reduces the redundancy of feature weights to improve the correlation between the data, and reduces the complexity of the algorithm to significantly speed up the program. Three adaptive sparse multi-view canonical correlation analysis methods are used as benchmarks for evaluation on real neuroimaging data and synthetic data. Compared with the other three methods, our proposed method obtained better or comparable canonical correlation coefficients and canonical weights. Moreover, the experimental results show that the MC-unAdaSMCCA method can not only identify biomarkers related to AD and mild cognitive impairment (MCI), but also has a strong ability to resist noise and process high-dimensional data. Therefore, our proposed method provides a reliable approach to multi-modal imaging genetics research.
Affiliation(s)
- Wenbo Wang
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave., Shanghai, 201306, People's Republic of China
- Wei Kong
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave., Shanghai, 201306, People's Republic of China
- Shuaiqun Wang
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave., Shanghai, 201306, People's Republic of China
- Kai Wei
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave., Shanghai, 201306, People's Republic of China
8
Seghouane AK, Qadar MA. Sparsity Preserved Canonical Correlation Analysis. 2020 IEEE International Conference on Image Processing (ICIP) 2020. [DOI: 10.1109/icip40778.2020.9191350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/01/2023]
9
Qadar MA, Aïssa-El-Bey A, Seghouane AK. Two dimensional CCA via penalized matrix decomposition for structure preserved fMRI data analysis. Digital Signal Processing 2019; 92:36-46. [DOI: 10.1016/j.dsp.2019.04.010] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 09/02/2023]
10
Seghouane AK, Shokouhi N. Estimating the Number of Significant Canonical Coordinates. IEEE Access 2019; 7:108806-108817. [DOI: 10.1109/access.2019.2933255] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 09/01/2023]