1
Interpretable deep learning methods for multiview learning. BMC Bioinformatics 2024; 25:69. [PMID: 38350879 DOI: 10.1186/s12859-024-05679-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Accepted: 01/29/2024] [Indexed: 02/15/2024] Open
Abstract
BACKGROUND Technological advances have enabled the generation of unique and complementary types of data or views (e.g. genomics, proteomics, metabolomics) and opened up a new era in multiview learning research with the potential to lead to new biomedical discoveries. RESULTS We propose iDeepViewLearn (Interpretable Deep Learning Method for Multiview Learning) to learn nonlinear relationships in data from multiple views while achieving feature selection. iDeepViewLearn combines deep learning flexibility with the statistical benefits of data and knowledge-driven feature selection, giving interpretable results. Deep neural networks are used to learn view-independent low-dimensional embedding through an optimization problem that minimizes the difference between observed and reconstructed data, while imposing a regularization penalty on the reconstructed data. The normalized Laplacian of a graph is used to model bilateral relationships between variables in each view, thereby encouraging the selection of related variables. iDeepViewLearn is tested on simulated data and three real-world datasets for classification, clustering, and reconstruction tasks. For the classification tasks, iDeepViewLearn had competitive classification results with state-of-the-art methods in various settings. For the clustering task, we detected molecular clusters that differed in their 10-year survival rates for breast cancer. For the reconstruction task, we were able to reconstruct handwritten images using a few pixels while achieving competitive classification accuracy. The results of our real data application and simulations with small to moderate sample sizes suggest that iDeepViewLearn may be a useful method for small-sample-size problems compared to other deep learning methods for multiview learning. CONCLUSION iDeepViewLearn is an innovative deep learning model capable of capturing nonlinear relationships between data from multiple views while achieving feature selection.
It is fully open source and is freely available at https://github.com/lasandrall/iDeepViewLearn.
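As a rough illustration of the normalized graph Laplacian mentioned in the abstract above, the following sketch builds L = I - D^{-1/2} A D^{-1/2} from an adjacency matrix and evaluates the quadratic form w^T L w. The function names, the toy graph, and the use of this quadratic form as a penalty are assumptions made for illustration; the abstract does not specify how iDeepViewLearn applies the Laplacian, and this is not the authors' implementation.

```python
import numpy as np


def normalized_laplacian(adjacency: np.ndarray) -> np.ndarray:
    """Return I - D^{-1/2} A D^{-1/2} for a symmetric, non-negative adjacency matrix."""
    a = np.asarray(adjacency, dtype=float)
    degree = a.sum(axis=1)
    # Isolated nodes (degree 0) get a zero scaling factor and a zero diagonal entry.
    nonzero = degree > 0
    inv_sqrt = np.zeros_like(degree)
    inv_sqrt[nonzero] = 1.0 / np.sqrt(degree[nonzero])
    return np.diag(nonzero.astype(float)) - inv_sqrt[:, None] * a * inv_sqrt[None, :]


def smoothness_penalty(weights: np.ndarray, laplacian: np.ndarray) -> float:
    """Quadratic form w^T L w; small when connected variables carry similar degree-scaled weights."""
    w = np.asarray(weights, dtype=float)
    return float(w @ laplacian @ w)


# Hypothetical graph over three variables connected as a path 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
L = normalized_laplacian(A)

# Weights proportional to sqrt(degree) lie in the null space of L, so the penalty vanishes.
smooth = np.sqrt(A.sum(axis=1))
# Weights that flip sign across each edge are penalized.
rough = np.array([1.0, -1.0, 1.0])
print(smoothness_penalty(smooth, L), smoothness_penalty(rough, L))
```

A penalty of this form favors weight vectors that vary little across edges, which is one common way a graph Laplacian can encourage connected variables to be selected together.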
2
Default Mode Network Hypoalignment of Function to Structure Correlates With Depression and Rumination. BIOLOGICAL PSYCHIATRY. COGNITIVE NEUROSCIENCE AND NEUROIMAGING 2024; 9:101-111. [PMID: 37468065 DOI: 10.1016/j.bpsc.2023.06.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Revised: 06/06/2023] [Accepted: 06/30/2023] [Indexed: 07/21/2023]
Abstract
BACKGROUND Recent studies have begun to examine how signals in the brain correspond to the underlying white matter structure using tools from the field of graph signal processing to quantify brain function alignment to brain network topology. Here, we applied this framework for the first time toward a transdiagnostic cohort of individuals with internalizing psychopathologies, including mood and anxiety disorders, to uncover how such alignment within the default mode network (DMN) is related to depression and rumination symptoms. METHODS Both diffusion-weighted and resting-state functional magnetic resonance imaging were obtained from participants at baseline (n = 60 patients, n = 19 healthy control participants). Patients were randomized to 12 weeks of treatment with either a selective serotonin reuptake inhibitor or cognitive behavioral therapy, and symptom scales were readministered posttreatment (n = 46 patients at follow-up). Using graph signal processing methodology, we quantified the alignment of functional signals to their underlying white matter structural networks. RESULTS We found that signal alignment within the posterior DMN was decreased in patients with internalizing psychopathologies compared with healthy control participants and was inversely (negatively) correlated with baseline depression and rumination scales. Signal alignment within the posterior DMN was also correlated with the ratio of total within-DMN to extra-DMN functional connectivity for these regions. CONCLUSIONS These findings are consistent with previous literature regarding pathological promiscuity of posterior DMN connectivity and provide the first graph signal processing-based analyses in a transdiagnostic cohort of patients with internalizing psychopathologies.
3
Mental Calculation Drives Reliable and Weak Distant Connectivity While Music Listening Induces Dense Local Connectivity. PHENOMICS (CHAM, SWITZERLAND) 2021; 1:285-298. [PMID: 36939768 PMCID: PMC9590531 DOI: 10.1007/s43657-021-00027-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2021] [Revised: 08/13/2021] [Accepted: 08/22/2021] [Indexed: 11/27/2022]
Abstract
Mathematical calculation usually requires sustained attention to manipulate numbers in the mind, while listening to light music has a relaxing effect on the brain. The differences in the corresponding brain functional network topologies underlying these behaviors remain poorly understood. Here, we systematically examined the brain dynamics of four behaviors (resting with eyes closed and eyes open, tasks of music listening and mental calculation) using 64-channel electroencephalogram (EEG) recordings and graph theory analysis. We developed static and dynamic minimum spanning tree (MST) analysis methods and demonstrated that the brain network topology under mental calculation is a more line-like structure with less tree hierarchy and leaf fraction; however, the hub regions, which are mainly located in the frontal, temporal and parietal regions, grow more stable over time. In contrast, music listening drives the brain to exhibit a highly rich network of star structure, and the hub regions are mainly located in the posterior regions. We then adopted the dynamic dissimilarity of different MSTs over time based on the graph Laplacian and revealed low dissimilarity during mental calculation. These results suggest that the human brain functional connectivity of individuals has unique dynamic diversity and flexibility under various behaviors. Supplementary Information: The online version contains supplementary material available at 10.1007/s43657-021-00027-w.
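The contrast between "line-like" and "star" MST topologies in the abstract above can be made concrete with a small sketch. It builds a minimum spanning tree with Kruskal's algorithm and computes a leaf fraction. The function names and toy matrices are hypothetical, the edge weights are assumed to be dissimilarities, and the leaf fraction is taken here as the number of degree-one nodes divided by N - 1 (one common convention); the abstract does not state the exact definitions the authors used.

```python
from itertools import combinations


def minimum_spanning_tree(dist):
    """Kruskal's algorithm on a dense symmetric dissimilarity matrix (list of lists)."""
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted((dist[i][j], i, j) for i, j in combinations(range(n), 2))
    tree = []
    for _, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two components, so it creates no cycle
            parent[root_i] = root_j
            tree.append((i, j))
            if len(tree) == n - 1:
                break
    return tree


def leaf_fraction(tree, n):
    """Number of degree-one nodes divided by N - 1 (assumed convention)."""
    degree = [0] * n
    for i, j in tree:
        degree[i] += 1
        degree[j] += 1
    return sum(1 for d in degree if d == 1) / (n - 1)


n = 5
# Star-like toy data: node 0 is close to every other node, all other pairs are far apart.
star = [[0.0 if i == j else (0.1 if 0 in (i, j) else 1.0) for j in range(n)] for i in range(n)]
# Line-like toy data: dissimilarity grows with index distance.
line = [[float(abs(i - j)) for j in range(n)] for i in range(n)]

print(leaf_fraction(minimum_spanning_tree(star), n))  # 1.0
print(leaf_fraction(minimum_spanning_tree(line), n))  # 0.5
```

Under this convention a star tree reaches the maximum leaf fraction of 1, while a path has only two leaves, which matches the abstract's description of line-like trees having a lower leaf fraction.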
4
Emergence of canonical functional networks from the structural connectome. Neuroimage 2021; 237:118190. [PMID: 34022382 PMCID: PMC8451304 DOI: 10.1016/j.neuroimage.2021.118190] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/23/2020] [Revised: 04/05/2021] [Accepted: 05/18/2021] [Indexed: 01/21/2023] Open
Abstract
How do functional brain networks emerge from the underlying wiring of the brain? We examine how resting-state functional activation patterns emerge from the underlying connectivity and length of white matter fibers that constitute its “structural connectome”. By introducing realistic signal transmission delays along fiber projections, we obtain a complex-valued graph Laplacian matrix that depends on two parameters: coupling strength and oscillation frequency. This complex Laplacian admits a complex-valued eigen-basis in the frequency domain that is highly tunable and capable of reproducing the spatial patterns of canonical functional networks without requiring any detailed neural activity modeling. Specific canonical functional networks can be predicted using linear superposition of small subsets of complex eigenmodes. Using a novel parameter inference procedure we show that the complex Laplacian outperforms the real-valued Laplacian in predicting functional networks. The complex Laplacian eigenmodes therefore constitute a tunable yet parsimonious substrate on which a rich repertoire of realistic functional patterns can emerge. Although brain activity is governed by highly complex nonlinear processes and dense connections, our work suggests that simple extensions of linear models to the complex domain effectively approximate rich macroscopic spatial patterns observable on BOLD fMRI.
5
Asymmetric high-order anatomical brain connectivity sculpts effective connectivity. Netw Neurosci 2020; 4:871-890. [PMID: 33615094 PMCID: PMC7888488 DOI: 10.1162/netn_a_00150] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/07/2020] [Accepted: 05/18/2020] [Indexed: 12/12/2022] Open
Abstract
Bridging the gap between symmetric, direct white matter brain connectivity and neural dynamics that are often asymmetric and polysynaptic may offer insights into brain architecture, but this remains an unresolved challenge in neuroscience. Here, we used the graph Laplacian matrix to simulate symmetric and asymmetric high-order diffusion processes akin to particles spreading through white matter pathways. The simulated indirect structural connectivity outperformed direct as well as absent anatomical information in sculpting effective connectivity, a measure of causal and directed brain dynamics. Crucially, an asymmetric diffusion process determined by the sensitivity of the network nodes to their afferents best predicted effective connectivity. The outcome is consistent with brain regions adapting to maintain their sensitivity to inputs within a dynamic range. Asymmetric network communication models offer a promising perspective for understanding the relationship between structural and functional brain connectomes, both in normalcy and neuropsychiatric conditions.
6
Spectral Embedding Norm: Looking Deep into the Spectrum of the Graph Laplacian. SIAM JOURNAL ON IMAGING SCIENCES 2020; 13:1015-1048. [PMID: 34136062 PMCID: PMC8204716 DOI: 10.1137/18m1283160] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 05/20/2023]
Abstract
The extraction of clusters from a dataset which includes multiple clusters and a significant background component is a non-trivial task of practical importance. In image analysis this manifests for example in anomaly detection and target detection. The traditional spectral clustering algorithm, which relies on the leading K eigenvectors to detect K clusters, fails in such cases. In this paper we propose the spectral embedding norm which sums the squared values of the first I normalized eigenvectors, where I can be significantly larger than K. We prove that this quantity can be used to separate clusters from the background in unbalanced settings, including extreme cases such as outlier detection. The performance of the algorithm is not sensitive to the choice of I, and we demonstrate its application on synthetic and real-world remote sensing and neuroimaging datasets.
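The quantity described in the abstract above, a per-node sum of squared entries over the first I eigenvectors, can be sketched as follows. The choice of the symmetric normalized Laplacian, the use of unit-norm eigenvectors from `numpy.linalg.eigh`, and the names `spectral_embedding_norm`, `affinity`, and `num_vectors` are assumptions for illustration; the paper's exact matrix and normalization may differ.

```python
import numpy as np


def spectral_embedding_norm(affinity: np.ndarray, num_vectors: int) -> np.ndarray:
    """Per-node sum of squared entries of the first `num_vectors` Laplacian eigenvectors."""
    w = np.asarray(affinity, dtype=float)
    degree = w.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degree)  # assumes every node has positive degree
    laplacian = np.eye(len(w)) - inv_sqrt[:, None] * w * inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order and orthonormal eigenvectors as columns.
    _, vectors = np.linalg.eigh(laplacian)
    return (vectors[:, :num_vectors] ** 2).sum(axis=1)


# Hypothetical affinity matrix: three tightly linked nodes plus one weakly attached node.
W = np.array([[0.0, 1.0, 1.0, 0.1],
              [1.0, 0.0, 1.0, 0.1],
              [1.0, 1.0, 0.0, 0.1],
              [0.1, 0.1, 0.1, 0.0]])
scores = spectral_embedding_norm(W, num_vectors=2)
print(scores)
```

Because the eigenvectors are orthonormal, the scores sum to `num_vectors` across nodes and each node's score approaches 1 as `num_vectors` approaches the number of nodes. Thresholding the scores to separate clusters from background is left out, since the abstract does not describe that step.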
7
NUMERICAL INTEGRATION ON GRAPHS: WHERE TO SAMPLE AND HOW TO WEIGH. MATHEMATICS OF COMPUTATION 2020; 89:1933-1952. [PMID: 33927452 PMCID: PMC8081285 DOI: 10.1090/mcom/3515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Let $G = (V, E, w)$ be a finite, connected graph with weighted edges. We are interested in the problem of finding a subset $W \subset V$ of vertices and weights $a_w$ such that $\frac{1}{|V|} \sum_{v \in V} f(v) \sim \sum_{w \in W} a_w f(w)$ for functions $f : V \to \mathbb{R}$ that are 'smooth' with respect to the geometry of the graph; here $\sim$ indicates that we want the right-hand side to be as close to the left-hand side as possible. The main applications are problems where $f$ is known to vary smoothly over the underlying graph but is expensive to evaluate on even a single vertex. We prove an inequality showing that the integration problem can be rewritten as a geometric problem ('the optimal packing of heat balls'). We discuss how one would construct approximate solutions of the heat ball packing problem; numerical examples demonstrate the efficiency of the method.
8
ADHD classification by dual subspace learning using resting-state functional connectivity. Artif Intell Med 2020; 103:101786. [PMID: 32143793 DOI: 10.1016/j.artmed.2019.101786] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 04/10/2019] [Revised: 12/11/2019] [Accepted: 12/30/2019] [Indexed: 11/19/2022]
Abstract
As one of the most common neurobehavioral diseases in school-age children, Attention Deficit Hyperactivity Disorder (ADHD) has been increasingly studied in recent years. However, accurately distinguishing ADHD patients from healthy persons remains a challenging problem. To address this issue, we propose a dual subspace classification algorithm that uses individual resting-state Functional Connectivity (FC). In detail, two subspaces containing ADHD and healthy control features respectively, called dual subspaces, are learned with several subspace measures, wherein a modified graph embedding measure is employed to enhance the intra-class relationship of these features. Therefore, given a subject (used as test data) with its FCs, the basic classification principle is to compare the projected component energy of its FCs on each subspace and then predict the ADHD or control label according to the subspace with larger energy. However, this principle works with low efficiency in practice, since the dual subspaces obtained from small ADHD databases are unstable. We therefore present an ADHD classification framework based on binary hypothesis testing of the test data. Here, the FCs of the test data, together with its hypothesized ADHD or control label, are employed in the discriminative FC selection of training data to promote the stability of the dual subspaces. For each hypothesis, the dual subspaces are learned from the selected FCs of training data. The total projected energy of these FCs is also calculated on the subspaces. Subsequently, the energy comparison is carried out under the binary hypotheses. The ADHD or control label is finally predicted for the test data according to the hypothesis with larger total energy. In experiments on the ADHD-200 dataset, our method achieves a significant classification performance compared with several state-of-the-art machine learning and deep learning methods, with an accuracy of about 90% for most of the ADHD databases in the leave-one-out cross-validation test.
9
PCA via joint graph Laplacian and sparse constraint: Identification of differentially expressed genes and sample clustering on gene expression data. BMC Bioinformatics 2019; 20:716. [PMID: 31888433 PMCID: PMC6936054 DOI: 10.1186/s12859-019-3229-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022] Open
Abstract
Background In recent years, identification of differentially expressed genes and sample clustering have become hot topics in bioinformatics. Principal Component Analysis (PCA) is a widely used method for gene expression data. However, it has two limitations. First, the geometric structure hidden in the data, e.g., the pair-wise distance between data points, has not been explored, although this information can facilitate sample clustering. Second, the Principal Components (PCs) determined by PCA are dense, which makes them hard to interpret. However, only a few genes are related to the cancer. It is of great significance for the early diagnosis and treatment of cancer to identify a handful of the differentially expressed genes and find new cancer biomarkers. Results In this study, a new method, gLSPCA, is proposed to integrate both the graph Laplacian and a sparse constraint into PCA. On the one hand, gLSPCA improves the clustering accuracy by exploring the internal geometric structure of the data; on the other hand, it identifies differentially expressed genes by imposing a sparsity constraint on the PCs. Conclusions Experiments with gLSPCA and comparisons with existing methods, including Z-SPCA, GPower, PathSPCA, SPCArt, and gLPCA, are performed on real datasets of both pancreatic cancer (PAAD) and head & neck squamous carcinoma (HNSC). The results demonstrate that gLSPCA is effective in identifying differentially expressed genes and in sample clustering. In addition, the applications of gLSPCA on these datasets provide several new clues for the exploration of causative factors of PAAD and HNSC.
10
Abstract
The average fitness difference between adjacent sites in a fitness landscape is an important descriptor that impacts in particular the dynamics of selection/mutation processes on the landscape. Of particular interest is its connection to the error threshold phenomenon. We show here that this parameter is intimately tied to the ruggedness through the landscape's amplitude spectrum. For the NK model, a surprisingly simple analytical estimate explains simulation data with high precision.
11
Abstract
We introduce a new, semi-supervised classification method that extensively exploits knowledge. The method has three steps. First, the manifold regularization mechanism, adapted from the Laplacian support vector machine (LapSVM), is adopted to mine the manifold structure embedded in all training data, especially in numerous label-unknown data. Meanwhile, by converting the labels into pairwise constraints, the pairwise constraint regularization formula (PCRF) is designed to compensate for the few but valuable labelled data. Second, by further combining the PCRF with the manifold regularization, the precise manifold and pairwise constraint jointly regularized formula (MPCJRF) is achieved. Third, by incorporating the MPCJRF into the framework of the conventional SVM, our approach, referred to as semi-supervised classification with extensive knowledge exploitation (SSC-EKE), is developed. The significance of our research is fourfold: 1) The MPCJRF is an underlying adjustment, with respect to the pairwise constraints, to the graph Laplacian enlisted for approximating the potential data manifold. This type of adjustment plays the correction role, as an unbiased estimation of the data manifold is difficult to obtain, whereas the pairwise constraints, converted from the given labels, have an overall high confidence level. 2) By transforming the values of the two terms in the MPCJRF such that they have the same range, with a trade-off factor varying within the invariant interval [0, 1), the appropriate impact of the pairwise constraints to the graph Laplacian can be self-adaptively determined. 3) The implication regarding extensive knowledge exploitation is embodied in SSC-EKE. That is, the labelled examples are used not only to control the empirical risk but also to constitute the MPCJRF. Moreover, all data, both labelled and unlabelled, are recruited for the model smoothness and manifold regularization. 
4) The complete framework of SSC-EKE organically incorporates multiple theories, such as joint manifold and pairwise constraint-based regularization, smoothness in the reproducing kernel Hilbert space, empirical risk minimization, and spectral methods, which supports both the classification accuracy and the generalizability of SSC-EKE.
12
Enhancing Diffusion MRI Measures By Integrating Grey and White Matter Morphometry With Hyperbolic Wasserstein Distance. PROCEEDINGS. IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING 2017; 2017:520-524. [PMID: 28936280 DOI: 10.1109/isbi.2017.7950574] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/08/2022]
Abstract
In order to improve the preclinical diagnosis of Alzheimer's disease (AD), there is a great deal of interest in analyzing AD-related brain structural changes with magnetic resonance imaging (MRI) analyses. As the major features, variation of the structural connectivity and the cortical surface morphometry provide different views of structural changes to determine whether AD is present in presymptomatic patients. However, the large-scale tensor-valued information and relatively low imaging resolution in diffusion MRI (dMRI) have created huge challenges for analysis. In this paper, we propose a novel framework that improves dMRI analysis power by fusing cortical surface morphometry features from structural MRI (sMRI). We first compute the hyperbolic harmonic maps between cortical surfaces with landmark constraints so as to precisely evaluate surface tensor-based morphometry. Meanwhile, the graph-based analysis of structural connectivity derived from dMRI is conducted. Next, we fuse these two features via optimal mass transportation (OMT), and eventually the Wasserstein distance (WD) based single image index is computed as a potential clinical multimodality imaging score. We apply our framework to brain images of 20 AD patients and 20 matched healthy controls, randomly chosen from the Alzheimer's Disease Neuroimaging Initiative (ADNI2) dataset. Our preliminary experimental results of group classification outperformed those of some other single dMRI-based features, such as regional hippocampal volume, mean scores of fractional anisotropy (FA) and mean axial (MD). The novel image fusion pipeline and simple imaging score of structural changes may benefit preclinical AD and AD prevention research.
13
Manifold learning and maximum likelihood estimation for hyperbolic network embedding. APPLIED NETWORK SCIENCE 2016; 1:10. [PMID: 30533502 PMCID: PMC6245200 DOI: 10.1007/s41109-016-0013-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 09/01/2016] [Accepted: 10/25/2016] [Indexed: 05/23/2023]
Abstract
The Popularity-Similarity (PS) model holds that clustering and hierarchy, properties common to most networks representing complex systems, are the result of an optimisation process in which nodes seek to form ties, not only with the most connected (popular) system components, but also with those that are similar to them. This model has a geometric interpretation in hyperbolic space, where distances between nodes abstract popularity-similarity trade-offs and the formation of scale-free and strongly clustered networks can be accurately described. Current methods for mapping networks to hyperbolic space are based on maximum likelihood estimations or manifold learning. The former approach is very accurate but slow; the latter improves efficiency at the cost of accuracy. Here, we analyse the strengths and limitations of both strategies and assess the advantages of combining them to efficiently embed big networks, allowing for their examination from a geometric perspective. Our evaluations in artificial and real networks support the idea that hyperbolic distance constraints play a significant role in the formation of edges between nodes. This means that challenging problems in network science, like link prediction or community detection, could be more easily addressed under this geometric framework.
14
Abstract
We introduce a general framework for estimation of inverse covariance, or precision, matrices from heterogeneous populations. The proposed framework uses a Laplacian shrinkage penalty to encourage similarity among estimates from disparate, but related, subpopulations, while allowing for differences among matrices. We propose an efficient alternating direction method of multipliers (ADMM) algorithm for parameter estimation, as well as its extension for faster computation in high dimensions by thresholding the empirical covariance matrix to identify the joint block diagonal structure in the estimated precision matrices. We establish both variable selection and norm consistency of the proposed estimator for distributions with exponential or polynomial tails. Further, to extend the applicability of the method to settings with unknown population structure, we propose a Laplacian penalty based on hierarchical clustering, and discuss conditions under which this data-driven choice results in consistent estimation of precision matrices in heterogeneous populations. Extensive numerical studies and applications to gene expression data from subtypes of cancer with distinct clinical outcomes indicate the potential advantages of the proposed method over existing approaches.
15
Semi-supervised clinical text classification with Laplacian SVMs: an application to cancer case management. J Biomed Inform 2013; 46:869-75. [PMID: 23845911 DOI: 10.1016/j.jbi.2013.06.014] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Received: 10/09/2012] [Revised: 05/30/2013] [Accepted: 06/28/2013] [Indexed: 11/26/2022]
Abstract
OBJECTIVE To compare linear and Laplacian SVMs on a clinical text classification task; to evaluate the effect of unlabeled training data on Laplacian SVM performance. BACKGROUND The development of machine-learning-based clinical text classifiers requires the creation of labeled training data, obtained via manual review by clinicians. Due to the effort and expense involved in labeling data, training data sets in the clinical domain are of limited size. In contrast, electronic medical record (EMR) systems contain hundreds of thousands of unlabeled notes that are not used by supervised machine learning approaches. Semi-supervised learning algorithms use both labeled and unlabeled data to train classifiers, and can outperform their supervised counterparts. METHODS We trained support vector machines (SVMs) and Laplacian SVMs on a training reference standard of 820 abdominal CT, MRI, and ultrasound reports labeled for the presence of potentially malignant liver lesions that require follow-up (positive class prevalence 77%). The Laplacian SVM used 19,845 randomly sampled unlabeled notes in addition to the training reference standard. We evaluated SVMs and Laplacian SVMs on a test set of 520 labeled reports. RESULTS The Laplacian SVM trained on labeled and unlabeled radiology reports significantly outperformed supervised SVMs (Macro-F1 0.773 vs. 0.741, Sensitivity 0.943 vs. 0.911, Positive Predictive Value 0.877 vs. 0.883). Performance improved with the number of labeled and unlabeled notes used to train the Laplacian SVM (Pearson's ρ=0.529 for correlation between number of unlabeled notes and macro-F1 score). These results suggest that practical semi-supervised methods such as the Laplacian SVM can leverage the large, unlabeled corpora that reside within EMRs to improve clinical text classification.