1
Labate D, Shi J. Low dimensional approximation and generalization of multivariate functions on smooth manifolds using deep ReLU neural networks. Neural Netw 2024; 174:106223. [PMID: 38458005 DOI: 10.1016/j.neunet.2024.106223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Revised: 02/29/2024] [Accepted: 02/29/2024] [Indexed: 03/10/2024]
Abstract
The expressive power of deep neural networks is manifested by their remarkable ability to approximate multivariate functions in a way that appears to overcome the curse of dimensionality. This ability is exemplified by their success in solving high-dimensional problems where traditional numerical solvers fail due to their limitations in accurately representing high-dimensional structures. To provide a theoretical framework for explaining this phenomenon, we analyze the approximation of Hölder functions defined on a d-dimensional smooth manifold M embedded in R^D, with d ≪ D, using deep neural networks. We prove that the uniform convergence estimates of the approximation and generalization errors by deep neural networks with ReLU activation functions do not depend on the ambient dimension D of the function but only on its lower manifold dimension d, in a precise sense. Our result improves existing results from the literature where approximation and generalization errors were shown to depend weakly on D.
Affiliation(s)
- Demetrio Labate
- Department of Applied Mathematics, University of Houston, 651 Phillip G Hoffman, Houston, 77204-3008, TX, USA.
- Ji Shi
- Department of Applied Mathematics, University of Houston, 651 Phillip G Hoffman, Houston, 77204-3008, TX, USA.
2
Yu W, Xu N, Huang N, Chen H. Bridging the gap: Geometry-centric discriminative manifold distribution alignment for enhanced classification in colorectal cancer imaging. Comput Biol Med 2024; 170:107998. [PMID: 38266468 DOI: 10.1016/j.compbiomed.2024.107998] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 12/19/2023] [Accepted: 01/13/2024] [Indexed: 01/26/2024]
Abstract
The early detection of colorectal cancer (CRC) through medical image analysis is a pivotal concern in healthcare, with the potential to significantly reduce mortality rates. Current Domain Adaptation (DA) methods strive to mitigate the discrepancies between different imaging modalities that are critical in identifying CRC, yet they often fall short in addressing the complexity of cancer's presentation within these images. These conventional techniques typically overlook the intricate geometrical structures and the local variations within the data, leading to suboptimal diagnostic performance. This study introduces an innovative application of the Discriminative Manifold Distribution Alignment (DMDA) method, which is specifically engineered to enhance the medical image diagnosis of colorectal cancer. DMDA transcends traditional DA approaches by focusing on both local and global distribution alignments and by intricately learning the intrinsic geometrical characteristics present in manifold space. This is achieved without depending on the potentially misleading pseudo-labels, a common pitfall in existing methodologies. Our implementation of DMDA on three distinct datasets, involving several unique DA tasks, has consistently demonstrated superior classification accuracy and computational efficiency. The method adeptly captures the complex morphological and textural nuances of CRC lesions, leading to a significant leap in domain adaptation technology. DMDA's ability to reconcile global and local distributional disparities, coupled with its manifold-based geometrical structure learning, signals a paradigm shift in medical imaging analysis. The results obtained are not only promising in terms of advancing domain adaptation theory but also in their practical implications, offering the prospect of substantially improved diagnostic accuracy and faster clinical workflows. 
This heralds a transformative approach in personalized oncology care, aligning with the pressing need for early and accurate CRC detection.
Affiliation(s)
- Weiwei Yu
- Department of Gastroenterology, Wenzhou TCM Hospital of Zhejiang Chinese Medical University, Wenzhou, Zhejiang, 325000, China.
- Nuo Xu
- Department of Medical Oncology, Wenzhou TCM Hospital of Zhejiang Chinese Medical University, Wenzhou, Zhejiang, 325000, China.
- Nuanhui Huang
- Department of Medical Oncology, Wenzhou TCM Hospital of Zhejiang Chinese Medical University, Wenzhou, Zhejiang, 325000, China.
- Houliang Chen
- Department of Medical Oncology, Wenzhou TCM Hospital of Zhejiang Chinese Medical University, Wenzhou, Zhejiang, 325000, China.
3
Boccato T, Ferrante M, Duggento A, Toschi N. Beyond multilayer perceptrons: Investigating complex topologies in neural networks. Neural Netw 2024; 171:215-228. [PMID: 38096650 DOI: 10.1016/j.neunet.2023.12.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Revised: 12/07/2023] [Accepted: 12/07/2023] [Indexed: 01/29/2024]
Abstract
This study delves into the crucial aspect of network topology in artificial neural networks (NNs) and its impact on model performance. Addressing the need to comprehend how network structures influence learning capabilities, the research contrasts traditional multilayer perceptrons (MLPs) with models built on various complex topologies using novel network generation techniques. Drawing insights from synthetic datasets, the study reveals the remarkable accuracy of complex NNs, particularly in high-difficulty scenarios, outperforming MLPs. Our exploration extends to real-world datasets, highlighting the task-specific nature of optimal network topologies and unveiling trade-offs, including increased computational demands and reduced robustness to graph damage in complex NNs compared to MLPs. This research underscores the pivotal role of complex topologies in addressing challenging learning tasks. However, it also signals the necessity for deeper insights into the complex interplay among topological attributes influencing NN performance. By shedding light on the advantages and limitations of complex topologies, this study provides valuable guidance for practitioners and paves the way for future endeavors to design more efficient and adaptable neural architectures across various applications.
Affiliation(s)
- Tommaso Boccato
- Department of Biomedicine and Prevention, University of Rome Tor Vergata, Rome, Italy.
- Matteo Ferrante
- Department of Biomedicine and Prevention, University of Rome Tor Vergata, Rome, Italy.
- Andrea Duggento
- Department of Biomedicine and Prevention, University of Rome Tor Vergata, Rome, Italy.
- Nicola Toschi
- Department of Biomedicine and Prevention, University of Rome Tor Vergata, Rome, Italy; A.A. Martinos Center for Biomedical Imaging and Harvard Medical School, Boston, USA.
4
Priyanka EB, Vivek S, Thangavel S, Sampathkumar V, Al-Zaqri N, Warad I. Forecasting and meta-features estimation of wastewater and climate change impacts in coastal region using manifold learning. Environ Res 2024; 240:117355. [PMID: 37863164 DOI: 10.1016/j.envres.2023.117355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2023] [Revised: 08/31/2023] [Accepted: 10/07/2023] [Indexed: 10/22/2023]
Abstract
South Asia's coastlines, the most densely inhabited and economically active ecosystems, have already begun to shift due to climate change. Over the past century, climate change has contributed to a gradual and considerable rise in sea level, which has eroded shorelines and increased storm-related coastal flooding. The differences in estuary water quality over time, both seasonally and annually, have been efficiently controlled by changes in stream flow. Assessment requires digitized analytical platforms to lower the risk of catastrophes associated with climate change in coastal towns. To predict future changes in an area's vulnerability and waste planning decisions, a prospective investigation requires qualitative and quantitative scenarios. The paper concentrates on the development of a forecasting platform to evaluate the impacts of climate change and wastewater on the south coastal region of India. Owing to advances in digitization, a multi-model ensemble combined with manifold learning is implemented on the multi-case models; the resulting uncertainty probability rate of 23% can be ignored with the desired precautions for the coastal environment. Because Manifold Learning Analysis results have inherent biases, they cannot be utilized directly in wastewater management studies, so a statistical bias correction and meta-feature estimation have been implemented. Within the climate-hydrology modeling chain, the results demonstrate a wide range of expected changes in water resources in some places. Experimental statistics reveal that the forecasted rate of 91.45% will be the better choice to reduce the uncertainty of climate change and wastewater management.
Affiliation(s)
- E B Priyanka
- Department of Mechatronics Engineering, Kongu Engineering College, Perundurai, 638060, India.
- S Vivek
- Department of Civil Engineering, GMR Institute of Technology, Razam, Andra Pradesh, 532127, India.
- S Thangavel
- Department of Mechatronics Engineering, Kongu Engineering College, Perundurai, 638060, India.
- V Sampathkumar
- Department of Civil Engineering, Kongu Engineering College, Perundurai, 638060, India.
- Nabil Al-Zaqri
- Department of Chemistry, College of Science, King Saud University, P.O. Box 2455, Riyadh 11451, Saudi Arabia.
- Ismail Warad
- Department of Chemistry, AN-Najah National University, P.O. Box 7, Nablus, Palestine; Research Centre, Manchester Salt & Catalysis, Unit C, 88- 90 Chorlton Rd, M15 4AN Manchester, United Kingdom.
5
Wang SC, Ting CK, Chen CY, Liu C, Lin NC, Loong CC, Wu HT, Lin YT. Arterial blood pressure waveform in liver transplant surgery possesses variability of morphology reflecting recipients' acuity and predicting short term outcomes. J Clin Monit Comput 2023; 37:1521-1531. [PMID: 37436598 DOI: 10.1007/s10877-023-01047-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Accepted: 06/13/2023] [Indexed: 07/13/2023]
Abstract
We investigated clinical information underneath the beat-to-beat fluctuation of the arterial blood pressure (ABP) waveform morphology. We proposed the Dynamical Diffusion Map algorithm (DDMap) to quantify the variability of morphology. The underlying physiology could be the compensatory mechanisms involving complex interactions between various physiological mechanisms to regulate the cardiovascular system. As a liver transplant surgery contains distinct periods, we investigated its clinical behavior in different surgical steps. Our study used the DDMap algorithm, based on unsupervised manifold learning, to obtain a quantitative index for the beat-to-beat variability of morphology. We examined the correlation between the variability of ABP morphology and disease acuity as indicated by Model for End-Stage Liver Disease (MELD) scores, the postoperative laboratory data, and 4 early allograft failure (EAF) scores. Among the 85 enrolled patients, the variability of morphology obtained during the presurgical phase was best correlated with MELD-Na scores. The neohepatic phase variability of morphology was associated with EAF scores as well as postoperative bilirubin levels, international normalized ratio, aspartate aminotransferase levels, and platelet count. Furthermore, variability of morphology presents more associations with the above clinical conditions than the common BP measures and their BP variability indices. The variability of morphology obtained during the presurgical phase is indicative of patient acuity, whereas that obtained during the neohepatic phase is indicative of short-term surgical outcomes.
Affiliation(s)
- Shen-Chih Wang
- Department of Anesthesiology, Taipei Veterans General Hospital, Taipei, Taiwan
- School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Chien-Kun Ting
- Department of Anesthesiology, Taipei Veterans General Hospital, Taipei, Taiwan
- School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Cheng-Yen Chen
- School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Division of Transplantation Surgery, Taipei Veterans General Hospital, Taipei, Taiwan
- Chinsu Liu
- School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Division of Transplantation Surgery, Taipei Veterans General Hospital, Taipei, Taiwan
- Niang-Cheng Lin
- School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Division of Transplantation Surgery, Taipei Veterans General Hospital, Taipei, Taiwan
- Che-Chuan Loong
- School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Division of Transplantation Surgery, Taipei Veterans General Hospital, Taipei, Taiwan
- Hau-Tieng Wu
- Department of Mathematics, Duke University, Durham, NC, USA.
- Department of Statistical Science, Duke University, Durham, NC, USA.
- Yu-Ting Lin
- Department of Anesthesiology, Taipei Veterans General Hospital, Taipei, Taiwan.
- School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan.
6
Du Y, Sui J, Wang S, Fu R, Jia C. Motor intent recognition of multi-feature fusion EEG signals by UMAP algorithm. Med Biol Eng Comput 2023; 61:2665-2676. [PMID: 37421553 DOI: 10.1007/s11517-023-02878-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2023] [Accepted: 06/25/2023] [Indexed: 07/10/2023]
Abstract
The key to the analysis of electroencephalogram (EEG) signals lies in the extraction of effective features from the raw EEG signals, which can then be utilized to augment the classification accuracy of motor imagery (MI) applications in brain-computer interface (BCI). It can be argued that the utilization of features from multiple domains can be a more effective approach to feature extraction for MI pattern classification, as it can provide a more comprehensive set of information that the traditional single feature extraction method may not be able to capture. In this paper, a multi-feature fusion algorithm based on uniform manifold approximation and projection (UMAP) is proposed for motor imagery EEG signals. The brain functional network and common spatial pattern (CSP) are initially extracted as features. Subsequently, UMAP is utilized to fuse the extracted multi-domain features to generate low-dimensional features with improved discriminative capability. Finally, the k-nearest neighbor (KNN) classifier is applied in the lower-dimensional space. The proposed method is evaluated using left-right hand EEG signals and achieves an average accuracy of over 92%. The results indicate that, compared with single-domain-based feature extraction methods, multi-feature fusion EEG signal classification based on the UMAP algorithm yields superior classification and visualization performance. Feature extraction and fusion based on UMAP algorithm of left-right hand motor imagery.
Affiliation(s)
- Yushan Du
- Measurement Technology and Instrumentation Key Lab of Hebei Province, Yanshan University, Qinhuangdao, 066004, China
- Jiaxin Sui
- Measurement Technology and Instrumentation Key Lab of Hebei Province, Yanshan University, Qinhuangdao, 066004, China
- Shiwei Wang
- Jiangxi New Energy Technology Institute, Xinyu, 338000, China
- Rongrong Fu
- Measurement Technology and Instrumentation Key Lab of Hebei Province, Yanshan University, Qinhuangdao, 066004, China.
- Chengcheng Jia
- Department of Electrical, Computer & Biomedical Engineering, Ryerson University, Toronto, Canada
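The common spatial pattern (CSP) step named in the abstract of this entry can be sketched with NumPy alone. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `csp_filters` and `log_var_features`, the synthetic two-class trials, and the log-variance features are illustrative choices, and the brain functional network features and the UMAP fusion step are not reproduced here.

```python
import numpy as np

def csp_filters(trials_a, trials_b):
    """Return CSP spatial filters (as columns) and their eigenvalues.

    trials_*: arrays of shape (n_trials, n_channels, n_samples).
    """
    def mean_cov(trials):
        # Trace-normalized spatial covariance, averaged over trials.
        covs = [x @ x.T / np.trace(x @ x.T) for x in trials]
        return np.mean(covs, axis=0)

    cov_a, cov_b = mean_cov(trials_a), mean_cov(trials_b)
    # Whiten the composite covariance cov_a + cov_b.
    evals, evecs = np.linalg.eigh(cov_a + cov_b)
    whitener = evecs @ np.diag(evals ** -0.5)
    # Diagonalize the whitened class-A covariance.
    lam, rot = np.linalg.eigh(whitener.T @ cov_a @ whitener)
    order = np.argsort(lam)[::-1]
    return whitener @ rot[:, order], lam[order]

def log_var_features(trials, filters, n_pairs=1):
    # Keep the first and last n_pairs filters (the most discriminative ones).
    w = np.hstack([filters[:, :n_pairs], filters[:, -n_pairs:]])
    projected = np.einsum("cf,ncs->nfs", w, trials)
    var = projected.var(axis=2)
    return np.log(var / var.sum(axis=1, keepdims=True))

# Hypothetical data: class A is strong on channel 0, class B on channel 3.
rng = np.random.default_rng(0)
scale_a = np.array([3.0, 1.0, 1.0, 1.0])[:, None]
scale_b = np.array([1.0, 1.0, 1.0, 3.0])[:, None]
trials_a = rng.standard_normal((20, 4, 200)) * scale_a
trials_b = rng.standard_normal((20, 4, 200)) * scale_b

filters, eigvals = csp_filters(trials_a, trials_b)
feat_a = log_var_features(trials_a, filters)
feat_b = log_var_features(trials_b, filters)
```

The resulting feature vectors are the kind of input that could then be concatenated with other feature domains before a dimensionality reduction and a k-nearest neighbor step.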
7
Zou Y, Tang W, Li B. Spatial segmentation of mass spectrometry imaging data featuring selected principal components. Talanta 2023; 253:123958. [PMID: 36179560 DOI: 10.1016/j.talanta.2022.123958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2022] [Revised: 09/16/2022] [Accepted: 09/19/2022] [Indexed: 12/13/2022]
Abstract
Spatial segmentation aims to find homogeneous/heterogeneous subgroups of spectra or ion images in mass spectrometry imaging (MSI) data. The maps it generates inform researchers of vital characteristics of the data and thus provide the basis for strategizing further biological analysis. Dimensional reduction and clustering are two basic steps of segmentation. Due to variations in quality, resolution, density of spectral information, and size, not all datasets can be segmented ideally with combinations of different dimensional reduction and clustering algorithms. Here, we propose a segmentation pipeline that uses pattern compression by principal component analysis (PCA) and represents the data by principal components. Instead of preprocessed or raw MSI data, normalized principal components were used for the segmentation process. Multiple datasets of rat brains and mouse kidneys were tested, and the proposed segmentation pipeline showed the clear advantage of ease of use and can be readily integrated with other existing innovative pipelines.
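The two steps named in this abstract, dimensional reduction by PCA followed by clustering on normalized principal components, can be sketched as follows. This is a minimal sketch, not the authors' pipeline: the abstract does not state the normalization or the clustering algorithm, so the z-scoring of component scores, the plain k-means with farthest-point initialization, and the synthetic three-region data are all assumptions.

```python
import numpy as np

def pca_scores(spectra, n_components):
    """Project mean-centered spectra (pixels x m/z bins) onto the leading PCs."""
    centered = spectra - spectra.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def normalize_scores(scores):
    # Assumption: scale each principal component to zero mean and unit variance.
    return (scores - scores.mean(axis=0)) / scores.std(axis=0)

def kmeans(points, k, n_iter=50):
    """Lloyd's k-means with a deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
    return labels

# Hypothetical image: three tissue regions, each with its own block of peaks.
rng = np.random.default_rng(1)
spectra = rng.normal(0.0, 0.1, size=(100, 30))
spectra[:40, 0:5] += 2.0      # region A, pixels 0-39
spectra[40:70, 10:15] += 2.0  # region B, pixels 40-69
spectra[70:, 20:25] += 2.0    # region C, pixels 70-99

labels = kmeans(normalize_scores(pca_scores(spectra, 2)), k=3)
```

Reshaping `labels` to the image grid would give the segmentation map; in real data the number of components and clusters would have to be chosen for each dataset.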
8
Mitchell-Heggs R, Prado S, Gava GP, Go MA, Schultz SR. Neural manifold analysis of brain circuit dynamics in health and disease. J Comput Neurosci 2023; 51:1-21. [PMID: 36522604 DOI: 10.1007/s10827-022-00839-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/21/2022] [Revised: 08/30/2022] [Accepted: 10/29/2022] [Indexed: 12/23/2022]
Abstract
Recent developments in experimental neuroscience make it possible to simultaneously record the activity of thousands of neurons. However, the development of analysis approaches for such large-scale neural recordings has been slower than for those applicable to single-cell experiments. One approach that has gained recent popularity is neural manifold learning. This approach takes advantage of the fact that often, even though neural datasets may be very high dimensional, the dynamics of neural activity tend to traverse a much lower-dimensional space. The topological structures formed by these low-dimensional neural subspaces are referred to as "neural manifolds", and may potentially provide insight linking neural circuit dynamics with cognitive function and behavioral performance. In this paper we review a number of linear and non-linear approaches to neural manifold learning, including principal component analysis (PCA), multi-dimensional scaling (MDS), Isomap, locally linear embedding (LLE), Laplacian eigenmaps (LEM), t-SNE, and uniform manifold approximation and projection (UMAP). We outline these methods under a common mathematical nomenclature, and compare their advantages and disadvantages with respect to their use for neural data analysis. We apply them to a number of datasets from published literature, comparing the manifolds that result from their application to hippocampal place cells, motor cortical neurons during a reaching task, and prefrontal cortical neurons during a multi-behavior task. We find that in many circumstances linear algorithms produce similar results to non-linear methods, although in particular cases where the behavioral complexity is greater, non-linear methods tend to find lower-dimensional manifolds, at the possible expense of interpretability.
We demonstrate that these methods are applicable to the study of neurological disorders through simulation of a mouse model of Alzheimer's Disease, and speculate that neural manifold analysis may help us to understand the circuit-level consequences of molecular and cellular neuropathology.
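Of the methods listed in this abstract, classical multi-dimensional scaling (MDS) is compact enough to sketch in a few lines. The example below is an illustration under stated assumptions, not code from the review: the function name `classical_mds`, the two-dimensional latent signal, and the random "neuron" loadings are hypothetical. It shows the situation the abstract describes, where activity recorded in many dimensions actually occupies a much lower-dimensional space.

```python
import numpy as np

def classical_mds(dist, n_components):
    """Embed points from a pairwise distance matrix (classical/Torgerson MDS)."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering  # double centering
    evals, evecs = np.linalg.eigh(gram)
    top = np.argsort(evals)[::-1][:n_components]
    return evecs[:, top] * np.sqrt(np.maximum(evals[top], 0.0))

rng = np.random.default_rng(0)
latent = rng.standard_normal((50, 2))   # hypothetical low-dimensional dynamics
mixing = rng.standard_normal((2, 20))   # hypothetical loadings onto 20 "neurons"
activity = latent @ mixing              # 50 time points x 20 "neurons"

dist = np.linalg.norm(activity[:, None] - activity[None, :], axis=2)
embedding = classical_mds(dist, 2)
embedded_dist = np.linalg.norm(embedding[:, None] - embedding[None, :], axis=2)
```

Because the synthetic activity lies in a two-dimensional linear subspace, the two-dimensional embedding preserves all pairwise distances; with curved manifolds a linear method like this would need more dimensions, which is where the non-linear methods in the list come in.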
9
Roger E, Attyé A, Renard F, Baciu M. Leveraging manifold learning techniques to explore white matter anomalies: An application of the TractLearn pipeline in epilepsy. Neuroimage Clin 2022; 36:103209. [PMID: 36162235 DOI: 10.1016/j.nicl.2022.103209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2022] [Revised: 09/15/2022] [Accepted: 09/21/2022] [Indexed: 12/14/2022]
Abstract
An accurate description of brain white matter anatomy in vivo remains a challenge. However, technical progress allows us to analyze structural variations in an increasingly sophisticated way. Current methods of processing diffusion MRI data now make it possible to correct some limiting biases. In addition, the development of statistical learning algorithms offers the opportunity to analyze the data from a new perspective. We applied newly developed tractography models to extract quantitative white matter parameters in a group of patients with chronic temporal lobe epilepsy. Furthermore, we implemented a statistical learning workflow optimized for the MRI diffusion data - the TractLearn pipeline - to model inter-individual variability and predict structural changes in patients. Finally, we interpreted white matter abnormalities in the context of several other parameters reflecting clinical status, as well as neuronal and cognitive functioning for these patients. Overall, we show the relevance of such a diffusion data processing pipeline for the evaluation of clinical populations. The "global to fine scale" funnel statistical approach proposed in this study also contributes to the understanding of neuroplasticity mechanisms involved in refractory epilepsy, thus enriching previous findings.
10
Ge H, Zhu Z, Dai Y, Liu R. Super-resolution reconstruction of biometric features recognition based on manifold learning and deep residual network. Comput Methods Programs Biomed 2022; 221:106822. [PMID: 35667333 DOI: 10.1016/j.cmpb.2022.106822] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/04/2022] [Revised: 04/10/2022] [Accepted: 04/17/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND AND OBJECTIVE: In daily life, face information has the characteristics of uniqueness and universality. However, in a real-world scene, the image information of the face acquired by the acquisition device often contains noise such as blurring and sharpening. As such, super-resolution reconstruction for face feature recognition based on manifold learning is proposed in this paper. METHODS: We reconstruct low-resolution facial expression images, introduce a simplified residual block network and manifold learning, and propose joint supervision through a new hybrid loss function, which not only retains the color and characteristics of the image, but also retains the high-frequency information. The ResNet50 network uses the weight feature of information entropy to optimize the information of the pooling layer, and the ResNet50 network uses the improved PSO algorithm to optimize the initial weight of the error back-propagation phase. RESULTS: In the case of inputting extremely low resolution (6 × 6) facial expression images, the accuracy rate is increased by 9.091%. The accuracy of the high-resolution facial expressions after reconstruction with a size of 12 × 12 is 96.970%. The accuracy rate for happy expressions is 100%, the accuracy rate for anger, disgust, sadness, and surprise recognition is 97%, the accuracy rate for contempt is 94%, and the accuracy rate for fear is 88%. CONCLUSIONS: The experimental results verify the feasibility and superiority of the system, which effectively improves the recognition accuracy of low-resolution facial expressions.
Affiliation(s)
- Huilin Ge
- School of Electronic Information, Jiangsu University of Science and Technology, Zhenjiang 212003, China.
- Zhiyu Zhu
- School of Electronic Information, Jiangsu University of Science and Technology, Zhenjiang 212003, China.
- Yuewei Dai
- School of Electronic Information, Jiangsu University of Science and Technology, Zhenjiang 212003, China.
- Runbang Liu
- School of Electronic Information, Jiangsu University of Science and Technology, Zhenjiang 212003, China.
11
Ben-Kiki O, Bercovich A, Lifshitz A, Tanay A. Metacell-2: a divide-and-conquer metacell algorithm for scalable scRNA-seq analysis. Genome Biol 2022; 23:100. [PMID: 35440087 PMCID: PMC9019975 DOI: 10.1186/s13059-022-02667-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 08/16/2021] [Accepted: 04/06/2022] [Indexed: 11/10/2022] Open
Abstract
Scaling scRNA-seq to profile millions of cells is crucial for constructing high-resolution maps of transcriptional manifolds. Current analysis strategies, in particular dimensionality reduction and two-phase clustering, offer only limited scaling and sensitivity to define such manifolds. We introduce Metacell-2, a recursive divide-and-conquer algorithm allowing efficient decomposition of scRNA-seq datasets of any size into small and cohesive groups of cells called metacells. Metacell-2 improves outlier cell detection and rare cell type identification, as shown with human bone marrow cell atlas and mouse embryonic data. Metacell-2 is implemented over the scanpy framework for easy integration in any analysis pipeline.
Affiliation(s)
- Oren Ben-Kiki
- Department of Computer Science and Applied Mathematics, and Department of Immunology and Reproductive Biology, Weizmann Institute of Science, Rehovot, Israel
- Akhiad Bercovich
- Department of Computer Science and Applied Mathematics, and Department of Immunology and Reproductive Biology, Weizmann Institute of Science, Rehovot, Israel
- Aviezer Lifshitz
- Department of Computer Science and Applied Mathematics, and Department of Immunology and Reproductive Biology, Weizmann Institute of Science, Rehovot, Israel
- Amos Tanay
- Department of Computer Science and Applied Mathematics, and Department of Immunology and Reproductive Biology, Weizmann Institute of Science, Rehovot, Israel.
12
Liu Y, Hu Z, Zhang Y. Symmetric positive definite manifold learning and its application in fault diagnosis. Neural Netw 2021; 147:163-74. [PMID: 35038622 DOI: 10.1016/j.neunet.2021.12.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2020] [Revised: 10/07/2021] [Accepted: 12/20/2021] [Indexed: 11/24/2022]
Abstract
Locally linear embedding (LLE) is an effective tool to extract the significant features from a dataset. However, most of the relevant existing algorithms assume that the original dataset resides in a Euclidean space, whereas nearly all original data spaces are non-Euclidean. In addition, the original LLE does not use the discriminant information of the dataset, which degrades its performance in feature extraction. To address these problems in the conventional LLE, we first employ the original dataset to construct a symmetric positive definite manifold, and then estimate the tangent space of this manifold. Furthermore, the local and global discriminant information are integrated into the LLE, and the improved LLE is operated in the tangent space to extract the important features. We use the Iris dataset to analyze the capability of the proposed method to extract features. Finally, several experiments are performed on five machinery datasets, and experimental results indicate that our proposed method can extract excellent low-dimensional representations of the original dataset. Compared with the state-of-the-art methods, the proposed algorithm shows a strong capability for fault diagnosis.
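The idea of building symmetric positive definite (SPD) matrices from data and then working in a tangent space can be illustrated with a common construction. This is a hedged sketch, not the method of the paper: it assumes covariance matrices as the SPD representation, the arithmetic mean as the reference point, and the affine-invariant logarithm map for the projection; the names `spd_log` and `tangent_vector` and the random signals are hypothetical, and the discriminant LLE step is not reproduced.

```python
import numpy as np

def spd_log(mat):
    """Matrix logarithm of a symmetric positive definite matrix."""
    evals, evecs = np.linalg.eigh(mat)
    return (evecs * np.log(evals)) @ evecs.T

def tangent_vector(cov, ref):
    """Map `cov` to the tangent space at `ref` and vectorize the upper triangle."""
    evals, evecs = np.linalg.eigh(ref)
    ref_inv_sqrt = (evecs * evals ** -0.5) @ evecs.T
    log_mat = spd_log(ref_inv_sqrt @ cov @ ref_inv_sqrt)
    rows, cols = np.triu_indices_from(log_mat)
    # Off-diagonal entries are weighted so the vector norm equals the Frobenius norm.
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return weights * log_mat[rows, cols]

rng = np.random.default_rng(0)
signals = rng.standard_normal((5, 3, 100))  # 5 hypothetical samples, 3 channels
covs = np.array([s @ s.T / s.shape[1] for s in signals])
ref = covs.mean(axis=0)                     # assumed reference point
features = np.array([tangent_vector(c, ref) for c in covs])
```

The rows of `features` are ordinary Euclidean vectors, so a method such as LLE could be applied to them directly, which is the motivation for moving to the tangent space.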
13
Gallos IK, Mantonakis L, Spilioti E, Kattoulas E, Savvidou E, Anyfandi E, Karavasilis E, Kelekis N, Smyrnis N, Siettos CI. The relation of integrated psychological therapy to resting state functional brain connectivity networks in patients with schizophrenia. Psychiatry Res 2021; 306:114270. [PMID: 34775295 DOI: 10.1016/j.psychres.2021.114270] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/20/2020] [Revised: 10/22/2021] [Accepted: 10/31/2021] [Indexed: 01/05/2023]
Abstract
Functional brain dysconnectivity measured with resting state functional magnetic resonance imaging (rsfMRI) has been linked to cognitive impairment in schizophrenia. This study investigated the effects on functional brain connectivity of Integrated Psychological Therapy (IPT), a cognitive behavioral oriented group intervention program, in 31 patients with schizophrenia. Patients received IPT or an equal intensity non-specific psychological treatment in a non-randomized design. Evidence of improvement in executive and social functions, psychopathology and overall level of functioning was observed after treatment completion at six months only in the IPT treatment group and was partially sustained at one-year follow up. Independent Component Analysis and Isometric Mapping (ISOMAP), a non-linear manifold learning algorithm, were used to construct functional connectivity networks from the rsfMRI data. Functional brain dysconnectivity was observed in patients compared to a group of 17 healthy controls, both globally and specifically including the default mode (DMN) and frontoparietal network (FPN). DMN and FPN connectivity were reversed towards healthy control patterns only in the IPT treatment group and these effects were sustained at follow up for DMN but not FPN. These data suggest the use of rsfMRI as a biomarker for accessing and monitoring the therapeutic effects of cognitive remediation therapy in schizophrenia.
Affiliation(s)
- I K Gallos
  - School of Applied Mathematics and Physical Sciences, National Technical University of Athens, Athens, Greece
- L Mantonakis
  - Laboratory of Cognitive Neuroscience and Sensorimotor Control, University Mental Health, Neurosciences and Precision Medicine Research Institute "COSTAS STEFANIS", Athens, Greece; First Psychiatry Department, National and Kapodistrian University of Athens, School of Medicine, Eginition Hospital, Athens, Greece
- E Spilioti
  - Laboratory of Cognitive Neuroscience and Sensorimotor Control, University Mental Health, Neurosciences and Precision Medicine Research Institute "COSTAS STEFANIS", Athens, Greece; First Psychiatry Department, National and Kapodistrian University of Athens, School of Medicine, Eginition Hospital, Athens, Greece
- E Kattoulas
  - Laboratory of Cognitive Neuroscience and Sensorimotor Control, University Mental Health, Neurosciences and Precision Medicine Research Institute "COSTAS STEFANIS", Athens, Greece
- E Savvidou
  - Laboratory of Cognitive Neuroscience and Sensorimotor Control, University Mental Health, Neurosciences and Precision Medicine Research Institute "COSTAS STEFANIS", Athens, Greece
- E Anyfandi
  - First Psychiatry Department, National and Kapodistrian University of Athens, School of Medicine, Eginition Hospital, Athens, Greece
- E Karavasilis
  - Second Department of Radiology, National and Kapodistrian University of Athens, School of Medicine, University General Hospital "ATTIKON", Athens, Greece
- N Kelekis
  - Second Department of Radiology, National and Kapodistrian University of Athens, School of Medicine, University General Hospital "ATTIKON", Athens, Greece
- N Smyrnis
  - Laboratory of Cognitive Neuroscience and Sensorimotor Control, University Mental Health, Neurosciences and Precision Medicine Research Institute "COSTAS STEFANIS", Athens, Greece; Second Psychiatry Department, National and Kapodistrian University of Athens, School of Medicine, University General Hospital "ATTIKON", Athens, Greece.
- C I Siettos
  - Dipartimento di Matematica e Applicazioni "Renato Caccioppoli", Università degli Studi di Napoli Federico II, Naples, Italy

14
Maxime DF, Pamela M, Patrick C, Nicolas D. Characterizing interactions between cardiac shape and deformation by non-linear manifold learning. Med Image Anal 2021; 75:102278. [PMID: 34731772 DOI: 10.1016/j.media.2021.102278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2021] [Revised: 09/08/2021] [Accepted: 10/18/2021] [Indexed: 10/20/2022]
Abstract
In clinical routine, high-dimensional descriptors of the cardiac function such as shape and deformation are reduced to scalars (e.g. volumes or ejection fraction), which limits the characterization of complex diseases. Besides, these descriptors undergo interactions depending on disease, which may bias their computational analysis. In this paper, we aim at characterizing such interactions by unsupervised manifold learning. We propose to use a sparsified version of Multiple Manifold Learning to align the latent spaces encoding each descriptor and to weight the strength of the alignment depending on each pair of samples. While this framework was up to now only applied to link different datasets from the same manifold, we demonstrate its relevance to characterize the interactions between different but partially related descriptors of the cardiac function (shape and deformation). We benchmark our approach against linear and non-linear embedding strategies, among which the fusion of manifolds by Multiple Kernel Learning, the independent embedding of each descriptor by Diffusion Maps, and a strict alignment based on pairwise correspondences. We first evaluated the methods on a synthetic dataset from a 0D cardiac model where the interactions between descriptors are fully controlled. Then, we transferred them to a population of right ventricular meshes from 310 subjects (100 healthy and 210 patients with right ventricular disease) obtained from 3D echocardiography, where the link between shape and deformation is key for disease understanding. Our experiments underline the relevance of jointly considering shape and deformation descriptors, and show that manifold alignment is preferable over fusion for our application. They also confirm at a finer scale the characteristic traits of the right ventricular diseases in our population.
Affiliation(s)
- Di Folco Maxime
  - Univ Lyon, UCBL, Inserm, INSA Lyon, CNRS, CREATIS, UMR5220, U1294, Villeurbanne 69621, France.
- Moceri Pamela
  - Centre Hospitalier Universitaire de Nice, Service de Cardiologie, Nice, France
- Clarysse Patrick
  - Univ Lyon, UCBL, Inserm, INSA Lyon, CNRS, CREATIS, UMR5220, U1294, Villeurbanne 69621, France
- Duchateau Nicolas
  - Univ Lyon, UCBL, Inserm, INSA Lyon, CNRS, CREATIS, UMR5220, U1294, Villeurbanne 69621, France

15
Tonin F, Patrinos P, Suykens JAK. Unsupervised learning of disentangled representations in deep restricted kernel machines with orthogonality constraints. Neural Netw 2021; 142:661-679. [PMID: 34399376 DOI: 10.1016/j.neunet.2021.07.023] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/12/2020] [Revised: 05/24/2021] [Accepted: 07/19/2021] [Indexed: 10/20/2022]
Abstract
We introduce Constr-DRKM, a deep kernel method for the unsupervised learning of disentangled data representations. We propose augmenting the original deep restricted kernel machine formulation for kernel PCA by orthogonality constraints on the latent variables to promote disentanglement and to make it possible to carry out optimization without first defining a stabilized objective. After discussing a number of algorithms for end-to-end training, we quantitatively evaluate the proposed method's effectiveness in disentangled feature learning. We demonstrate on four benchmark datasets that this approach performs similarly overall to β-VAE on several disentanglement metrics when few training points are available while being less sensitive to randomness and hyperparameter selection than β-VAE. We also present a deterministic initialization of Constr-DRKM's training algorithm that significantly improves the reproducibility of the results. Finally, we empirically evaluate and discuss the role of the number of layers in the proposed methodology, examining the influence of each principal component in every layer and showing that components in lower layers act as local feature detectors capturing the broad trends of the data distribution, while components in deeper layers use the representation learned by previous layers and more accurately reproduce higher-level features.
Affiliation(s)
- Francesco Tonin
  - Department of Electrical Engineering, ESAT-STADIUS, KU Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium.
- Panagiotis Patrinos
  - Department of Electrical Engineering, ESAT-STADIUS, KU Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium.
- Johan A K Suykens
  - Department of Electrical Engineering, ESAT-STADIUS, KU Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium.

16
Gallos IK, Galaris E, Siettos CI. Construction of embedded fMRI resting-state functional connectivity networks using manifold learning. Cogn Neurodyn 2021; 15:585-608. [PMID: 34367362 PMCID: PMC8286923 DOI: 10.1007/s11571-020-09645-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 03/30/2020] [Revised: 09/26/2020] [Accepted: 10/06/2020] [Indexed: 11/26/2022] Open
Abstract
We construct embedded functional connectivity networks (FCN) from benchmark resting-state functional magnetic resonance imaging (rsfMRI) data acquired from patients with schizophrenia and healthy controls based on linear and nonlinear manifold learning algorithms, namely, Multidimensional Scaling, Isometric Feature Mapping, Diffusion Maps, Locally Linear Embedding and kernel PCA. Furthermore, based on key global graph-theoretic properties of the embedded FCN, we compare their classification potential using machine learning. We also assess the performance of two metrics that are widely used for the construction of FCN from fMRI, namely the Euclidean distance and the cross correlation metric. We show that diffusion maps with the cross correlation metric outperform the other combinations.
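As a rough illustration of the best-performing combination named in this abstract (diffusion maps on a correlation-based distance), the sketch below embeds a set of time series with a textbook diffusion-map construction. It is not the authors' pipeline: the distance formula sqrt(2(1 - r)), the median-based kernel bandwidth, the function names, and the random toy data are all assumptions made for the example.

```python
import numpy as np

def correlation_distance(X):
    """Pairwise distance sqrt(2 * (1 - r)) between the rows (time series) of X.
    Assumption: one common way to turn Pearson correlation into a distance."""
    R = np.corrcoef(X)
    D = np.sqrt(np.clip(2.0 * (1.0 - R), 0.0, None))
    np.fill_diagonal(D, 0.0)
    return D

def diffusion_map(D, n_components=2, epsilon=None, t=1):
    """Textbook diffusion map computed from a symmetric distance matrix D."""
    D = np.asarray(D, dtype=float)
    if epsilon is None:
        # Heuristic bandwidth (assumption): squared median off-diagonal distance.
        epsilon = np.median(D[D > 0]) ** 2
    K = np.exp(-(D ** 2) / epsilon)            # Gaussian affinities
    deg = K.sum(axis=1)                         # node degrees
    # Symmetric normalisation; shares eigenvalues with the Markov matrix P = K / deg.
    A = K / np.sqrt(np.outer(deg, deg))
    eigvals, eigvecs = np.linalg.eigh(A)
    order = np.argsort(eigvals)[::-1]           # sort eigenpairs in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    psi = eigvecs / np.sqrt(deg)[:, None]       # right eigenvectors of P
    # Drop the trivial constant eigenvector; scale the rest by eigenvalue**t.
    coords = psi[:, 1:n_components + 1] * eigvals[1:n_components + 1] ** t
    return coords, eigvals

# Toy data (hypothetical): 12 "regions", 200 time points each.
rng = np.random.default_rng(0)
X = rng.standard_normal((12, 200))
coords, eigvals = diffusion_map(correlation_distance(X))
```

Each row of `coords` is the low-dimensional position of one time series. Building a connectivity network from those positions, for example by thresholding pairwise distances, is a separate step that this sketch does not cover.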
Affiliation(s)
- Ioannis K. Gallos
  - School of Applied Mathematical and Physical Sciences, National Technical University of Athens, Athens, Greece
- Evangelos Galaris
  - Dipartimento di Matematica e Applicazioni “Renato Caccioppoli”, Università degli Studi di Napoli Federico II, Napoli, Italy
- Constantinos I. Siettos
  - Dipartimento di Matematica e Applicazioni “Renato Caccioppoli”, Università degli Studi di Napoli Federico II, Napoli, Italy

17
Ghodsizad T, Behnam H, Fatemizadeh E, Faghihi Langroudi T, Bayat F. Spatiotemporal registration and fusion of transthoracic echocardiography and volumetric coronary artery tree. Int J Comput Assist Radiol Surg 2021; 16:1493-1505. [PMID: 34101135 DOI: 10.1007/s11548-021-02421-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2020] [Accepted: 05/26/2021] [Indexed: 10/21/2022]
Abstract
PURPOSE: Cardiac multimodal image fusion can offer various types of information in a single image. Many coronary stenoses that are anatomically clear are not functionally significant. Treating such stenoses can cause irreversible effects on the patient. Thus, choosing the best treatment plan based on both anatomical and functional information is very beneficial. METHODS: An algorithm for the fusion of coronary computed tomography angiography (CCTA) as an anatomical modality and transthoracic echocardiography (TTE) as a functional modality is presented. CCTA and TTE are temporally registered using manifold learning. A pattern search optimization algorithm, using normalized mutual information, is used to find the slice of the CCTA volume that best matches the TTE frame. The heart's non-rigid deformations are modeled by a free-form deformation. The spatiotemporally registered TTE frame is embedded to achieve the fusion result. RESULTS: The accuracy is evaluated on CCTA and TTE data obtained from 10 patients. In temporal registration, comparing the frame numbers output by the algorithm with those manually assigned by an expert gives a mean absolute error of 1.97 ± 1.23. In spatial registration, the similarity between the best-match slice from the CCTA volume and the TTE frame is 1.82 ± 0.024 mm, 6.74 ± 0.013 mm, and 0.901 ± 0.0548 in terms of mean absolute distance, Hausdorff distance, and Dice similarity coefficient, respectively. CONCLUSION: A semiautomatic framework for spatiotemporal registration and fusion of a CCTA volume and a TTE frame is presented that does not require ECG or optical tracking systems. The experimental results showed the effectiveness of the proposed method in creating complementary information from TTE and CCTA, which may help in the early diagnosis and effective treatment of cardiovascular diseases (CVDs).

18
Villa A, Mundanad Narayanan A, Van Huffel S, Bertrand A, Varon C. Utility metric for unsupervised feature selection. PeerJ Comput Sci 2021; 7:e477. [PMID: 33981839 PMCID: PMC8080425 DOI: 10.7717/peerj-cs.477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2020] [Accepted: 03/16/2021] [Indexed: 06/12/2023]
Abstract
Feature selection techniques are very useful approaches for dimensionality reduction in data analysis. They provide interpretable results by reducing the dimensions of the data to a subset of the original set of features. When the data lack annotations, unsupervised feature selectors are required for their analysis. Several algorithms for this aim exist in the literature, but despite their large applicability, they can be very inaccessible or cumbersome to use, mainly due to the need for tuning non-intuitive parameters and the high computational demands. In this work, a publicly available ready-to-use unsupervised feature selector is proposed, with comparable results to the state-of-the-art at a much lower computational cost. The suggested approach belongs to the methods known as spectral feature selectors. These methods generally consist of two stages: manifold learning and subset selection. In the first stage, the underlying structures in the high-dimensional data are extracted, while in the second stage a subset of the features is selected to replicate these structures. This paper suggests two contributions to this field, related to each of the stages involved. In the manifold learning stage, the effect of non-linearities in the data is explored, making use of a radial basis function (RBF) kernel, for which an alternative solution for the estimation of the kernel parameter is presented for cases with high-dimensional data. Additionally, the use of a backwards greedy approach based on the least-squares utility metric for the subset selection stage is proposed. The combination of these new ingredients results in the utility metric for unsupervised feature selection (U2FS) algorithm. The proposed U2FS algorithm succeeds in selecting the correct features in a simulation environment. In addition, the performance of the method on benchmark datasets is comparable to the state-of-the-art, while requiring less computational time. Moreover, unlike the state-of-the-art, U2FS does not require any tuning of parameters.
Affiliation(s)
- Amalia Villa
  - STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
  - Leuven.AI, KU Leuven Institute for AI, Leuven, Belgium
- Abhijith Mundanad Narayanan
  - STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
  - Leuven.AI, KU Leuven Institute for AI, Leuven, Belgium
- Sabine Van Huffel
  - STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
  - Leuven.AI, KU Leuven Institute for AI, Leuven, Belgium
- Alexander Bertrand
  - STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
  - Leuven.AI, KU Leuven Institute for AI, Leuven, Belgium
- Carolina Varon
  - STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
  - Circuits and Systems (CAS) Group, Delft University of Technology, Delft, The Netherlands
  - e-Media Research Lab, Campus GroepT, KU Leuven, Leuven, Belgium

19
Kojima R, Yoshidome T. A measure for the identification of preferred particle orientations in cryo-electron microscopy data: A simulation study. Biophys Physicobiol 2021; 18:96-107. [PMID: 34026399 PMCID: PMC8116199 DOI: 10.2142/biophysico.bppb-v18.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/02/2020] [Accepted: 03/12/2021] [Indexed: 12/01/2022] Open
Abstract
Cryo-electron microscopy (cryo-EM) is an important experimental technique for the structural analysis of biomolecules that are difficult or impossible to crystallize. The three-dimensional structure of a biomolecule can be reconstructed using two-dimensional electron-density maps, which are experimentally sampled via the electron beam irradiation of vitreous ice in which the target biomolecules are embedded. One assumption required for this reconstruction is that the orientation of the biomolecules in the vitreous ice is isotropic. However, this is not always the case, and two-dimensional electron-density maps are often sampled from biomolecules in preferred orientations, which can make reconstruction difficult or impossible. Compensation for under-represented views is computationally feasible for the reconstruction of three-dimensional electron-density maps, but one must know whether or not there is any missing information in the sampled two-dimensional electron-density maps. Thus, a measure is required to identify whether cryo-EM data were obtained from biomolecules adopting preferred orientations. In the present study, we propose a measure that uses the geometry of the manifold projected onto a low-dimensional space. To show the usefulness of the measure, we perform simulations of a cryo-EM experiment on a protein. It is found that the geometry of the manifold projected onto a two-dimensional space for a protein adopting a preferred orientation is significantly different from that for a protein adopting a uniform orientation. This result suggests that the geometry of the manifold projected onto a low-dimensional space can be used as a measure for identifying whether the biomolecules adopt preferred orientations.
Affiliation(s)
- Ryota Kojima
  - Department of Applied Physics, Graduate School of Engineering, Tohoku University, Sendai, Miyagi 980-8579, Japan
- Takashi Yoshidome
  - Department of Applied Physics, Graduate School of Engineering, Tohoku University, Sendai, Miyagi 980-8579, Japan

20
Zhang S, Huang K, Zhu J, Liu Y. Manifold adversarial training for supervised and semi-supervised learning. Neural Netw 2021; 140:282-293. [PMID: 33839600 DOI: 10.1016/j.neunet.2021.03.031] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/17/2020] [Revised: 02/07/2021] [Accepted: 03/19/2021] [Indexed: 11/20/2022]
Abstract
We propose a new regularization method for deep learning based on manifold adversarial training (MAT). Unlike previous regularization and adversarial training methods, MAT further considers the local manifold of latent representations. Specifically, MAT builds an adversarial framework based on how the worst perturbation could affect the statistical manifold in the latent space rather than the output space. In particular, a latent feature space with the Gaussian Mixture Model (GMM) is first derived in a deep neural network. We then define the smoothness by the largest variation of Gaussian mixtures when a local perturbation is given around the input data point. On the one hand, the perturbations are added in the way that would roughen the statistical manifold of the latent space the most. On the other hand, the model is trained to promote the manifold smoothness the most in the latent space. Importantly, since the latent space is more informative than the output space, the proposed MAT can learn a more robust and compact data representation, leading to further performance improvement. The proposed MAT is important in that it can be considered as a superset of one recently proposed discriminative feature learning approach called center loss. We conduct a series of experiments in both supervised and semi-supervised learning on four benchmark data sets, showing that the proposed MAT can achieve remarkable performance, much better than that of the state-of-the-art approaches. In addition, we present a series of visualizations which could generate further understanding or explanation of adversarial examples.

21
Wei J, Zhou T, Zhang X, Tian T. DTFLOW: Inference and Visualization of Single-cell Pseudotime Trajectory Using Diffusion Propagation. Genomics Proteomics Bioinformatics 2021; 19:306-318. [PMID: 33662626 PMCID: PMC8602766 DOI: 10.1016/j.gpb.2020.08.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/13/2019] [Revised: 05/26/2020] [Accepted: 10/29/2020] [Indexed: 12/13/2022]
Abstract
One of the major challenges in single-cell data analysis is the determination of cellular developmental trajectories using single-cell data. Although substantial studies have been conducted in recent years, more effective methods are still strongly needed to infer the developmental processes accurately. This work devises a new method, named DTFLOW, for determining the pseudo-temporal trajectories with multiple branches. DTFLOW consists of two major steps: a new method called Bhattacharyya kernel feature decomposition (BKFD) to reduce the data dimensions, and a novel approach named Reverse Searching on k-nearest neighbor graph (RSKG) to identify the multi-branching processes of cellular differentiation. In BKFD, we first establish a stationary distribution for each cell to represent the transition of cellular developmental states based on the random walk with restart algorithm, and then propose a new distance metric for calculating pseudotime of single cells by introducing the Bhattacharyya kernel matrix. The effectiveness of DTFLOW is rigorously examined by using four single-cell datasets. We compare the efficiency of DTFLOW with the published state-of-the-art methods. Simulation results suggest that DTFLOW has superior accuracy and strong robustness properties for constructing pseudotime trajectories. The Python source code of DTFLOW can be freely accessed at https://github.com/statway/DTFLOW.
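The two ingredients this abstract names for BKFD, a random-walk-with-restart stationary distribution per cell and a Bhattacharyya kernel between those distributions, can be sketched generically as follows. This is a dependency-free illustration under stated assumptions, not the DTFLOW code: the function names, the restart probability of 0.3, the Gaussian affinities, and the five-point toy data are all hypothetical, and the authors' implementation at the repository cited above may differ in every detail.

```python
import math

def row_normalize(W):
    """Turn a nonnegative affinity matrix into a row-stochastic transition matrix."""
    return [[w / sum(row) for w in row] for row in W]

def rwr_distribution(P, start, restart=0.3, tol=1e-12, max_iter=10_000):
    """Stationary distribution of a random walk with restart from node `start`.
    Iterates p <- restart * e_start + (1 - restart) * p P until it stops changing."""
    n = len(P)
    p = [1.0 if k == start else 0.0 for k in range(n)]
    for _ in range(max_iter):
        new = [(1 - restart) * sum(p[j] * P[j][k] for j in range(n)) for k in range(n)]
        new[start] += restart
        if max(abs(a - b) for a, b in zip(new, p)) < tol:
            return new
        p = new
    return p

def bhattacharyya_kernel(dists):
    """B[i][j] = sum_k sqrt(p_i[k] * p_j[k]); equals 1 when the two distributions coincide."""
    return [[sum(math.sqrt(a * b) for a, b in zip(pi, pj)) for pj in dists]
            for pi in dists]

# Toy data (hypothetical): five "cells" on a line, Gaussian affinities, no self-loops.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
W = [[math.exp(-(a - b) ** 2) if i != j else 0.0 for j, b in enumerate(xs)]
     for i, a in enumerate(xs)]
P = row_normalize(W)
dists = [rwr_distribution(P, i) for i in range(len(xs))]
B = bhattacharyya_kernel(dists)
```

In this toy setting, cells that are close along the line get stationary distributions with more overlap, so their kernel entries are larger. How the kernel is then decomposed and converted into a pseudotime distance is specific to the paper and is not reproduced here.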
Affiliation(s)
- Jiangyong Wei
  - College of Science, Huazhong Agricultural University, Wuhan 430070, China; School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan 430073, China
- Tianshou Zhou
  - School of Mathematics and Statistics, Sun Yat-sen University, Guangzhou 510275, China
- Xinan Zhang
  - School of Mathematics and Statistics, Central China Normal University, Wuhan 430079, China
- Tianhai Tian
  - School of Mathematics, Monash University, Melbourne, VIC 3800, Australia.

22
Attyé A, Renard F, Baciu M, Roger E, Lamalle L, Dehail P, Cassoudesalle H, Calamante F. TractLearn: A geodesic learning framework for quantitative analysis of brain bundles. Neuroimage 2021; 233:117927. [PMID: 33689863 DOI: 10.1016/j.neuroimage.2021.117927] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/20/2020] [Revised: 02/25/2021] [Accepted: 03/01/2021] [Indexed: 12/13/2022] Open
Abstract
Deep learning-based convolutional neural networks have recently proved their efficiency in providing fast segmentation of major brain fascicle structures, based on diffusion-weighted imaging. The quantitative analysis of brain fascicles then relies on metrics either coming from the tractography process itself or from each voxel along the bundle. Statistical detection of abnormal voxels in the context of disease usually relies on univariate and multivariate statistical models, such as the General Linear Model (GLM). Yet in the case of high-dimensional low sample size data, the GLM often implies a high standard deviation range in controls due to anatomical variability, despite the commonly used smoothing process. This can lead to difficulties in detecting subtle quantitative alterations from a brain bundle at the voxel scale. Here we introduce TractLearn, a unified framework for quantitative analyses of brain fascicles that uses geodesic learning as a data-driven learning task. TractLearn allows a mapping between the high-dimensional image domain and the reduced latent space of brain fascicles using a Riemannian approach. We illustrate the robustness of this method on a healthy population with test-retest acquisition of multi-shell diffusion MRI data, demonstrating that it is possible to separately study the global effect due to different MRI sessions from the effect of local bundle alterations. We have then tested the efficiency of our algorithm on a sample of 5 age-matched subjects referred with mild traumatic brain injury. Our contributions are to propose: 1/ A manifold approach to capture controls variability as standard reference instead of an atlas approach based on a Euclidean mean. 2/ A tool to detect global variation of voxels' quantitative values, which accounts for voxels' interactions in a structure rather than analyzing voxels independently. 3/ A ready-to-plug algorithm to highlight nonlinear variation of diffusion MRI metrics. In this regard, TractLearn is a ready-to-use algorithm for precision medicine.

23
Zhang XH, Xu Y, He YL, Zhu QX. Novel manifold learning based virtual sample generation for optimizing soft sensor with small data. ISA Trans 2021; 109:229-241. [PMID: 33070985 DOI: 10.1016/j.isatra.2020.10.006] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/29/2020] [Revised: 09/03/2020] [Accepted: 10/03/2020] [Indexed: 06/11/2023]
Abstract
Due to the extremely complex mechanism and strong non-linear characteristics of industrial processes, data-driven soft sensor technologies play a key role in the intelligent measurement of process industries. However, the information in the process data collected in the steady stage is quite limited and unreliable, causing the small sample problem. As a result, it becomes an intractable challenge to capture the nature of the process and build accurate soft sensor models. To solve this problem, this paper proposes a novel manifold learning based virtual sample generation method (Isomap-VSG) to generate feasible virtual samples in the information gaps for supplementing the original small sample space. To find data-sparse regions reasonably, a manifold learning method called Isomap is used to visualize high-dimensional process data. Then virtual samples can be generated by the interpolation method and extreme learning machine. The simulation results on a standard dataset and a real-world application demonstrate that, compared with other advanced methods, the proposed Isomap-VSG method can achieve better performance in terms of generating feasible virtual samples and improving the accuracy of soft sensor models using limited samples.
Affiliation(s)
- Xiao-Han Zhang
  - College of Information Science & Technology, Beijing University of Chemical Technology, Beijing, 100029, China; Engineering Research Center of Intelligent PSE, Ministry of Education of China, Beijing 100029, China
- Yuan Xu
  - College of Information Science & Technology, Beijing University of Chemical Technology, Beijing, 100029, China; Engineering Research Center of Intelligent PSE, Ministry of Education of China, Beijing 100029, China
- Yan-Lin He
  - College of Information Science & Technology, Beijing University of Chemical Technology, Beijing, 100029, China; Engineering Research Center of Intelligent PSE, Ministry of Education of China, Beijing 100029, China.
- Qun-Xiong Zhu
  - College of Information Science & Technology, Beijing University of Chemical Technology, Beijing, 100029, China; Engineering Research Center of Intelligent PSE, Ministry of Education of China, Beijing 100029, China.

24
Zhang C, Liu S, Han F, Nie Z, Lo B, Zhang Y. Hybrid manifold-deep convolutional neural network for sleep staging. Methods 2021; 202:164-172. [PMID: 33636312 DOI: 10.1016/j.ymeth.2021.02.014] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/25/2020] [Revised: 01/29/2021] [Accepted: 02/16/2021] [Indexed: 11/18/2022] Open
Abstract
Analysis of the electroencephalogram (EEG) is a crucial diagnostic criterion for many sleep disorders, of which sleep staging is an important component. Manual stage classification is a labor-intensive process and usually suffers from many subjective factors. Recently, more and more computer-aided techniques have been applied to this task, among which the deep convolutional neural network has performed well as an effective automatic classification model. Although some comprehensive models have been developed to improve classification results, the accuracy required for clinical applications has not been reached, due to the lack of sufficient labeled data and the limitations in extracting latent discriminative EEG features. Therefore, we propose a novel hybrid manifold-deep convolutional neural network with hyperbolic attention. To overcome the shortage of labeled data, we update the semi-supervised training scheme as an optimal solution. To extract the latent feature representation, we introduce the manifold learning module and the hyperbolic module, which extract more discriminative information. Eight subjects from the public dataset are used to evaluate our pipeline, and the model achieved 89% accuracy, 70% precision, 80% sensitivity, 72% f1-score and a kappa coefficient of 78%. The proposed model demonstrates a powerful ability to extract feature representations and achieves promising results by using the semi-supervised training scheme. Therefore, our approach shows strong potential for future clinical development.
Affiliation(s)
- Chuanhao Zhang
  - Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, College of Electronic and Information Engineering, Southwest University, Chongqing, China
- Sen Liu
  - Department of Oncology, Central Hospital Affiliated to Shandong First Medical University, Jinan, China
- Fang Han
  - Peking University People's Hospital, Beijing, China
- Zedong Nie
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Benny Lo
  - Department of Surgery and Cancer, Imperial College London, UK
- Yuan Zhang
  - Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, College of Electronic and Information Engineering, Southwest University, Chongqing, China.

25
Wen J, Sun H, Fei L, Li J, Zhang Z, Zhang B. Consensus guided incomplete multi-view spectral clustering. Neural Netw 2020; 133:207-219. [PMID: 33227665 DOI: 10.1016/j.neunet.2020.10.014] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 05/26/2020] [Revised: 10/25/2020] [Accepted: 10/29/2020] [Indexed: 10/23/2022]
Abstract
Incomplete multi-view clustering, which aims to solve the difficult clustering challenge on incomplete multi-view data collected from diverse domains with missing views, has drawn considerable attention in recent years. In this paper, we propose a novel method, called consensus guided incomplete multi-view spectral clustering (CGIMVSC), to address the incomplete clustering problem. Specifically, CGIMVSC seeks to explore the local information within every single view and the semantically consistent information shared by all views in a unified framework simultaneously, where the local structure is adaptively obtained from the incomplete data rather than pre-constructed via a k-nearest neighbor approach as in the existing methods. Considering the semantic consistency of multiple views, CGIMVSC introduces a co-regularization constraint to minimize the disagreement between the common representation and the individual representations with respect to different views, such that all views will obtain a consensus clustering result. Experimental comparisons with some state-of-the-art methods on seven datasets validate the effectiveness of the proposed method on incomplete multi-view clustering.
Affiliation(s)
- Jie Wen
- PAMI Research Group, Department of Computer and Information Science, University of Macau, Taipa, Macau
| | - Huijie Sun
- Nanchang Institute of Technology, Nanchang 330044, China; Sun Yat-sen University, Guangzhou 510000, China
| | - Lunke Fei
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
| | - Jinxing Li
- School of Science and Engineering, Chinese University of Hong Kong (Shenzhen), Shenzhen, 518000, China
| | - Zheng Zhang
- Shenzhen Key Laboratory of Visual Object Detection and Recognition, Harbin Institute of Technology, Shenzhen, Shenzhen 518055, China; Peng Cheng Laboratory, Shenzhen 518055, China
| | - Bob Zhang
- PAMI Research Group, Department of Computer and Information Science, University of Macau, Taipa, Macau.
| |
|
26
|
He Q, Laurence DW, Lee CH, Chen JS. Manifold learning based data-driven modeling for soft biological tissues. J Biomech 2020; 117:110124. [PMID: 33515902 DOI: 10.1016/j.jbiomech.2020.110124] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2020] [Revised: 07/16/2020] [Accepted: 11/03/2020] [Indexed: 02/08/2023]
Abstract
Data-driven modeling directly utilizes experimental data with machine learning techniques to predict a material's response without the necessity of using phenomenological constitutive models. Although data-driven modeling presents a promising new approach, it has yet to be extended to the modeling of large-deformation biological tissues. Herein, we extend our recent local convexity data-driven (LCDD) framework (He and Chen, 2020) to model the mechanical response of a porcine heart mitral valve posterior leaflet. The predictability of the LCDD framework using various combinations of biaxial and pure shear training protocols is investigated, and its effectiveness is compared with a full structural, phenomenological model modified from Zhang et al. (2016) and a continuum phenomenological Fung-type model (Tong and Fung, 1976). We show that the predictivity of the proposed LCDD nonlinear solver is generally less sensitive to the type of loading protocols (biaxial and pure shear) used in the data set, while more sensitive to the insufficient coverage of the experimental data when compared to the predictivity of the two selected phenomenological models. While no pre-defined functional form in the material model is necessary in LCDD, this study reinstates the importance of having sufficiently rich data coverage in data-driven and machine learning types of approaches. It is also shown that the proposed LCDD method is an enhancement over the earlier distance-minimization data-driven (DMDD) method against noisy data. This study demonstrates that when sufficient data is available, data-driven computing can be an alternative method for modeling complex biological materials.
Affiliation(s)
- Qizhi He
- Department of Structural Engineering, University of California, San Diego, La Jolla, CA 92093, USA; Physical and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Devin W Laurence
- Biomechanics and Biomaterials Design Laboratory, School of Aerospace and Mechanical Engineering, The University of Oklahoma, Norman, OK 73019, USA
| | - Chung-Hao Lee
- Biomechanics and Biomaterials Design Laboratory, School of Aerospace and Mechanical Engineering, The University of Oklahoma, Norman, OK 73019, USA; Institute for Biomedical Engineering, Science and Technology, The University of Oklahoma, Norman, OK 73019, USA
| | - Jiun-Shyan Chen
- Department of Structural Engineering, University of California, San Diego, La Jolla, CA 92093, USA.
| |
|
27
|
Grollemund V, Le Chat G, Secchi-Buhour MS, Delbot F, Pradat-Peyre JF, Bede P, Pradat PF. Manifold learning for amyotrophic lateral sclerosis functional loss assessment : Development and validation of a prognosis model. J Neurol 2021; 268:825-50. [PMID: 32886252 DOI: 10.1007/s00415-020-10181-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2020] [Revised: 08/05/2020] [Accepted: 08/06/2020] [Indexed: 12/11/2022]
Abstract
Amyotrophic lateral sclerosis (ALS) is an inexorably progressive neurodegenerative condition with no effective disease-modifying therapy at present. Given the striking clinical heterogeneity of the condition, the development and validation of reliable prognostic models is a recognised research priority. We present a prognostic model for functional decline in ALS where outcome uncertainty is taken into account. Patient data were reduced and projected onto a 2D space using Uniform Manifold Approximation and Projection (UMAP), a novel non-linear dimension reduction technique. Information from 3756 patients was included. Development data were sourced from past clinical trials. Real-world population data were used as validation data. Predictors included age, gender, region of onset, symptom duration, weight at baseline, functional impairment, and estimated rate of functional loss. UMAP projection of patients showed an informative 2D data distribution. As limited data availability precluded complex model designs, the projection was divided into three zones defined by a functional impairment range probability. Zone membership allowed individual patient prediction. Patients belonging to the first zone had a probability of [Formula: see text] (± [Formula: see text]) of having an ALSFRS score over 20 at 1-year follow-up. Patients within the second zone had a probability of [Formula: see text] (± [Formula: see text]) of having an ALSFRS score between 10 and 30 at 1-year follow-up. Finally, patients within the third zone had a probability of [Formula: see text] (± [Formula: see text]) of having an ALSFRS score lower than 20 at 1-year follow-up. This approach requires a limited set of features, is easily updated, improves with additional patient data, and accounts for results uncertainty. This method could therefore be used in a clinical setting for patient stratification and outcome projection.
|
28
|
Abstract
BACKGROUND Modern developments in single-cell sequencing technologies enable broad insights into cellular state. Single-cell RNA sequencing (scRNA-seq) can be used to explore cell types, states, and developmental trajectories to broaden our understanding of cellular heterogeneity in tissues and organs. Analysis of these sparse, high-dimensional experimental results requires dimension reduction. Several methods have been developed to estimate low-dimensional embeddings for filtered and normalized single-cell data. However, methods have yet to be developed for unfiltered and unnormalized count data that estimate uncertainty in the low-dimensional space. We present a nonlinear latent variable model with robust, heavy-tailed error and adaptive kernel learning to estimate low-dimensional nonlinear structure in scRNA-seq data. RESULTS Gene expression in a single cell is modeled as a noisy draw from a Gaussian process in high dimensions from low-dimensional latent positions. This model is called the Gaussian process latent variable model (GPLVM). We model residual errors with a heavy-tailed Student's t-distribution to estimate a manifold that is robust to technical and biological noise found in normalized scRNA-seq data. We compare our approach to common dimension reduction tools across a diverse set of scRNA-seq data sets to highlight our model's ability to enable important downstream tasks such as clustering, inferring cell developmental trajectories, and visualizing high throughput experiments on available experimental data. CONCLUSION We show that our adaptive robust statistical approach to estimate a nonlinear manifold is well suited for raw, unfiltered gene counts from high-throughput sequencing technologies for visualization, exploration, and uncertainty estimation of cell states.
Affiliation(s)
- Archit Verma
- Chemical and Biological Engineering, Princeton University, 50-70 Olden Street, Princeton, 08540 NJ USA
| | - Barbara E. Engelhardt
- Computer Science, Center for Statistics and Machine Learning, 35 Olden Street, Princeton, 08540 NJ USA
| |
|
29
|
Watson JR, Gelbaum Z, Titus M, Zoch G, Wrathall D. Identifying multiscale spatio-temporal patterns in human mobility using manifold learning. PeerJ Comput Sci 2020; 6:e276. [PMID: 33816927 PMCID: PMC7924485 DOI: 10.7717/peerj-cs.276] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2019] [Accepted: 04/22/2020] [Indexed: 06/12/2023]
Abstract
When, where and how people move is a fundamental part of how human societies organize around everyday needs as well as how people adapt to risks, such as economic scarcity or instability, and natural disasters. Our ability to characterize and predict the diversity of human mobility patterns has been greatly expanded by the availability of Call Detail Records (CDR) from mobile phone cellular networks. The size and richness of these datasets are at the same time a blessing and a curse: while there is great opportunity to extract useful information from these datasets, it remains a challenge to do so in a meaningful way. In particular, human mobility is multiscale, meaning a diversity of patterns of mobility occur simultaneously, which vary according to timing, magnitude and spatial extent. To identify and characterize the main spatio-temporal scales and patterns of human mobility, we examined CDR data from the Orange mobile network in Senegal using a new form of spectral graph wavelets, an approach from manifold learning. This unsupervised analysis reduces the dimensionality of the data to reveal seasonal changes in human mobility, as well as mobility patterns associated with large-scale but short-term religious events. The novel insight into human mobility patterns afforded by manifold learning methods like spectral graph wavelets has clear applications for urban planning, infrastructure design and hazard risk management, especially as climate change alters the biophysical landscape on which people work and live, leading to new patterns of human migration around the world.
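For readers unfamiliar with spectral graph wavelets, the sketch below illustrates the classical construction: a band-pass kernel applied to the eigenvalues of a graph Laplacian at a chosen scale. It is not the "new form" developed in the paper, whose details the abstract does not give; the kernel g(x) = x·exp(-x) and the function names are assumptions chosen for illustration.

```python
import numpy as np

def graph_laplacian(W):
    """Combinatorial Laplacian L = D - W of a symmetric weight matrix W."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W

def wavelet_operator(W, scale):
    """Wavelet operator Psi_s = U g(s * Lambda) U^T at a single scale.

    Uses the band-pass kernel g(x) = x * exp(-x), which satisfies g(0) = 0
    (an illustrative choice, not taken from the paper).
    """
    lam, U = np.linalg.eigh(graph_laplacian(W))
    g = (scale * lam) * np.exp(-scale * lam)
    return (U * g) @ U.T                   # same as U @ diag(g) @ U.T

# Toy graph: a path of 6 nodes (e.g. 6 locations linked to their neighbours).
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0

# Column j of Psi is the wavelet centred on node j at this scale.
Psi = wavelet_operator(W, scale=1.0)
```

Small scales give wavelets localized around a node, and large scales spread over more of the graph, which is what allows patterns of different spatial extent to be separated. Because g(0) = 0, a constant signal is mapped to zero.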
|
30
|
Suetani H, Kitajo K. A manifold learning approach to mapping individuality of human brain oscillations through beta-divergence. Neurosci Res 2020; 156:188-196. [PMID: 32084448 DOI: 10.1016/j.neures.2020.02.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2019] [Revised: 12/25/2019] [Accepted: 01/25/2020] [Indexed: 11/18/2022]
Abstract
This paper proposes an approach for visualizing individuality and inter-individual variations of human brain oscillations measured as multichannel electroencephalographic (EEG) signals in a low-dimensional space based on manifold learning. Using a unified divergence measure between spectral densities termed the "beta-divergence", we introduce an appropriate dissimilarity measure between multichannel EEG signals. Then, t-distributed stochastic neighbor embedding (t-SNE; a state-of-the-art algorithm for manifold learning) together with the beta-divergence based distance was applied to resting state EEG signals recorded from 100 healthy subjects. We were able to obtain a fine low-dimensional visualization that enabled each subject to be identified as an isolated point cloud and that represented inter-individual variations as the relationships between such point clouds. Furthermore, we also discuss how the performance of the low-dimensional visualization depends on the beta-divergence parameter and the t-SNE hyperparameter. Finally, borrowing from the concept of locally linear embedding (LLE), we propose a method for projecting a test sample onto the t-SNE space obtained from the training samples and investigate its applicability.
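As an illustration of a divergence-based dissimilarity between spectra, here is a minimal sketch using one widely used definition of the beta-divergence (the family that includes the Itakura-Saito divergence at beta = 0 and the generalized Kullback-Leibler divergence at beta = 1). The abstract does not state the formula used in the paper, so this definition, the symmetrization step, and the function names are assumptions.

```python
import numpy as np

def beta_divergence(p, q, beta):
    """Beta-divergence d_beta(p || q) between two positive spectra."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if beta == 0:                          # Itakura-Saito divergence
        r = p / q
        return float(np.sum(r - np.log(r) - 1.0))
    if beta == 1:                          # generalized Kullback-Leibler
        return float(np.sum(p * np.log(p / q) - p + q))
    num = p**beta + (beta - 1.0) * q**beta - beta * p * q**(beta - 1.0)
    return float(np.sum(num) / (beta * (beta - 1.0)))

def divergence_matrix(spectra, beta):
    """Symmetrized pairwise dissimilarities between spectra (illustrative)."""
    n = len(spectra)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = 0.5 * (beta_divergence(spectra[i], spectra[j], beta)
                       + beta_divergence(spectra[j], spectra[i], beta))
            D[i, j] = D[j, i] = d
    return D
```

A matrix like `D` could then be passed to a t-SNE implementation that accepts precomputed dissimilarities. At beta = 2 the divergence reduces to half the squared Euclidean distance, which gives a quick sanity check.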
Affiliation(s)
- Hiromichi Suetani
- Faculty of Science and Technology, Oita University, 700 Dannoharu, Oita 870-1192, Japan; RIKEN CBS-TOYOTA Collaboration Center (BTCC), RIKEN Center for Brain Science, Wako, Saitama 351-0198, Japan.
| | - Keiichi Kitajo
- RIKEN CBS-TOYOTA Collaboration Center (BTCC), RIKEN Center for Brain Science, Wako, Saitama 351-0198, Japan; Division of Neural Dynamics, Department of System Neuroscience, National Institute for Physiological Sciences, National Institutes of Natural Sciences, 38 Nishigonaka, Myodaiji, Okazaki, Aichi 444-8585, Japan; Department of Physiological Sciences, School of Life Science, The Graduate University for Advanced Studies (SOKENDAI), 38 Nishigonaka, Myodaiji, Okazaki, Aichi 444-8585, Japan
| |
|
31
|
Abstract
BACKGROUND The coordination of genomic functions is a critical and complex process across biological systems such as phenotypes or states (e.g., time, disease, organism, environmental perturbation). Understanding how the complexity of genomic function relates to these states remains a challenge. To address this, we have developed a novel computational method, ManiNetCluster, which simultaneously aligns and clusters gene networks (e.g., co-expression) to systematically reveal the links of genomic function between different conditions. Specifically, ManiNetCluster employs manifold learning to uncover and match local and non-linear structures among networks, and identifies cross-network functional links. RESULTS We demonstrated that ManiNetCluster better aligns the orthologous genes from their developmental expression profiles across model organisms than state-of-the-art methods (p-value < 2.2×10⁻¹⁶). This indicates the potential non-linear interactions of evolutionarily conserved genes across species in development. Furthermore, we applied ManiNetCluster to time series transcriptome data measured in the green alga Chlamydomonas reinhardtii to discover the genomic functions linking various metabolic processes between the light and dark periods of a diurnally cycling culture. We identified a number of genes putatively regulating processes across each lighting regime. CONCLUSIONS ManiNetCluster provides a novel computational tool to uncover the genes linking various functions from different networks, providing new insight on how gene functions coordinate across different conditions. ManiNetCluster is publicly available as an R package at https://github.com/daifengwanglab/ManiNetCluster.
Affiliation(s)
- Nam D Nguyen
- Department of Computer Science, Stony Brook University, Stony Brook, NY 11794, USA
| | - Ian K Blaby
- Biology Department, Brookhaven National Laboratory, Upton, NY 11973, USA; US Department of Energy, Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, 4720, CA, USA.
| | - Daifeng Wang
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, 53726, WI, USA; Waisman Center, University of Wisconsin-Madison, Madison, 53705, WI, USA.
| |
|
32
|
Seok HS. Performance comparison of dimensionality reduction methods on RNA-Seq data from the GTEx project. Genes Genomics 2019; 42:225-234. [PMID: 31833048 DOI: 10.1007/s13258-019-00896-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2019] [Accepted: 11/22/2019] [Indexed: 11/25/2022]
Abstract
BACKGROUND One of the apparent characteristics of bioinformatics data is the combination of a very large number of features and a relatively small number of samples. The vast number of features makes intuitive understanding of a target domain difficult. Dimensionality reduction or manifold learning has the potential to circumvent this obstacle, but restricted methods have been preferred. OBJECTIVE The objective of this study is to observe the characteristics of various dimensionality reduction methods, namely locally linear embedding (LLE), multi-dimensional scaling (MDS), principal component analysis (PCA), spectral embedding (SE), and t-distributed Stochastic Neighbor Embedding (t-SNE), on the RNA-Seq dataset from the genotype-tissue expression (GTEx) project. RESULTS The characteristics of the dimensionality reduction methods are observed on the nine groups of three different tissues in the reduced space with dimensionality of two, three, and four. The visualization results show that each dimensionality reduction method produces a very distinct reduced space. The quantitative results are obtained as the performance of k-means clustering. Clustering in the reduced space from non-linear methods such as LLE, t-SNE and SE achieved better results than in the reduced space produced by linear methods like PCA and MDS. CONCLUSIONS The experimental results recommend the application of both linear and non-linear dimensionality reduction methods on the target data for grasping the underlying characteristics of the datasets intuitively.
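The evaluation pipeline described here (reduce the data, then score the reduced space by k-means clustering) can be sketched with the linear baseline alone. The example below is a minimal illustration using PCA computed via SVD and a plain Lloyd's k-means with a deterministic farthest-point initialisation; the synthetic data, function names and initialisation are assumptions, not the study's setup, and the non-linear methods are omitted.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def kmeans(X, k, n_iter=100):
    """Lloyd's algorithm with farthest-point initialisation (deterministic)."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

# Synthetic stand-in for expression data: 3 groups x 20 samples, 10 features.
rng = np.random.default_rng(0)
means = np.zeros((3, 10))
means[1, 0] = 10.0
means[2, 1] = 10.0
X = np.vstack([m + 0.1 * rng.standard_normal((20, 10)) for m in means])

# Cluster in the 2-D reduced space, as in the study's evaluation scheme.
labels = kmeans(pca(X, 2), k=3)
```

Swapping `pca` for a non-linear embedding while keeping the clustering step fixed is what makes the resulting cluster quality comparable across methods.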
Affiliation(s)
- Ho-Sik Seok
- Department of Computer and Communications Engineering, Kangwon National University, Chuncheon-si, Gangwon-do, 24341, South Korea.
| |
|
33
|
Ren Y, Tsai MY, Chen L, Wang J, Li S, Liu Y, Jia X, Shen C. A manifold learning regularization approach to enhance 3D CT image-based lung nodule classification. Int J Comput Assist Radiol Surg 2020; 15:287-95. [PMID: 31768885 DOI: 10.1007/s11548-019-02097-8] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2019] [Accepted: 11/16/2019] [Indexed: 02/07/2023]
Abstract
PURPOSE Diagnosis of lung cancer requires radiologists to review every lung nodule in CT images. Such a process can be very time-consuming, and the accuracy is affected by many factors, such as the experience of radiologists and available diagnosis time. To address this problem, we proposed to develop a deep learning-based system to automatically classify benign and malignant lung nodules. METHODS The proposed method automatically determines benignity or malignancy given the 3D CT image patch of a lung nodule to assist the diagnosis process. Motivated by the fact that real structure among data is often embedded on a low-dimensional manifold, we developed a novel manifold regularized classification deep neural network (MRC-DNN) to perform classification directly based on the manifold representation of lung nodule images. The concise manifold representation revealing important data structure is expected to benefit the classification, while the manifold regularization enforces strong, but natural constraints on network training, preventing over-fitting. RESULTS The proposed method achieves accurate manifold learning with reconstruction error of ~30 HU on real lung nodule CT image data. In addition, the classification accuracy on testing data is 0.90 with sensitivity of 0.81 and specificity of 0.95, which outperforms state-of-the-art deep learning methods. CONCLUSION The proposed MRC-DNN facilitates an accurate manifold learning approach for lung nodule classification based on 3D CT images. More importantly, MRC-DNN suggests a new and effective idea of enforcing regularization for network training, with potential impact on a broad range of applications.
|
34
|
Kinalis S, Nielsen FC, Winther O, Bagger FO. Deconvolution of autoencoders to learn biological regulatory modules from single cell mRNA sequencing data. BMC Bioinformatics 2019; 20:379. [PMID: 31286861 PMCID: PMC6615267 DOI: 10.1186/s12859-019-2952-9] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2019] [Accepted: 06/13/2019] [Indexed: 01/20/2023] Open
Abstract
BACKGROUND Unsupervised machine learning methods (deep learning) have shown their usefulness with noisy single cell mRNA-sequencing data (scRNA-seq), where the models generalize well, despite the zero-inflation of the data. A class of neural networks, namely autoencoders, has been useful for denoising of single cell data, imputation of missing values and dimensionality reduction. RESULTS Here, we present a striking feature with the potential to greatly increase the usability of autoencoders: With specialized training, the autoencoder is not only able to generalize over the data, but also to tease apart biologically meaningful modules, which we found encoded in the representation layer of the network. Our model can, from scRNA-seq data, delineate biologically meaningful modules that govern a dataset, as well as give information as to which modules are active in each single cell. Importantly, most of these modules can be explained by known biological functions, as provided by the Hallmark gene sets. CONCLUSIONS We discover that tailored training of an autoencoder makes it possible to deconvolute biological modules inherent in the data, without any assumptions. By comparisons with gene signatures of canonical pathways we see that the modules are directly interpretable. The scope of this discovery has important implications, as it makes it possible to outline the drivers behind a given effect of a cell. In comparison with other dimensionality reduction methods, or supervised models for classification, our approach has the benefit of both handling well the zero-inflated nature of scRNA-seq, and validating that the model captures relevant information, by establishing a link between input and decoded data. In perspective, our model in combination with clustering methods is able to provide information about which subtype a given single cell belongs to, as well as which biological functions determine that membership.
Affiliation(s)
- Savvas Kinalis
- Centre for Genomic Medicine Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
| | - Finn Cilius Nielsen
- Centre for Genomic Medicine Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
| | - Ole Winther
- Centre for Genomic Medicine Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- Section for Cognitive Systems Department of Applied Mathematics and Computer Science, Technical University of Denmark, Lyngby, Denmark
- Bioinformatics Centre Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Frederik Otzen Bagger
- Centre for Genomic Medicine Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- University Children’s Hospital Basel and Department of Biomedicine, University of Basel, Basel, Switzerland
- Swiss Institute of Bioinformatics, Basel, Switzerland
| |
|
35
|
Abstract
Background Predicting drug-target interactions is time-consuming and expensive. It is important to present the accuracy of the calculation method. There are many algorithms to predict global interactions, some of which use drug-target networks for prediction (i.e., a bipartite graph of bound drug pairs and targets known to interact). Although these algorithms can predict some drug-target interactions to some extent, they are less effective for new drugs or targets that have no known interactions. Results Since the datasets are usually located at or near low-dimensional nonlinear manifolds, we propose an improved GRMF (graph regularized matrix factorization) method to learn these manifolds in combination with the previous matrix-decomposition method. In addition, we use one of the pre-processing steps previously proposed to improve the accuracy of the prediction. Conclusions Cross-validation is used to evaluate our method, and simulation experiments are used to predict new interactions. In most cases, our method is superior to other methods. Finally, some examples of new drugs and new targets are predicted by performing simulation experiments. The improved GRMF method can also better predict the remaining drug-target interactions.
Affiliation(s)
- Zhen Cui
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
| | - Ying-Lian Gao
- Library of Qufu Normal University, Qufu Normal University, Rizhao, China
| | - Jin-Xing Liu
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China; Co-Innovation Center for Information Supply & Assurance Technology, Anhui University, Hefei, China.
| | - Ling-Yun Dai
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
| | - Sha-Sha Yuan
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
| |
|
36
|
Gadd C, Xing W, Nezhad MM, Shah AA. A Surrogate Modelling Approach Based on Nonlinear Dimension Reduction for Uncertainty Quantification in Groundwater Flow Models. Transp Porous Media 2019; 126:39-77. [PMID: 30872876 PMCID: PMC6390720 DOI: 10.1007/s11242-018-1065-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2017] [Accepted: 04/13/2018] [Indexed: 11/15/2022]
Abstract
In this paper, we develop a surrogate modelling approach for capturing the output field (e.g. the pressure head) from groundwater flow models involving a stochastic input field (e.g. the hydraulic conductivity). We use a Karhunen–Loève expansion for a log-normally distributed input field and apply manifold learning (local tangent space alignment) to perform Gaussian process Bayesian inference using Hamiltonian Monte Carlo in an abstract feature space, yielding outputs for arbitrary unseen inputs. We also develop a framework for forward uncertainty quantification in such problems, including analytical approximations of the mean of the marginalized distribution (with respect to the inputs). To sample from the distribution, we present a Monte Carlo approach. Two examples are presented to demonstrate the accuracy of our approach: a Darcy flow model with contaminant transport in 2-d and a Richards equation model in 3-d.
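To make the Karhunen–Loève step concrete, the sketch below draws one realization of a log-normally distributed field on a 1-D grid from a truncated expansion. It is a minimal illustration under stated assumptions: an exponential covariance kernel, a discrete eigendecomposition of the covariance matrix without quadrature weights, and hypothetical parameter names (`corr_len`, `sigma`, `mean`). The paper's kernel, dimensions and truncation are not given in the abstract.

```python
import numpy as np

def kl_lognormal_field(x, n_terms, corr_len=0.2, sigma=1.0, mean=0.0, rng=None):
    """Sample a log-normal field on grid x from a truncated KL expansion.

    log k(x) = mean + sum_i sqrt(lam_i) * phi_i(x) * xi_i,  xi_i ~ N(0, 1),
    where (lam_i, phi_i) are the leading eigenpairs of the covariance matrix.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    # Exponential covariance of the underlying Gaussian field (assumed kernel).
    C = sigma**2 * np.exp(-np.abs(x[:, None] - x[None, :]) / corr_len)
    lam, phi = np.linalg.eigh(C)
    order = np.argsort(lam)[::-1][:n_terms]      # keep the largest eigenvalues
    lam, phi = lam[order], phi[:, order]
    xi = rng.standard_normal(n_terms)            # independent N(0, 1) coefficients
    log_k = mean + phi @ (np.sqrt(np.clip(lam, 0.0, None)) * xi)
    return np.exp(log_k), lam

# One realization of, e.g., a hydraulic conductivity field on [0, 1].
x = np.linspace(0.0, 1.0, 50)
field, lam = kl_lognormal_field(x, n_terms=10, rng=np.random.default_rng(0))
```

The truncation replaces a 50-dimensional random input by 10 standard normal coefficients, which is the kind of reduction that makes surrogate modelling over the input space tractable.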
Affiliation(s)
- C Gadd
- School of Engineering, University of Warwick, Coventry, CV4 7AL, UK
| | - W Xing
- School of Engineering, University of Warwick, Coventry, CV4 7AL, UK
| | - M Mousavi Nezhad
- School of Engineering, University of Warwick, Coventry, CV4 7AL, UK
| | - A A Shah
- School of Engineering, University of Warwick, Coventry, CV4 7AL, UK
| |
|
37
|
Li B, Fan ZT, Zhang XL, Huang DS. Robust dimensionality reduction via feature space to feature space distance metric learning. Neural Netw 2019; 112:1-14. [PMID: 30716617 DOI: 10.1016/j.neunet.2019.01.001] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2018] [Revised: 12/26/2018] [Accepted: 01/07/2019] [Indexed: 11/29/2022]
Abstract
Images are often represented as high-dimensional vectors when involved in classification. As a result, dimensionality reduction methods have to be developed to avoid the curse of dimensionality. Among them, Laplacian eigenmaps (LE) have attracted widespread attention. In the original LE, a point to point (P2P) distance metric is often adopted for manifold learning. Unfortunately, it offers little robustness to noise. In this paper, a novel supervised dimensionality reduction method, named feature space to feature space distance metric learning (FSDML), is presented. For any point, it constructs a feature space spanned by its k intra-class nearest neighbors, which results in a local projection on its nearest feature space. The feature space to feature space (S2S) distance metric is then defined as the Euclidean distance between two corresponding projections. On one hand, the proposed S2S distance metric displays superior robustness owing to the local projection. On the other hand, the projection on the nearest feature space contributes to fully mining local geometry information hidden in the original data. Moreover, both class label similarity and dissimilarity are also measured, based on which an intra-class graph and an inter-class graph are individually modeled. Finally, a subspace can be found for classification by maximizing the S2S based manifold to manifold distance and preserving the S2S based locality of manifolds simultaneously. Experiments on both synthesized data sets and benchmark data sets validate the proposed method's performance against some state-of-the-art dimensionality reduction methods.
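The S2S distance described in this abstract can be sketched as follows: each point is projected onto the space spanned by its k nearest same-class neighbours, and the distance is taken between the two projections. This is one reading of the abstract, not the authors' implementation. It assumes the linear span of the neighbours (solved by least squares) rather than an affine hull, and the function names are hypothetical.

```python
import numpy as np

def project_onto_neighbors(x, others, k):
    """Project x onto the span of its k nearest neighbours among `others`.

    `others` holds the remaining samples of x's class (x itself excluded),
    one sample per row.
    """
    x = np.asarray(x, dtype=float)
    others = np.asarray(others, dtype=float)
    dist = np.linalg.norm(others - x, axis=1)
    nbrs = others[np.argsort(dist)[:k]]            # (k, dim) nearest neighbours
    # Least-squares coefficients of x in the neighbour basis.
    coef, *_ = np.linalg.lstsq(nbrs.T, x, rcond=None)
    return nbrs.T @ coef

def s2s_distance(x_i, others_i, x_j, others_j, k):
    """Euclidean distance between the two local projections."""
    p_i = project_onto_neighbors(x_i, others_i, k)
    p_j = project_onto_neighbors(x_j, others_j, k)
    return float(np.linalg.norm(p_i - p_j))
```

Because each projection discards the component of a point that lies outside its local neighbour space, noise in that component does not affect the distance, which is the robustness argument made in the abstract.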
Affiliation(s)
- Bo Li
- School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China; Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan, China; Institute of Big Data Science and Engineering, Wuhan University of Science and Technology, Wuhan, China.
| | - Zhang-Tao Fan
- School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China; Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan, China
| | - Xiao-Long Zhang
- School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China; Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan, China; Institute of Big Data Science and Engineering, Wuhan University of Science and Technology, Wuhan, China
| | - De-Shuang Huang
- School of Electronics and Information Engineering, Tongji University, Shanghai, China
| |
|
38
|
Ouyang J, Liang Z, Chen C, Fu Z, Zhang Y, Liu H. Cryo-electron microscope image denoising based on the geodesic distance. BMC Struct Biol 2018; 18:18. [PMID: 30554569 PMCID: PMC6296045 DOI: 10.1186/s12900-018-0094-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/22/2018] [Accepted: 10/30/2018] [Indexed: 11/18/2022]
Abstract
Background: To perform a three-dimensional (3-D) reconstruction of electron cryomicroscopy (cryo-EM) images of viruses, it is necessary to determine the similarity of image blocks of the two-dimensional (2-D) projections of the virus. The projections containing high resolution information are typically very noisy. Instead of the traditional Euler metric, this paper proposes a new method, based on the geodesic metric, to measure the similarity of blocks. Results: Our method is a 2-D image denoising approach. A data set of 2243 cytoplasmic polyhedrosis virus (CPV) capsid particle images in different orientations was used to test the proposed method. Relative to Block-matching and three-dimensional filtering (BM3D), Stein’s unbiased risk estimator (SURE), Bayes shrink and K-means singular value decomposition (K-SVD), the experimental results show that the proposed method can achieve a peak signal-to-noise ratio (PSNR) of 45.65. The method can remove the noise from the cryo-EM image and improve the accuracy of particle picking. Conclusions: The main contribution of the proposed model is to apply the geodesic distance to measure the similarity of image blocks. We conclude that manifold learning methods can effectively eliminate the noise of the cryo-EM image and improve the accuracy of particle picking.
Affiliation(s)
- Jianquan Ouyang
- Key Laboratory of Intelligent Computing and Information Processing, Ministry of Education, College of Information Engineering, Xiangtan University, Xiangtan, 411105, China.
| | - Zezhi Liang
- Key Laboratory of Intelligent Computing and Information Processing, Ministry of Education, College of Information Engineering, Xiangtan University, Xiangtan, 411105, China
| | - Chunyu Chen
- Key Laboratory of Intelligent Computing and Information Processing, Ministry of Education, College of Information Engineering, Xiangtan University, Xiangtan, 411105, China
| | - Zhuosong Fu
- Key Laboratory of Intelligent Computing and Information Processing, Ministry of Education, College of Information Engineering, Xiangtan University, Xiangtan, 411105, China
| | - Yue Zhang
- Key Laboratory of Intelligent Computing and Information Processing, Ministry of Education, College of Information Engineering, Xiangtan University, Xiangtan, 411105, China
| | - Hongrong Liu
- College of Physics and Information Science, Hunan Normal University, Changsha, 410081, Hunan, China
| |
|
39
|
Xing M, GadElkarim J, Ajilore O, Wolfson O, Forbes A, Phan KL, Klumpp H, Leow A. Thought Chart: tracking the thought with manifold learning during emotion regulation. Brain Inform 2018; 5:7. [PMID: 30022317 DOI: 10.1186/s40708-018-0085-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/10/2017] [Accepted: 07/12/2018] [Indexed: 11/21/2022] Open
Abstract
The Nash embedding theorem demonstrates that any compact manifold can be isometrically embedded in a Euclidean space. Assuming the complex brain states form a high-dimensional manifold in a topological space, we propose a manifold learning framework, termed Thought Chart, to reconstruct and visualize the manifold in a low-dimensional space. Furthermore, it serves as a data-driven approach to discover the underlying dynamics when the brain is engaged in a series of emotion and cognitive regulation tasks. EEG-based temporal dynamic functional connectomes are created based on 20 psychiatrically healthy participants’ EEG recordings during resting state and an emotion regulation task. Graph dissimilarity space embedding was applied to all the dynamic EEG connectomes. In order to visualize the learned manifold in a lower dimensional space, local neighborhood information is reconstructed via k-nearest neighbor-based nonlinear dimensionality reduction (NDR) and epsilon distance-based NDR. We showed that two neighborhood constructing approaches of NDR embed the manifold in a two-dimensional space, which we named Thought Chart. In Thought Chart, different task conditions represent distinct trajectories. Properties such as the distribution or average length in the 2-D space may serve as useful parameters to explore the underlying cognitive load and emotion processing during the complex task. In sum, this framework is a novel data-driven approach to the learning and visualization of underlying neurophysiological dynamics of complex functional brain data.
|
40
|
Bermudez C, Plassard AJ, Davis TL, Newton AT, Resnick SM, Landman BA. Learning Implicit Brain MRI Manifolds with Deep Learning. Proc SPIE Int Soc Opt Eng 2018; 10574:105741L. [PMID: 29887659 PMCID: PMC5990281 DOI: 10.1117/12.2293515] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Indexed: 11/14/2022]
Abstract
An important task in image processing and neuroimaging is to extract quantitative information from the acquired images in order to make observations about the presence of disease or markers of development in populations. Having a low-dimensional manifold of an image allows for easier statistical comparisons between groups and the synthesis of group representatives. Previous studies have sought to identify the best mapping of brain MRI to a low-dimensional manifold, but have been limited by assumptions of explicit similarity measures. In this work, we use deep learning techniques to investigate implicit manifolds of normal brains and generate new, high-quality images. We explore implicit manifolds by addressing the problems of image synthesis and image denoising as important tools in manifold learning. First, we propose the unsupervised synthesis of T1-weighted brain MRI using a Generative Adversarial Network (GAN) by learning from 528 examples of 2D axial slices of brain MRI. Synthesized images were first shown to be unique by performing a cross-correlation with the training set. Real and synthesized images were then assessed in a blinded manner by two imaging experts providing an image quality score of 1-5. The quality score of the synthetic image showed substantial overlap with that of the real images. Moreover, we use an autoencoder with skip connections for image denoising, showing that the proposed method results in higher PSNR than FSL SUSAN after denoising. This work shows the power of artificial networks to synthesize realistic imaging data, which can be used to improve image processing techniques and provide a quantitative framework to structural changes in the brain.
Affiliation(s)
- Camilo Bermudez
- Department of Biomedical Engineering, Vanderbilt University, 2201 West End Ave, Nashville, TN, USA 37235
| | - Andrew J Plassard
- Department of Computer Science, Vanderbilt University, 2201 West End Ave, Nashville, TN, USA 37235
| | - Taylor L Davis
- Department of Radiology, Vanderbilt University Medical Center, 2201 West End Ave, Nashville, TN, USA 37235
| | - Allen T Newton
- Department of Radiology, Vanderbilt University Medical Center, 2201 West End Ave, Nashville, TN, USA 37235
| | - Susan M Resnick
- Department of Computer Science, Vanderbilt University, 2201 West End Ave, Nashville, TN, USA 37235
| | - Bennett A Landman
- Department of Biomedical Engineering, Vanderbilt University, 2201 West End Ave, Nashville, TN, USA 37235
| |
|
41
|
Ding P, Luo J, Liang C, Xiao Q, Cao B. Human disease MiRNA inference by combining target information based on heterogeneous manifolds. J Biomed Inform 2018; 80:26-36. [PMID: 29481877 DOI: 10.1016/j.jbi.2018.02.013] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 09/25/2017] [Revised: 02/11/2018] [Accepted: 02/21/2018] [Indexed: 12/12/2022]
Abstract
The emergence of network medicine has provided great insight into the identification of disease-related molecules, which could help with the development of personalized medicine. However, the state-of-the-art methods could neither simultaneously consider target information and the known miRNA-disease associations nor effectively explore novel gene-disease associations as a by-product during the process of inferring disease-related miRNAs. Computational methods incorporating multiple sources of information offer more opportunities to infer disease-related molecules, including miRNAs and genes in heterogeneous networks at a system level. In this study, we developed a novel algorithm, named inference of Disease-related MiRNAs based on Heterogeneous Manifold (DMHM), to accurately and efficiently identify miRNA-disease associations by integrating multi-omics data. Graph-based regularization was utilized to obtain a smooth function on the data manifold, which constitutes the main principle of DMHM. The novelty of this framework lies in the relatedness between diseases and miRNAs, which are measured via heterogeneous manifolds on heterogeneous networks integrating target information. To demonstrate the effectiveness of DMHM, we conducted comprehensive experiments based on HMDD datasets and compared DMHM with six state-of-the-art methods. Experimental results indicated that DMHM significantly outperformed the other six methods under fivefold cross validation and de novo prediction tests. Case studies have further confirmed the practical usefulness of DMHM.
Affiliation(s)
- Pingjian Ding
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410083, China
| | - Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410083, China.
| | - Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan 250014, China
| | - Qiu Xiao
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410083, China
| | - Buwen Cao
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410083, China
| |
|
42
|
Abstract
We introduce a new, semi-supervised classification method that extensively exploits knowledge. The method has three steps. First, the manifold regularization mechanism, adapted from the Laplacian support vector machine (LapSVM), is adopted to mine the manifold structure embedded in all training data, especially in numerous label-unknown data. Meanwhile, by converting the labels into pairwise constraints, the pairwise constraint regularization formula (PCRF) is designed to compensate for the few but valuable labelled data. Second, by further combining the PCRF with the manifold regularization, the precise manifold and pairwise constraint jointly regularized formula (MPCJRF) is achieved. Third, by incorporating the MPCJRF into the framework of the conventional SVM, our approach, referred to as semi-supervised classification with extensive knowledge exploitation (SSC-EKE), is developed. The significance of our research is fourfold: 1) The MPCJRF is an underlying adjustment, with respect to the pairwise constraints, to the graph Laplacian enlisted for approximating the potential data manifold. This type of adjustment plays the correction role, as an unbiased estimation of the data manifold is difficult to obtain, whereas the pairwise constraints, converted from the given labels, have an overall high confidence level. 2) By transforming the values of the two terms in the MPCJRF such that they have the same range, with a trade-off factor varying within the invariant interval [0, 1), the appropriate impact of the pairwise constraints to the graph Laplacian can be self-adaptively determined. 3) The implication regarding extensive knowledge exploitation is embodied in SSC-EKE. That is, the labelled examples are used not only to control the empirical risk but also to constitute the MPCJRF. Moreover, all data, both labelled and unlabelled, are recruited for the model smoothness and manifold regularization. 
4) The complete framework of SSC-EKE organically incorporates multiple theories, such as joint manifold and pairwise constraint-based regularization, smoothness in the reproducing kernel Hilbert space, empirical risk minimization, and spectral methods, which facilitates the preferable classification accuracy as well as the generalizability of SSC-EKE.
Affiliation(s)
- Pengjiang Qian
- School of Digital Media, Jiangnan University, Wuxi, Jiangsu, P.R. China.,Case Center for Imaging Research, Case Western Reserve University, Cleveland, Ohio, USA.,Department of Radiology, University Hospitals Cleveland Medical Center, Case Western Reserve University, Cleveland, Ohio, USA
| | - Chen Xi
- School of Digital Media, Jiangnan University, Wuxi, Jiangsu, P.R. China.,Case Center for Imaging Research, Case Western Reserve University, Cleveland, Ohio, USA.,Department of Radiology, University Hospitals Cleveland Medical Center, Case Western Reserve University, Cleveland, Ohio, USA
| | - Min Xu
- School of Digital Media, Jiangnan University, Wuxi, Jiangsu, P.R. China
| | - Yizhang Jiang
- School of Digital Media, Jiangnan University, Wuxi, Jiangsu, P.R. China
| | - Kuan-Hao Su
- Case Center for Imaging Research, Case Western Reserve University, Cleveland, Ohio, USA.,Department of Radiology, University Hospitals Cleveland Medical Center, Case Western Reserve University, Cleveland, Ohio, USA
| | - Shitong Wang
- School of Digital Media, Jiangnan University, Wuxi, Jiangsu, P.R. China
| | - Raymond F Muzic
- Case Center for Imaging Research, Case Western Reserve University, Cleveland, Ohio, USA.,Department of Radiology, University Hospitals Cleveland Medical Center, Case Western Reserve University, Cleveland, Ohio, USA
| |
|
43
|
Zhang Z, Jia L, Zhang M, Li B, Zhang L, Li F. Discriminative clustering on manifold for adaptive transductive classification. Neural Netw 2017; 94:260-273. [PMID: 28822323 DOI: 10.1016/j.neunet.2017.07.013] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/23/2016] [Revised: 07/18/2017] [Accepted: 07/21/2017] [Indexed: 11/30/2022]
Abstract
In this paper, we mainly propose a novel adaptive transductive label propagation approach by joint discriminative clustering on manifolds for representing and classifying high-dimensional data. Our framework seamlessly combines the unsupervised manifold learning, discriminative clustering and adaptive classification into a unified model. Also, our method incorporates the adaptive graph weight construction with label propagation. Specifically, our method is capable of propagating label information using adaptive weights over low-dimensional manifold features, which is different from most existing studies that usually predict the labels and construct the weights in the original Euclidean space. For transductive classification by our formulation, we first perform the joint discriminative K-means clustering and manifold learning to capture the low-dimensional nonlinear manifolds. Then, we construct the adaptive weights over the learnt manifold features, where the adaptive weights are calculated through performing the joint minimization of the reconstruction errors over features and soft labels so that the graph weights can be joint-optimal for data representation and classification. Using the adaptive weights, we can easily estimate the unknown labels of samples. After that, our method returns the updated weights for further updating the manifold features. Extensive simulations on image classification and segmentation show that our proposed algorithm can deliver the state-of-the-art performance on several public datasets.
Affiliation(s)
- Zhao Zhang
- School of Computer Science and Technology & Joint International Research Laboratory of Machine Learning and Neuromorphic Computing, Soochow University, Suzhou 215006, China.
| | - Lei Jia
- School of Computer Science and Technology & Joint International Research Laboratory of Machine Learning and Neuromorphic Computing, Soochow University, Suzhou 215006, China
| | - Min Zhang
- School of Computer Science and Technology & Joint International Research Laboratory of Machine Learning and Neuromorphic Computing, Soochow University, Suzhou 215006, China
| | - Bing Li
- School of Economics, Wuhan University of Technology, No.122 Luoshi Road, Wuhan 430070, China
| | - Li Zhang
- School of Computer Science and Technology & Joint International Research Laboratory of Machine Learning and Neuromorphic Computing, Soochow University, Suzhou 215006, China
| | - Fanzhang Li
- School of Computer Science and Technology & Joint International Research Laboratory of Machine Learning and Neuromorphic Computing, Soochow University, Suzhou 215006, China
| |
|
44
|
Zimmer VA, Glocker B, Hahner N, Eixarch E, Sanroma G, Gratacós E, Rueckert D, González Ballester MÁ, Piella G. Learning and combining image neighborhoods using random forests for neonatal brain disease classification. Med Image Anal 2017; 42:189-199. [PMID: 28818743 DOI: 10.1016/j.media.2017.08.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 11/24/2016] [Revised: 08/01/2017] [Accepted: 08/08/2017] [Indexed: 12/25/2022]
Abstract
It is challenging to characterize and classify normal and abnormal brain development during early childhood. To reduce the complexity of heterogeneous data population, manifold learning techniques are increasingly applied, which find a low-dimensional representation of the data, while preserving all relevant information. The neighborhood definition used for constructing manifold representations of the population is crucial for preserving the similarity structure and it is highly application dependent. The recently proposed neighborhood approximation forests learn a neighborhood structure in a dataset based on a user-defined distance. We propose a framework to learn multiple pairwise distances in a population of brain images and to combine them in an unsupervised manner optimally in a manifold learning step. Unlike other methods that only use a univariate distance measure, our method allows for a natural combination of multiple distances from heterogeneous sources. As a result, it yields a representation of the population that preserves the multiple distances. Furthermore, our method also selects the most predictive features associated with the distances. We evaluate our method in neonatal magnetic resonance images of three groups (term controls, patients affected by intrauterine growth restriction and mild isolated ventriculomegaly). We show that combining multiple distances related to the condition improves the overall characterization and classification of the three clinical groups compared to the use of single distances and classical unsupervised manifold learning.
Affiliation(s)
| | - Ben Glocker
- BioMedIA Group, Imperial College London, London, UK
| | - Nadine Hahner
- Fetal i+D Fetal Medicine Research Center, BCNatal - Barcelona Center for Maternal-Fetal and Neonatal Medicine (Hospital Clínic and Hospital Sant Joan de Déu), IDIBAPS, University of Barcelona, Spain; Centre for Biomedical Research on Rare Diseases (CIBER-ER), Barcelona, Spain
| | - Elisenda Eixarch
- Fetal i+D Fetal Medicine Research Center, BCNatal - Barcelona Center for Maternal-Fetal and Neonatal Medicine (Hospital Clínic and Hospital Sant Joan de Déu), IDIBAPS, University of Barcelona, Spain; Centre for Biomedical Research on Rare Diseases (CIBER-ER), Barcelona, Spain
| | | | - Eduard Gratacós
- Fetal i+D Fetal Medicine Research Center, BCNatal - Barcelona Center for Maternal-Fetal and Neonatal Medicine (Hospital Clínic and Hospital Sant Joan de Déu), IDIBAPS, University of Barcelona, Spain; Centre for Biomedical Research on Rare Diseases (CIBER-ER), Barcelona, Spain
| | | | | | - Gemma Piella
- SIMBioSys, Universitat Pompeu Fabra, Barcelona, Spain
| |
|
45
|
Welch JD, Hartemink AJ, Prins JF. MATCHER: manifold alignment reveals correspondence between single cell transcriptome and epigenome dynamics. Genome Biol 2017; 18:138. [PMID: 28738873 PMCID: PMC5525279 DOI: 10.1186/s13059-017-1269-0] [Citation(s) in RCA: 93] [Impact Index Per Article: 13.3] [Received: 03/17/2017] [Accepted: 07/05/2017] [Indexed: 12/30/2022] Open
Abstract
Single cell experimental techniques reveal transcriptomic and epigenetic heterogeneity among cells, but how these are related is unclear. We present MATCHER, an approach for integrating multiple types of single cell measurements. MATCHER uses manifold alignment to infer single cell multi-omic profiles from transcriptomic and epigenetic measurements performed on different cells of the same type. Using scM&T-seq and sc-GEM data, we confirm that MATCHER accurately predicts true single cell correlations between DNA methylation and gene expression without using known cell correspondences. MATCHER also reveals new insights into the dynamic interplay between the transcriptome and epigenome in single embryonic stem cells and induced pluripotent stem cells.
Affiliation(s)
- Joshua D Welch
- Department of Computer Science, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.,Curriculum in Bioinformatics and Computational Biology, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | | | - Jan F Prins
- Department of Computer Science, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA. .,Curriculum in Bioinformatics and Computational Biology, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
| |
|
46
|
Alanis-Lobato G, Mier P, Andrade-Navarro MA. Manifold learning and maximum likelihood estimation for hyperbolic network embedding. Appl Netw Sci 2016; 1:10. [PMID: 30533502 PMCID: PMC6245200 DOI: 10.1007/s41109-016-0013-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 09/01/2016] [Accepted: 10/25/2016] [Indexed: 05/23/2023]
Abstract
The Popularity-Similarity (PS) model holds that clustering and hierarchy, properties common to most networks representing complex systems, are the result of an optimisation process in which nodes seek to form ties not only with the most connected (popular) system components, but also with those that are similar to them. This model has a geometric interpretation in hyperbolic space, where distances between nodes abstract popularity-similarity trade-offs and the formation of scale-free and strongly clustered networks can be accurately described. Current methods for mapping networks to hyperbolic space are based on maximum likelihood estimations or manifold learning. The former approach is very accurate but slow; the latter improves efficiency at the cost of accuracy. Here, we analyse the strengths and limitations of both strategies and assess the advantages of combining them to efficiently embed big networks, allowing for their examination from a geometric perspective. Our evaluations in artificial and real networks support the idea that hyperbolic distance constraints play a significant role in the formation of edges between nodes. This means that challenging problems in network science, like link prediction or community detection, could be more easily addressed under this geometric framework.
Affiliation(s)
- Gregorio Alanis-Lobato
- Institute of Molecular Biology, Ackermannweg 4, Mainz, 55128 Germany
- Faculty of Biology, Johannes Gutenberg Universität, Gresemundweg 2, Mainz, 55128 Germany
| | - Pablo Mier
- Institute of Molecular Biology, Ackermannweg 4, Mainz, 55128 Germany
- Faculty of Biology, Johannes Gutenberg Universität, Gresemundweg 2, Mainz, 55128 Germany
| | - Miguel A. Andrade-Navarro
- Institute of Molecular Biology, Ackermannweg 4, Mainz, 55128 Germany
- Faculty of Biology, Johannes Gutenberg Universität, Gresemundweg 2, Mainz, 55128 Germany
| |
|
47
|
Tong C, Shi X, Lan T. Statistical process monitoring based on orthogonal multi-manifold projections and a novel variable contribution analysis. ISA Trans 2016; 65:407-417. [PMID: 27435000 DOI: 10.1016/j.isatra.2016.06.017] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 04/27/2015] [Revised: 06/02/2016] [Accepted: 06/30/2016] [Indexed: 06/06/2023]
Abstract
Multivariate statistical methods have been widely applied to develop data-based process monitoring models. Recently, a multi-manifold projections (MMP) algorithm was proposed for modeling and monitoring chemical industrial processes. MMP is an effective tool for preserving the global and local geometric structure of the original data space in the reduced feature subspace, but it does not provide orthogonal basis functions for data reconstruction. In recognition of this issue, an improved version of the MMP algorithm, named orthogonal MMP (OMMP), is formulated. Based on the OMMP model, a further processing step and a different monitoring index are proposed to model and monitor the variation in the residual subspace. Additionally, a novel variable contribution analysis is presented for fault diagnosis by integrating the nearest in-control neighbor calculation and reconstruction-based contribution analysis. The validity and superiority of the proposed fault detection and diagnosis strategy are then demonstrated through case studies on the Tennessee Eastman benchmark process.
Affiliation(s)
- Chudong Tong
- Faculty of Electrical Engineering & Computer Science, Ningbo University, Ningbo 315211, P.R. China.
| | - Xuhua Shi
- Faculty of Electrical Engineering & Computer Science, Ningbo University, Ningbo 315211, P.R. China
| | - Ting Lan
- Faculty of Electrical Engineering & Computer Science, Ningbo University, Ningbo 315211, P.R. China
| |
|
48
|
Xie L, Pluta JB, Das SR, Wisse LEM, Wang H, Mancuso L, Kliot D, Avants BB, Ding SL, Manjón JV, Wolk DA, Yushkevich PA. Multi-template analysis of human perirhinal cortex in brain MRI: Explicitly accounting for anatomical variability. Neuroimage 2016; 144:183-202. [PMID: 27702610 DOI: 10.1016/j.neuroimage.2016.09.070] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 03/20/2016] [Revised: 09/28/2016] [Accepted: 09/30/2016] [Indexed: 01/05/2023] Open
Abstract
RATIONALE: The human perirhinal cortex (PRC) plays critical roles in episodic and semantic memory and visual perception. The PRC consists of Brodmann areas 35 and 36 (BA35, BA36). In Alzheimer's disease (AD), BA35 is the first cortical site affected by neurofibrillary tangle pathology, which is closely linked to neural injury in AD. Large anatomical variability, manifested in the form of different cortical folding and branching patterns, makes it difficult to segment the PRC in MRI scans. Pathology studies have found that in ~97% of specimens, the PRC falls into one of three discrete anatomical variants. However, current methods for PRC segmentation and morphometry in MRI are based on single-template approaches, which may not be able to accurately model these discrete variants. METHODS: A multi-template analysis pipeline that explicitly accounts for anatomical variability is used to automatically label the PRC and measure its thickness in T2-weighted MRI scans. The pipeline uses multi-atlas segmentation to automatically label medial temporal lobe cortices including entorhinal cortex, PRC and the parahippocampal cortex. Pairwise registration between label maps and clustering based on residual dissimilarity after registration are used to construct separate templates for the anatomical variants of the PRC. An optimal path of deformations linking these templates is used to establish correspondences between all the subjects. Experimental evaluation focuses on the ability of single-template and multi-template analyses to detect differences in the thickness of medial temporal lobe cortices between patients with amnestic mild cognitive impairment (aMCI, n=41) and age-matched controls (n=44). RESULTS: The proposed technique is able to generate templates that recover the three dominant discrete variants of PRC and establish more meaningful correspondences between subjects than a single-template approach. The largest reduction in thickness associated with aMCI, in absolute terms, was found in left BA35 using both regional and summary thickness measures. Further, statistical maps of regional thickness difference between aMCI and controls revealed different patterns for the three anatomical variants.
Affiliation(s)
- Long Xie
- Penn Image Computing and Science Laboratory (PICSL), Department of Radiology, University of Pennsylvania, Philadelphia, PA, USA; Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA.
| | - John B Pluta
- Penn Image Computing and Science Laboratory (PICSL), Department of Radiology, University of Pennsylvania, Philadelphia, PA, USA; Department of Radiology, University of Pennsylvania, Philadelphia, USA
- Sandhitsu R Das
- Penn Image Computing and Science Laboratory (PICSL), Department of Radiology, University of Pennsylvania, Philadelphia, PA, USA; Department of Neurology, University of Pennsylvania, Philadelphia, USA; Department of Radiology, University of Pennsylvania, Philadelphia, USA
- Laura E M Wisse
- Penn Image Computing and Science Laboratory (PICSL), Department of Radiology, University of Pennsylvania, Philadelphia, PA, USA; Department of Radiology, University of Pennsylvania, Philadelphia, USA
- Lauren Mancuso
- Penn Memory Center, University of Pennsylvania, Philadelphia, PA, USA; Department of Neurology, University of Pennsylvania, Philadelphia, USA
- Dasha Kliot
- Penn Memory Center, University of Pennsylvania, Philadelphia, PA, USA; Department of Neurology, University of Pennsylvania, Philadelphia, USA
- Brian B Avants
- Penn Image Computing and Science Laboratory (PICSL), Department of Radiology, University of Pennsylvania, Philadelphia, PA, USA; Department of Radiology, University of Pennsylvania, Philadelphia, USA
- Song-Lin Ding
- Allen Institute for Brain Science, Seattle, USA; School of Basic Sciences, Guangzhou Medical University, Guangzhou, China
- José V Manjón
- Instituto de Aplicaciones de las Tecnologías de la Información y de las Comunicaciones Avanzadas (ITACA), Universidad Politécnica de Valencia, Camino de Vera s/n, Valencia, Spain
- David A Wolk
- Penn Memory Center, University of Pennsylvania, Philadelphia, PA, USA; Department of Neurology, University of Pennsylvania, Philadelphia, USA
- Paul A Yushkevich
- Penn Image Computing and Science Laboratory (PICSL), Department of Radiology, University of Pennsylvania, Philadelphia, PA, USA; Department of Radiology, University of Pennsylvania, Philadelphia, USA
|
49
|
Baumgartner CF, Kolbitsch C, McClelland JR, Rueckert D, King AP. Autoadaptive motion modelling for MR-based respiratory motion estimation. Med Image Anal 2016; 35:83-100. [PMID: 27343436 DOI: 10.1016/j.media.2016.06.005] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 06/01/2015] [Revised: 04/22/2016] [Accepted: 06/07/2016] [Indexed: 10/21/2022]
Abstract
Respiratory motion poses significant challenges in image-guided interventions. In emerging treatments such as MR-guided HIFU or MR-guided radiotherapy, it may cause significant misalignments between interventional road maps obtained pre-procedure and the anatomy during the treatment, and may affect intra-procedural imaging such as MR-thermometry. Patient-specific respiratory motion models provide a solution to this problem. They establish a correspondence between the patient motion and simpler surrogate data which can be acquired easily during the treatment. Patient motion can then be estimated during the treatment by acquiring only the simpler surrogate data. In the majority of classical motion modelling approaches, once the correspondence between the surrogate data and the patient motion is established, it cannot be changed unless the model is recalibrated. However, breathing patterns are known to change significantly in the time frame of MR-guided interventions. Thus, the classical motion modelling approach may yield inaccurate motion estimations when the relation between the motion and the surrogate data changes over the duration of the treatment, and frequent recalibration may not be feasible. We propose a novel methodology for motion modelling which has the ability to automatically adapt to new breathing patterns. This is achieved by choosing the surrogate data in such a way that it can be used to estimate the current motion in 3D as well as to update the motion model. In particular, in this work, we use 2D MR slices from different slice positions to build as well as to apply the motion model. We implemented such an autoadaptive motion model by extending our previous work on manifold alignment. We demonstrate a proof-of-principle of the proposed technique on cardiac-gated data of the thorax and evaluate its adaptive behaviour on realistic synthetic data containing two breathing types generated from 6 volunteers, and real data from 4 volunteers.
On synthetic data, the autoadaptive motion model yielded 21.45% more accurate motion estimations than a non-adaptive motion model 10 min after a change in breathing pattern. On real data, we demonstrated the method's ability to maintain motion estimation accuracy despite a drift in the respiratory baseline. Due to the cardiac gating of the imaging data, the method is currently limited to one update per heartbeat, and the calibration requires approximately 12 min of scanning. Furthermore, the method has a prediction latency of 800 ms. These limitations may be overcome in future work by altering the acquisition protocol.
Affiliation(s)
- Christoph Kolbitsch
- Division of Imaging Sciences and Biomedical Engineering, King's College London, London, UK
- Jamie R McClelland
- Centre for Medical Image Computing, University College London, London, UK
- Daniel Rueckert
- Biomedical Image Analysis Group, Department of Computing, Imperial College London, London, UK
- Andrew P King
- Division of Imaging Sciences and Biomedical Engineering, King's College London, London, UK
|
50
|
Zacharaki EI, Mporas I, Garganis K, Megalooikonomou V. Spike pattern recognition by supervised classification in low dimensional embedding space. Brain Inform 2016; 3:73-83. [PMID: 27747608 PMCID: PMC4883172 DOI: 10.1007/s40708-016-0044-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 09/24/2015] [Accepted: 02/24/2016] [Indexed: 11/13/2022] Open
Abstract
Epileptiform discharges in interictal electroencephalography (EEG) form the mainstay of epilepsy diagnosis and localization of seizure onset. Visual analysis is rater-dependent and time-consuming, especially for long-term recordings, while computerized methods can provide efficiency in reviewing long EEG recordings. This paper presents a machine learning approach for automated detection of epileptiform discharges (spikes). The proposed method first detects spike patterns by calculating similarity to a coarse shape model of a spike waveform and then refines the results by identifying subtle differences between actual spikes and false detections. Pattern classification is performed using support vector machines in a low-dimensional space into which the original waveforms are embedded by locality preserving projections. The automatic detection results are compared to experts' manual annotations (101 spikes) on a whole-night sleep EEG recording. The high sensitivity (97%) and the low false positive rate (0.1 min⁻¹), calculated by intra-patient cross-validation, highlight the potential of the method for automated interictal EEG assessment.
Affiliation(s)
- Evangelia I Zacharaki
- Department of Computer Engineering and Informatics, University of Patras, Patras, Greece.
- Center for Visual Computing, CentraleSupélec/Galen Team, INRIA, Paris, France.
- Iosif Mporas
- Department of Computer Engineering and Informatics, University of Patras, Patras, Greece
|