1
Benisty H, Barson D, Moberly AH, Lohani S, Tang L, Coifman RR, Crair MC, Mishne G, Cardin JA, Higley MJ. Rapid fluctuations in functional connectivity of cortical networks encode spontaneous behavior. Nat Neurosci 2024; 27:148-158. [PMID: 38036743 DOI: 10.1038/s41593-023-01498-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/03/2022] [Accepted: 10/16/2023] [Indexed: 12/02/2023]
Abstract
Experimental work across species has demonstrated that spontaneously generated behaviors are robustly coupled to variations in neural activity within the cerebral cortex. Functional magnetic resonance imaging data suggest that temporal correlations in cortical networks vary across distinct behavioral states, providing for the dynamic reorganization of patterned activity. However, these data generally lack the temporal resolution to establish links between cortical signals and the continuously varying fluctuations in spontaneous behavior observed in awake animals. Here, we used wide-field mesoscopic calcium imaging to monitor cortical dynamics in awake mice and developed an approach to quantify rapidly time-varying functional connectivity. We show that spontaneous behaviors are represented by fast changes in both the magnitude and correlational structure of cortical network activity. Combining mesoscopic imaging with simultaneous cellular-resolution two-photon microscopy demonstrated that correlations among neighboring neurons and between local and large-scale networks also encode behavior. Finally, the dynamic functional connectivity of mesoscale signals revealed subnetworks not predicted by traditional anatomical atlas-based parcellation of the cortex. These results provide new insights into how behavioral information is represented across the neocortex and demonstrate an analytical framework for investigating time-varying functional connectivity in neural networks.
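To make "time-varying functional connectivity" concrete, the sketch below computes pairwise Pearson correlations inside short sliding windows. It is a generic sliding-window estimate and not the approach developed in the paper; the function names, window length, and toy signals are illustrative assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def sliding_window_fc(signals, window, step=1):
    """Return one correlation matrix per window.

    signals: list of equal-length time series, one per region (hypothetical input).
    """
    n_regions, n_time = len(signals), len(signals[0])
    matrices = []
    for start in range(0, n_time - window + 1, step):
        seg = [s[start:start + window] for s in signals]
        matrices.append([[pearson(seg[i], seg[j]) for j in range(n_regions)]
                         for i in range(n_regions)])
    return matrices

# Toy data (assumption): two regions in phase for the first half, anti-phase afterwards.
a = [math.sin(0.7 * k) for k in range(40)]
b = [math.sin(0.7 * k) if k < 20 else -math.sin(0.7 * k) for k in range(40)]
fc = sliding_window_fc([a, b], window=10, step=10)
```

With these toy signals the off-diagonal entry flips sign between the first and last window, which is the kind of rapid change in correlational structure that a single session-wide correlation would average away.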
Affiliation(s)
- Hadas Benisty
  - Department of Neuroscience, Kavli Institute for Neuroscience, Yale University School of Medicine, New Haven, CT, USA
- Daniel Barson
  - Department of Neuroscience, Kavli Institute for Neuroscience, Yale University School of Medicine, New Haven, CT, USA
- Andrew H Moberly
  - Department of Neuroscience, Kavli Institute for Neuroscience, Yale University School of Medicine, New Haven, CT, USA
- Sweyta Lohani
  - Department of Neuroscience, Kavli Institute for Neuroscience, Yale University School of Medicine, New Haven, CT, USA
- Lan Tang
  - Department of Neuroscience, Kavli Institute for Neuroscience, Yale University School of Medicine, New Haven, CT, USA
- Ronald R Coifman
  - Program in Applied Mathematics, Yale University, New Haven, CT, USA
- Michael C Crair
  - Department of Neuroscience, Kavli Institute for Neuroscience, Yale University School of Medicine, New Haven, CT, USA
- Gal Mishne
  - Halıcıoğlu Data Science Institute, University of California, San Diego, La Jolla, CA, USA
- Jessica A Cardin
  - Department of Neuroscience, Kavli Institute for Neuroscience, Yale University School of Medicine, New Haven, CT, USA
- Michael J Higley
  - Department of Neuroscience, Kavli Institute for Neuroscience, Yale University School of Medicine, New Haven, CT, USA
2
Everaert J, Benisty H, Gadassi Polack R, Joormann J, Mishne G. Which features of repetitive negative thinking and positive reappraisal predict depression? An in-depth investigation using artificial neural networks with feature selection. J Psychopathol Clin Sci 2022; 131:754-768. [PMID: 35862088 DOI: 10.1037/abn0000775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Emotion regulation habits have long been implicated in risk for depression. However, research in this area traditionally adopts an approach that ignores the multifaceted nature of emotion regulation strategies, the clinical heterogeneity of depression, and potential differential relations between emotion regulation features and individual symptoms. To address limitations associated with the dominant aggregate-level approach, this study aimed to identify which features of key emotion regulation strategies are most predictive and when those features are most predictive of individual symptoms of depression across different time lags. Leveraging novel developments in the field of machine learning, artificial neural network models with feature selection were estimated using data from 460 participants who participated in a 20-wave longitudinal study with weekly assessments. At each wave, participants completed measures of repetitive negative thinking, positive reappraisal, perceived stress, and depression symptoms. Results revealed that specific features of repetitive negative thinking (wondering "why cannot I get going?" and having thoughts or images about feelings of loneliness) and positive reappraisal (looking for positive sides) were important indicators for detecting various depressive symptoms, above and beyond perceived stress. These features had overlapping and unique predictive relations with individual cognitive, affective, and somatic symptoms. Examining temporal fluctuations in the predictive utility, results showed that the utility of these emotion regulation features was stable over time. These findings illuminate potential pathways through which emotion regulation features may confer risk for depression and help to identify actionable targets for its prevention and treatment. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
3
Benisty H, Song A, Mishne G, Charles AS. Review of data processing of functional optical microscopy for neuroscience. Neurophotonics 2022; 9:041402. [PMID: 35937186 PMCID: PMC9351186 DOI: 10.1117/1.nph.9.4.041402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2022] [Accepted: 07/15/2022] [Indexed: 05/04/2023]
Abstract
Functional optical imaging in neuroscience is rapidly growing with the development of optical systems and fluorescence indicators. To realize the potential of these massive spatiotemporal datasets for relating neuronal activity to behavior and stimuli and uncovering local circuits in the brain, accurate automated processing is increasingly essential. We cover recent computational developments in the full data processing pipeline of functional optical microscopy for neuroscience data and discuss ongoing and emerging challenges.
Affiliation(s)
- Hadas Benisty
  - Yale Neuroscience, New Haven, Connecticut, United States
- Alexander Song
  - Max Planck Institute for Intelligent Systems, Stuttgart, Germany
- Gal Mishne
  - UC San Diego, Halıcıoğlu Data Science Institute, Department of Electrical and Computer Engineering and the Neurosciences Graduate Program, La Jolla, California, United States
- Adam S. Charles
  - Johns Hopkins University, Kavli Neuroscience Discovery Institute, Center for Imaging Science, Department of Biomedical Engineering, Department of Neuroscience, and Mathematical Institute for Data Science, Baltimore, Maryland, United States
4
Zhang M, Mishne G, Chi EC. Multi-scale affinities with missing data: Estimation and applications. Stat Anal Data Min 2022; 15:303-313. [PMID: 35756358 PMCID: PMC9216212 DOI: 10.1002/sam.11561] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
Many machine learning algorithms depend on weights that quantify row and column similarities of a data matrix. The choice of weights can dramatically impact the effectiveness of the algorithm. Nonetheless, the problem of choosing weights has arguably not been given enough study. When a data matrix is completely observed, Gaussian kernel affinities can be used to quantify the local similarity between pairs of rows and pairs of columns. Computing weights in the presence of missing data, however, becomes challenging. In this paper, we propose a new method to construct row and column affinities even when data are missing by building off a co-clustering technique. This method takes advantage of solving the optimization problem for multiple pairs of cost parameters and filling in the missing values with increasingly smooth estimates. It exploits the coupled similarity structure among both the rows and columns of a data matrix. We show these affinities can be used to perform tasks such as data imputation, clustering, and matrix completion on graphs.
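For the fully observed case mentioned in the abstract, Gaussian kernel affinities between rows (and, after transposing, between columns) can be sketched as follows. The bandwidth convention `exp(-d^2 / (2 * sigma^2))`, the function name, and the toy matrix are assumptions; the paper's missing-data construction is not reproduced here.

```python
import math

def gaussian_affinity(vectors, sigma):
    """W[i][j] = exp(-||v_i - v_j||^2 / (2 * sigma^2)) for fully observed vectors."""
    n = len(vectors)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[j]))
            W[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    return W

# Toy 3x2 data matrix (assumption): rows 0 and 1 are close, row 2 is far away.
X = [[0.0, 0.0], [0.1, 0.0], [3.0, 4.0]]
row_W = gaussian_affinity(X, sigma=1.0)
# Column affinities: apply the same kernel to the transposed matrix.
col_W = gaussian_affinity([list(col) for col in zip(*X)], sigma=1.0)
```

Every distance in this sketch needs all entries of both vectors, which is exactly what breaks down when values are missing and what motivates the alternative construction the abstract describes.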
Affiliation(s)
- Min Zhang
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA
- Gal Mishne
  - Halıcıoğlu Data Science Institute, University of California San Diego, California, USA
- Eric C. Chi
  - Department of Statistics, Rice University, Houston, Texas, USA
5
Armstrong G, Rahman G, Martino C, McDonald D, Gonzalez A, Mishne G, Knight R. Applications and Comparison of Dimensionality Reduction Methods for Microbiome Data. Front Bioinform 2022; 2:821861. [PMID: 36304280 PMCID: PMC9580878 DOI: 10.3389/fbinf.2022.821861] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/24/2021] [Accepted: 02/08/2022] [Indexed: 01/05/2023] Open
Abstract
Dimensionality reduction techniques are a key component of most microbiome studies, providing both the ability to tractably visualize complex microbiome datasets and the starting point for additional, more formal, statistical analyses. In this review, we discuss the motivation for applying dimensionality reduction techniques, the special characteristics of microbiome data such as sparsity and compositionality that make this difficult, the different categories of strategies that are available for dimensionality reduction, and examples from the literature of how they have been successfully applied (together with pitfalls to avoid). We conclude by describing the need for further development in the field, in particular combining the power of phylogenetic analysis with the ability to handle sparsity, compositionality, and non-normality, as well as discussing current techniques that should be applied more widely in future analyses.
Affiliation(s)
- George Armstrong
  - Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, United States
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, United States
- Gibraan Rahman
  - Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, United States
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, United States
- Cameron Martino
  - Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, United States
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, United States
  - Center for Microbiome Innovation, Jacobs School of Engineering, University of California, San Diego, La Jolla, CA, United States
- Daniel McDonald
  - Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, United States
- Antonio Gonzalez
  - Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, United States
- Gal Mishne
  - Halıcıoğlu Data Science Institute, University of California, San Diego, La Jolla, CA, United States
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, United States
- Rob Knight
  - Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, United States
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, United States
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA, United States
  - *Correspondence: Rob Knight,
6
Charles AS, Cermak N, Affan RO, Scott BB, Schiller J, Mishne G. GraFT: Graph Filtered Temporal Dictionary Learning for Functional Neural Imaging. IEEE Trans Image Process 2022; 31:3509-3524. [PMID: 35533160 PMCID: PMC9278524 DOI: 10.1109/tip.2022.3171414] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 05/04/2023]
Abstract
Optical imaging of calcium signals in the brain has enabled researchers to observe the activity of hundreds to thousands of individual neurons simultaneously. Current methods predominantly use morphological information, typically focusing on expected shapes of cell bodies, to better identify neurons in the field of view. The explicit shape constraints limit the applicability of automated cell identification to other important imaging scales with more complex morphologies, e.g., dendritic or widefield imaging. Specifically, fluorescing components may be broken up, incompletely found, or merged in ways that do not accurately describe the underlying neural activity. Here we present Graph Filtered Temporal Dictionary (GraFT), a new approach that frames the problem of isolating independent fluorescing components as a dictionary learning problem. Specifically, we focus on the time traces (the main quantity used in scientific discovery) and learn a time-trace dictionary, with the spatial maps acting as the presence coefficients encoding which pixels the time traces are active in. Furthermore, we present a novel graph filtering model which redefines connectivity between pixels in terms of their shared temporal activity, rather than spatial proximity. This model greatly eases the ability of our method to handle data with complex non-local spatial structure. We demonstrate important properties of our method, such as robustness to morphology, simultaneously detecting different neuronal types, and implicitly inferring the number of neurons, on both synthetic data and real data examples. Specifically, we demonstrate applications of our method to calcium imaging at the dendritic, somatic, and widefield scales.
7
Armstrong G, Martino C, Rahman G, Gonzalez A, Vázquez-Baeza Y, Mishne G, Knight R. Uniform Manifold Approximation and Projection (UMAP) Reveals Composite Patterns and Resolves Visualization Artifacts in Microbiome Data. mSystems 2021; 6:e0069121. [PMID: 34609167 PMCID: PMC8547469 DOI: 10.1128/msystems.00691-21] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 06/07/2021] [Accepted: 09/19/2021] [Indexed: 11/29/2022] Open
Abstract
Microbiome data are sparse and high dimensional, so effective visualization of these data requires dimensionality reduction. To date, the most commonly used method for dimensionality reduction in the microbiome is calculation of between-sample microbial differences (beta diversity), followed by principal-coordinate analysis (PCoA). Uniform Manifold Approximation and Projection (UMAP) is an alternative method that can reduce the dimensionality of beta diversity distance matrices. Here, we demonstrate the benefits and limitations of using UMAP for dimensionality reduction on microbiome data. Using real data, we demonstrate that UMAP can improve the representation of clusters, especially when the clusters are composed of multiple subgroups. Additionally, we show that UMAP provides improved correlation of biological variation along a gradient with a reduced number of coordinates of the resulting embedding. Finally, we provide parameter recommendations that emphasize the preservation of global geometry. We therefore conclude that UMAP should be routinely used as a complementary visualization method for microbiome beta diversity studies.
IMPORTANCE: UMAP provides an additional method to visualize microbiome data. The method is extensible to any beta diversity metric used with PCoA, and our results demonstrate that UMAP can indeed improve visualization quality and correspondence with biological and technical variables of interest. The software to perform this analysis is available under an open-source license and can be obtained at https://github.com/knightlab-analyses/umap-microbiome-benchmarking; additionally, we have provided a QIIME 2 plugin for UMAP at https://github.com/biocore/q2-umap.
Affiliation(s)
- George Armstrong
  - Department of Pediatrics, School of Medicine, University of California, San Diego, California, USA
  - Center for Microbiome Innovation, Jacobs School of Engineering, University of California San Diego, La Jolla, California, USA
  - Bioinformatics and Systems Biology Program, University of California, San Diego, California, USA
- Cameron Martino
  - Department of Pediatrics, School of Medicine, University of California, San Diego, California, USA
  - Center for Microbiome Innovation, Jacobs School of Engineering, University of California San Diego, La Jolla, California, USA
  - Bioinformatics and Systems Biology Program, University of California, San Diego, California, USA
- Gibraan Rahman
  - Department of Pediatrics, School of Medicine, University of California, San Diego, California, USA
  - Bioinformatics and Systems Biology Program, University of California, San Diego, California, USA
- Antonio Gonzalez
  - Department of Pediatrics, School of Medicine, University of California, San Diego, California, USA
- Yoshiki Vázquez-Baeza
  - Center for Microbiome Innovation, Jacobs School of Engineering, University of California San Diego, La Jolla, California, USA
- Gal Mishne
  - Halıcıoğlu Data Science Institute, University of California, San Diego, La Jolla, California, USA
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, California, USA
- Rob Knight
  - Department of Pediatrics, School of Medicine, University of California, San Diego, California, USA
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, California, USA
  - Department of Bioengineering, University of California, San Diego, La Jolla, California, USA
8
Yi H, Huang L, Mishne G, Chi EC. COBRAC: a fast implementation of convex biclustering with compression. Bioinformatics 2021; 37:3667-3669. [PMID: 33904580 PMCID: PMC8545294 DOI: 10.1093/bioinformatics/btab248] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/04/2020] [Revised: 04/12/2021] [Accepted: 04/21/2021] [Indexed: 11/14/2022] Open
Abstract
SUMMARY: Biclustering is a generalization of clustering used to identify simultaneous grouping patterns in observations (rows) and features (columns) of a data matrix. Recently, the biclustering task has been formulated as a convex optimization problem. While this convex recasting of the problem has attractive properties, existing algorithms do not scale well. To address this problem and make convex biclustering a practical tool for analyzing larger data, we propose an implementation of fast convex biclustering called COBRAC to reduce the computing time by iteratively compressing problem size along with the solution path. We apply COBRAC to several gene expression datasets to demonstrate its effectiveness and efficiency. Besides the standalone version for COBRAC, we also developed a related online web server for online calculation and visualization of the downloadable interactive results.
AVAILABILITY AND IMPLEMENTATION: The source code and test data are available at https://github.com/haidyi/cvxbiclustr or https://zenodo.org/record/4620218. The web server is available at https://cvxbiclustr.ericchi.com.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Haidong Yi
  - Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Le Huang
  - Department of Genetics, Curriculum in Bioinformatics & Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Gal Mishne
  - Halıcıoğlu Data Science Institute, University of California, San Diego, La Jolla, CA 92093, USA
- Eric C Chi
  - Department of Statistics, North Carolina State University, Raleigh, NC 27607, USA
9
Gao S, Mishne G, Scheinost D. Nonlinear manifold learning in functional magnetic resonance imaging uncovers a low-dimensional space of brain dynamics. Hum Brain Mapp 2021; 42:4510-4524. [PMID: 34184812 PMCID: PMC8410525 DOI: 10.1002/hbm.25561] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/28/2021] [Revised: 05/26/2021] [Accepted: 05/30/2021] [Indexed: 02/02/2023] Open
Abstract
Large-scale brain dynamics are believed to lie in a latent, low-dimensional space. Typically, the embeddings of brain scans are derived independently from different cognitive tasks or resting-state data, ignoring a potentially large (and shared) portion of this space. Here, we establish that a shared, robust, and interpretable low-dimensional space of brain dynamics can be recovered from a rich repertoire of task-based functional magnetic resonance imaging (fMRI) data. This occurs when relying on nonlinear approaches as opposed to traditional linear methods. The embedding maintains proper temporal progression of the tasks, revealing brain states and the dynamics of network integration. We demonstrate that resting-state data embed fully onto the same task embedding, indicating that similar brain states are present in both task and resting-state data. Our findings suggest that analysis of fMRI data from multiple cognitive tasks in a low-dimensional space is possible and desirable.
Affiliation(s)
- Siyuan Gao
  - Department of Biomedical Engineering, Yale University, New Haven, Connecticut, USA
- Gal Mishne
  - Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, California, USA
  - Neurosciences Graduate Program, University of California San Diego, La Jolla, California, USA
- Dustin Scheinost
  - Department of Biomedical Engineering, Yale University, New Haven, Connecticut, USA
  - Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, Connecticut, USA
  - Department of Statistics and Data Science, Yale University, New Haven, Connecticut, USA
  - Child Study Center, Yale School of Medicine, New Haven, Connecticut, USA
10
Gao S, Xia X, Scheinost D, Mishne G. Smooth graph learning for functional connectivity estimation. Neuroimage 2021; 239:118289. [PMID: 34171497 DOI: 10.1016/j.neuroimage.2021.118289] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/24/2021] [Revised: 06/15/2021] [Accepted: 06/17/2021] [Indexed: 01/04/2023] Open
Abstract
Functional connectivity (FC) estimated from functional magnetic resonance imaging (fMRI) signals is important in understanding neural representation and information processing in cortical networks. However, due to the lack of a "ground truth" FC pattern, the reliability and robustness of FC estimates are usually examined in downstream FC analysis tasks, such as participant identification (also known as "fingerprinting"). In this paper, we propose to learn FC via a smooth graph learning framework. In particular, we treat each time frame of the fMRI time series as a graph signal on an underlying functional brain graph, and estimate the smooth graph functional connectivity (SGFC) by learning the weighted graph adjacency matrix based on a graph signal smoothness assumption. We demonstrate that our approach gives rise to a natural and sparse graph representation of FC from which reliable graph measures can be extracted. Reliability of SGFC is evaluated in the context of fingerprinting and compared to correlation FC (CFC). SGFC achieves higher fingerprinting accuracy across several different experiment settings; the improvement is even more significant when a shorter fMRI scanning length is used for FC estimation. In addition to assessing reliability, we validate the cognitive relevance of SGFC by using it to predict fluid intelligence. Finally, in evaluating topological measures of the sparse graph, SGFC reveals a more small-world and modular structure compared to CFC. Together, our results suggest that the smooth graph learning framework produces a naturally sparse, reliable, and cognitively relevant representation of functional connectivity.
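The "graph signal smoothness assumption" is commonly measured with the Laplacian quadratic form, which equals a weighted sum of squared differences between connected nodes. The sketch below checks that identity on a toy graph; it shows only the standard smoothness measure from graph signal processing, and the paper's full objective (regularizers, constraints) is not stated in the abstract and is not reproduced. All names and values are illustrative assumptions.

```python
def laplacian_quadratic_form(W, X):
    """tr(X^T L X) with L = D - W; X[i] is the time series of node i."""
    n, n_frames = len(W), len(X[0])
    deg = [sum(W[i]) for i in range(n)]
    total = 0.0
    for t in range(n_frames):
        x = [X[i][t] for i in range(n)]  # one time frame = one graph signal
        total += sum(deg[i] * x[i] ** 2 for i in range(n))
        total -= sum(W[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return total

def pairwise_smoothness(W, X):
    """0.5 * sum_ij W[i][j] * ||X[i] - X[j]||^2 for a symmetric weight matrix W."""
    n = len(W)
    return 0.5 * sum(
        W[i][j] * sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
        for i in range(n) for j in range(n)
    )

# Toy graph (assumption): 3 nodes, 2 time frames.
W = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.5],
     [0.0, 0.5, 0.0]]
X = [[1.0, 2.0], [1.0, 2.0], [3.0, 0.0]]
q = laplacian_quadratic_form(W, X)
s = pairwise_smoothness(W, X)
```

Both functions return the same value here; a graph-learning method built on this idea searches for weights `W` that keep this quantity small for the observed signals while avoiding the trivial all-zero graph.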
Affiliation(s)
- Siyuan Gao
  - Department of Biomedical Engineering, Yale University, 06520, United States
- Xinyue Xia
  - Neurosciences Graduate Program, University of California San Diego, 92093, United States
- Dustin Scheinost
  - Department of Biomedical Engineering, Yale University, 06520, United States; Department of Radiology and Biomedical Imaging, Yale School of Medicine, 06510, United States; Department of Statistics and Data Science, Yale University, 06520, United States; Child Study Center, Yale School of Medicine, 06510, United States
- Gal Mishne
  - Neurosciences Graduate Program, University of California San Diego, 92093, United States; Halıcıoğlu Data Science Institute, University of California San Diego, 92093, United States; Department of Electrical and Computer Engineering, University of California San Diego, 92093, United States
11
Lindenbaum O, Sagiv A, Mishne G, Talmon R. Kernel-based parameter estimation of dynamical systems with unknown observation functions. Chaos 2021; 31:043118. [PMID: 34251227 DOI: 10.1063/5.0044529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2021] [Accepted: 03/29/2021] [Indexed: 06/13/2023]
Abstract
A low-dimensional dynamical system is observed in an experiment as a high-dimensional signal, for example, a video of a chaotic pendulum system. Assuming that we know the dynamical model up to some unknown parameters, can we estimate the underlying system's parameters by measuring its time-evolution only once? The key information for performing this estimation lies in the temporal inter-dependencies between the signal and the model. We propose a kernel-based score to compare these dependencies. Our score generalizes a maximum likelihood estimator for a linear model to a general nonlinear setting in an unknown feature space. We estimate the system's underlying parameters by maximizing the proposed score. We demonstrate the accuracy and efficiency of the method using two chaotic dynamical systems: the double pendulum and the Lorenz '63 model.
Affiliation(s)
- Ofir Lindenbaum
  - Program in Applied Mathematics, Yale University, 51 Prospect Street, New Haven, Connecticut 06511, USA
- Amir Sagiv
  - Department of Applied Physics and Applied Mathematics, Columbia University, 500 West 120th Street, New York, New York 10027, USA
- Gal Mishne
  - Halicioglu Data Science Institute, UC San Diego 9500 Gilman Drive MS 0555 SDSC 215E, La Jolla, California 92093-0555, USA
- Ronen Talmon
  - Faculty of Electrical Engineering, Technion-Israel Institute of Technology, Haifa 32000, Israel
12
Kohli D, Cloninger A, Mishne G. LDLE: Low Distortion Local Eigenmaps. J Mach Learn Res 2021; 22:282. [PMID: 35873072 PMCID: PMC9307127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
We present Low Distortion Local Eigenmaps (LDLE), a manifold learning technique which constructs a set of low distortion local views of a data set in lower dimension and registers them to obtain a global embedding. The local views are constructed using the global eigenvectors of the graph Laplacian and are registered using Procrustes analysis. The choice of these eigenvectors may vary across the regions. In contrast to existing techniques, LDLE can embed closed and non-orientable manifolds into their intrinsic dimension by tearing them apart. It also provides gluing instructions on the boundary of the torn embedding to help identify the topology of the original manifold. Our experimental results show that LDLE largely preserves distances up to a constant scale while other techniques produce higher distortion. We also demonstrate that LDLE produces high quality embeddings even when the data are noisy or sparse.
Affiliation(s)
- Dhruv Kohli
  - Department of Mathematics, University of California San Diego, CA 92093, USA
- Alexander Cloninger
  - Department of Mathematics, University of California San Diego, CA 92093, USA
- Gal Mishne
  - Halıcıoğlu Data Science Institute, University of California San Diego, CA 92093, USA
13
Stanley JS, Chi EC, Mishne G. Multiway Graph Signal Processing on Tensors: Integrative analysis of irregular geometries. IEEE Signal Process Mag 2020; 37:160-173. [PMID: 33473243 PMCID: PMC7814420 DOI: 10.1109/msp.2020.3013555] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 06/12/2023]
Abstract
Graph signal processing (GSP) is an important methodology for studying data residing on irregular structures. As acquired data is increasingly taking the form of multi-way tensors, new signal processing tools are needed to maximally utilize the multi-way structure within the data. In this paper, we review modern signal processing frameworks generalizing GSP to multi-way data, starting from graph signals coupled to familiar regular axes such as time in sensor networks, and then extending to general graphs across all tensor modes. This widely applicable paradigm motivates reformulating and improving upon classical problems and approaches to creatively address the challenges in tensor-based data. We synthesize common themes arising from current efforts to combine GSP with tensor analysis and highlight future directions in extending GSP to the multi-way paradigm.
Collapse
Affiliation(s)
- Eric C Chi
- Dept. of Statistics, NC State University, Raleigh, NC
- Gal Mishne
- Halıcıoğlu Data Science Institute, UC San Diego, La Jolla, CA
|
14
|
Abstract
The extraction of clusters from a dataset which includes multiple clusters and a significant background component is a non-trivial task of practical importance. In image analysis this manifests for example in anomaly detection and target detection. The traditional spectral clustering algorithm, which relies on the leading K eigenvectors to detect K clusters, fails in such cases. In this paper we propose the spectral embedding norm which sums the squared values of the first I normalized eigenvectors, where I can be significantly larger than K. We prove that this quantity can be used to separate clusters from the background in unbalanced settings, including extreme cases such as outlier detection. The performance of the algorithm is not sensitive to the choice of I, and we demonstrate its application on synthetic and real-world remote sensing and neuroimaging datasets.
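The quantity described in the abstract, a sum of squared entries over the first I eigenvectors, can be sketched as follows. This is an illustrative reading, not the paper's code: the Gaussian affinity, the bandwidth `sigma`, the choice of degree-normalized eigenvectors of the random-walk operator, and the function name `spectral_embedding_norm` are assumptions. The trivial constant eigenvector is kept among the first `n_vecs`, which only adds the same offset to every point.

```python
import numpy as np


def spectral_embedding_norm(X, n_vecs, sigma=1.0):
    """Sum of squared entries of the leading `n_vecs` eigenvectors per point.

    X is an (n, d) array of points; returns an array of length n.
    """
    # Gaussian affinities between all pairs of rows of X (assumed kernel)
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetrically normalized affinity D^{-1/2} W D^{-1/2}
    A = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(A)  # eigenvalues in ascending order
    leading = eigvecs[:, ::-1][:, :n_vecs]  # largest eigenvalues first
    # Convert to eigenvectors of D^{-1} W, normalized so that psi^T D psi = 1
    psi = leading * d_inv_sqrt[:, None]
    return (psi ** 2).sum(axis=1)


# Hypothetical data: a small tight cluster inside a diffuse background.
rng = np.random.default_rng(1)
background = rng.uniform(-5.0, 5.0, size=(150, 2))
cluster = rng.normal(loc=2.0, scale=0.2, size=(15, 2))
points = np.vstack([background, cluster])
scores = spectral_embedding_norm(points, n_vecs=20, sigma=0.5)
```

Thresholding `scores` is one way such a quantity could be used to separate cluster points from background; the threshold and the value of `n_vecs` here are arbitrary.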
Affiliation(s)
- Xiuyuan Cheng
- Department of Mathematics, Duke University, Durham, NC
- Gal Mishne
- Halıcıoğlu Data Science Institute, University of California, San Diego
|
15
|
Linderman GC, Mishne G, Jaffe A, Kluger Y, Steinerberger S. Randomized near-neighbor graphs, giant components and applications in data science. J Appl Probab 2020; 57:458-476. [PMID: 32913373 PMCID: PMC7480951 DOI: 10.1017/jpr.2020.21] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/06/2022]
Abstract
If we pick $n$ random points uniformly in $[0,1]^d$ and connect each point to its $c_d \log n$ nearest neighbors, where $d \geq 2$ is the dimension and $c_d$ is a constant depending on the dimension, then it is well known that the graph is connected with high probability. We prove that it suffices to connect every point to $c_{d,1} \log\log n$ points chosen randomly among its $c_{d,2} \log n$ nearest neighbors to ensure a giant component of size $n - o(n)$ with high probability. This construction yields a much sparser random graph with $\sim n \log\log n$ instead of $\sim n \log n$ edges that has comparable connectivity properties. This result has nontrivial implications for problems in data science where an affinity matrix is constructed: instead of connecting each point to its $k$ nearest neighbors, one can often pick $k' \ll k$ random points out of the $k$ nearest neighbors and only connect to those without sacrificing quality of results. This approach can simplify and accelerate computation; we illustrate this with experimental results in spectral clustering of large-scale datasets.
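The sparsification described in the abstract, keeping a few randomly chosen neighbors out of each point's nearest neighbors, can be sketched as below. This is a minimal illustration under assumptions: the brute-force distance computation, the function name `sparse_random_knn_edges`, and the parameter values are hypothetical and not taken from the paper.

```python
import numpy as np


def sparse_random_knn_edges(X, k, k_sub, seed=0):
    """Connect each point to `k_sub` random picks among its `k` nearest neighbors.

    Returns the (n, k) array of nearest-neighbor indices and a set of
    undirected edges stored as (i, j) tuples with i < j.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Brute-force squared distances; fine for small n, not for large datasets
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(sq_dists, np.inf)  # a point is not its own neighbor
    knn = np.argsort(sq_dists, axis=1)[:, :k]
    edges = set()
    for i in range(n):
        # Sample k_sub distinct neighbors uniformly from the k nearest
        chosen = rng.choice(knn[i], size=k_sub, replace=False)
        for j in chosen:
            j = int(j)
            edges.add((min(i, j), max(i, j)))
    return knn, edges


# Hypothetical example: 200 uniform points in the unit square.
pts = np.random.default_rng(3).uniform(size=(200, 2))
neighbors, edge_set = sparse_random_knn_edges(pts, k=10, k_sub=3)
```

The resulting graph has at most `n * k_sub` edges, compared with up to `n * k` for the full nearest-neighbor graph, which is the saving the abstract refers to.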
Affiliation(s)
- George C Linderman
- Postal address: Applied Mathematics, Yale University, New Haven, CT 06511
- Gal Mishne
- Postal address: Applied Mathematics, Yale University, New Haven, CT 06511
- Ariel Jaffe
- Postal address: Applied Mathematics, Yale University, New Haven, CT 06511
- Yuval Kluger
- Dept. of Pathology & Applied Mathematics, Yale University, New Haven, CT 06511
|
16
|
Abstract
We consider the analysis of high-dimensional data given in the form of a matrix with columns consisting of observations and rows consisting of features. Often the data is such that the observations do not reside on a regular grid, and the given order of the features is arbitrary and does not convey a notion of locality. Therefore, traditional transforms and metrics cannot be used for data organization and analysis. In this paper, our goal is to organize the data by defining an appropriate representation and metric such that they respect the smoothness and structure underlying the data. We also aim to generalize the joint clustering of observations and features to the case where the data does not fall into clear disjoint groups. For this purpose, we propose multiscale data-driven transforms and metrics based on trees. Their construction is implemented in an iterative refinement procedure that exploits the co-dependencies between features and observations. Beyond the organization of a single dataset, our approach enables us to transfer the organization learned from one dataset to another and to integrate several datasets together. We present an application to breast cancer gene expression analysis: learning metrics on the genes to cluster the tumor samples into cancer sub-types and validating the joint organization of both the genes and the samples. We demonstrate that using our approach to combine information from multiple gene expression cohorts, acquired by different profiling technologies, improves the clustering of tumor samples.
Affiliation(s)
- Gal Mishne
- Viterbi Faculty of Electrical Engineering, Technion - Israel Institute of Technology, Haifa 32000, Israel
- Ronen Talmon
- Viterbi Faculty of Electrical Engineering, Technion - Israel Institute of Technology, Haifa 32000, Israel
- Israel Cohen
- Viterbi Faculty of Electrical Engineering, Technion - Israel Institute of Technology, Haifa 32000, Israel
- Ronald R Coifman
- Department of Mathematics, Yale University, New Haven, CT 06520 USA
- Yuval Kluger
- Department of Pathology and the Yale Cancer Center, Yale University School of Medicine, New Haven, CT 06511 USA
|