1
Geadah V, Barello G, Greenidge D, Charles AS, Pillow JW. Sparse-Coding Variational Autoencoders. Neural Comput 2024; 36:2571-2601. PMID: 39383030. DOI: 10.1162/neco_a_01715.
Abstract
The sparse coding model posits that the visual system has evolved to efficiently code natural stimuli using a sparse set of features from an overcomplete dictionary. However, the original sparse coding model suffered from two key limitations: (1) computing the neural response to an image patch required minimizing a nonlinear objective function via recurrent dynamics, and (2) fitting relied on approximate inference methods that ignored uncertainty. Although subsequent work has developed several methods to overcome these obstacles, here we propose a novel solution inspired by the variational autoencoder (VAE) framework. We introduce the sparse coding variational autoencoder (SVAE), which augments the sparse coding model with a probabilistic recognition model parameterized by a deep neural network. This recognition model provides a neurally plausible feedforward implementation for the mapping from image patches to neural activities and enables a principled method for fitting the sparse coding model to data via maximization of the evidence lower bound (ELBO). The SVAE differs from standard VAEs in three key respects: the latent representation is overcomplete (there are more latent dimensions than image pixels), the prior is sparse or heavy-tailed instead of gaussian, and the decoder network is a linear projection instead of a deep network. We fit the SVAE to natural image data under different assumed prior distributions and show that it obtains higher test performance than previous fitting methods. Finally, we examine the response properties of the recognition network and show that it captures important nonlinear properties of neurons in the early visual pathway.
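The generative side of the model described above can be written down compactly. The following is a minimal numerical sketch (not the authors' code; the dimensions, noise level, and the specific Laplace choice of heavy-tailed prior are illustrative) of an overcomplete linear generative model together with the negative log joint whose expectation under the recognition model enters the ELBO.

```python
import numpy as np

rng = np.random.default_rng(0)

# Overcomplete linear generative model: x = Phi @ z + noise,
# with a heavy-tailed (Laplace) prior over the latents z.
n_pixels, n_latents = 64, 128          # overcomplete: more latents than pixels
Phi = rng.standard_normal((n_pixels, n_latents)) / np.sqrt(n_latents)

def neg_log_joint(x, z, Phi, noise_var=0.1, scale=1.0):
    """-log p(x, z) up to constants: Gaussian likelihood + Laplace prior."""
    recon = 0.5 / noise_var * np.sum((x - Phi @ z) ** 2)
    prior = np.sum(np.abs(z)) / scale   # Laplace prior gives an l1-type term
    return recon + prior

# Sample one patch from the model and evaluate the joint energy.
z = rng.laplace(scale=1.0, size=n_latents)
x = Phi @ z + np.sqrt(0.1) * rng.standard_normal(n_pixels)
print(neg_log_joint(x, z, Phi))
```

Maximizing the ELBO trades this energy off against the entropy of the recognition model; with a gaussian prior in place of the Laplace term, the same expression reduces to a standard VAE objective with a linear decoder.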
Affiliation(s)
- Victor Geadah
- Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ 08544, U.S.A.
- Gabriel Barello
- Institute of Neuroscience, University of Oregon, Eugene, OR 97403, U.S.A.
- Daniel Greenidge
- Department of Computer Science, Princeton University, Princeton, NJ 08544, U.S.A.
- Adam S Charles
- Department of Biomedical Engineering, Center for Imaging Science, and Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD 21218, U.S.A.
- Jonathan W Pillow
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, U.S.A.
2
Centorrino V, Gokhale A, Davydov A, Russo G, Bullo F. Positive Competitive Networks for Sparse Reconstruction. Neural Comput 2024; 36:1163-1197. PMID: 38657968. DOI: 10.1162/neco_a_01657.
Abstract
We propose and analyze a continuous-time firing-rate neural network, the positive firing-rate competitive network (PFCN), to tackle sparse reconstruction problems with non-negativity constraints. These problems, which involve approximating a given input stimulus from a dictionary using a set of sparse (active) neurons, play a key role in a wide range of domains, including neuroscience, signal processing, and machine learning. First, by leveraging the theory of proximal operators, we relate the equilibria of a family of continuous-time firing-rate neural networks to the optimal solutions of sparse reconstruction problems. Then we prove that the PFCN is a positive system and give rigorous conditions for convergence to the equilibrium. Specifically, we show that convergence depends only on a property of the dictionary and is linear-exponential, in the sense that the convergence rate is initially at worst linear and then, after a transient, becomes exponential. We also prove a number of technical results to assess the contractivity properties of the neural dynamics of interest. Our analysis leverages contraction theory to characterize the behavior of a family of firing-rate competitive networks for sparse reconstruction with and without non-negativity constraints. Finally, we validate the effectiveness of our approach via a numerical example.
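As a concrete illustration of the problem class, the sketch below runs a generic positive, LCA-style firing-rate network — a simplified stand-in for the PFCN, not the paper's exact dynamics, with made-up dimensions and penalty weight — whose non-negative equilibrium approximates a solution of min over x ≥ 0 of ½‖s − Dx‖² + λ‖x‖₁.

```python
import numpy as np

rng = np.random.default_rng(1)

# Non-negative sparse reconstruction via a positive firing-rate network.
# Rates x = max(u - lam, 0) stay non-negative (a positive system), and the
# equilibrium approximates the non-negative LASSO solution.
m, n = 20, 50
D = rng.standard_normal((m, n))
D /= np.linalg.norm(D, axis=0)            # unit-norm dictionary columns
x_true = np.zeros(n)
x_true[[3, 17, 42]] = [1.0, 2.0, 0.5]     # a 3-sparse non-negative stimulus code
s = D @ x_true                            # input stimulus

lam, dt = 0.05, 0.01
u = np.zeros(n)                           # internal (membrane-like) state
for _ in range(20000):
    x = np.maximum(u - lam, 0.0)          # rectified, shifted activation
    u += dt * (-u + D.T @ (s - D @ x) + x)  # competition via D^T D inhibition
x = np.maximum(u - lam, 0.0)
```

Here convergence is simply observed numerically; the paper's contribution is proving positivity and linear-exponential convergence rates for this kind of dynamics.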
Affiliation(s)
- Anand Gokhale
- Center for Control, Dynamical Systems, and Computation, University of California, Santa Barbara, Santa Barbara, CA 93106, U.S.A.
- Alexander Davydov
- Center for Control, Dynamical Systems, and Computation, University of California, Santa Barbara, Santa Barbara, CA 93106, U.S.A.
- Giovanni Russo
- Department of Information and Electrical Engineering and Applied Mathematics, University of Salerno, Fisciano 84084, Italy
- Francesco Bullo
- Center for Control, Dynamical Systems, and Computation, University of California, Santa Barbara, Santa Barbara, CA 93106, U.S.A.
3
Rentzeperis I, Calatroni L, Perrinet LU, Prandi D. Beyond ℓ1 sparse coding in V1. PLoS Comput Biol 2023; 19:e1011459. PMID: 37699052. PMCID: PMC10516432. DOI: 10.1371/journal.pcbi.1011459.
Abstract
Growing evidence indicates that only a sparse subset from a pool of sensory neurons is active for the encoding of visual stimuli at any instant in time. Traditionally, to replicate such biological sparsity, generative models have used the ℓ1 norm as a penalty due to its convexity, which makes it amenable to fast and simple algorithmic solvers. In this work, we use biological vision as a test-bed and show that the soft thresholding operation associated with the ℓ1 norm performs far worse than other functions suited to approximating ℓp with 0 ≤ p < 1, including recently proposed continuous exact relaxations. We show that ℓ1 sparsity employs a pool with more neurons, i.e., a higher degree of overcompleteness, in order to maintain the same reconstruction error as the other methods considered. More specifically, at the same sparsity level, the thresholding algorithm using the ℓ1 norm as a penalty requires a dictionary with ten times more units than the proposed approach, in which a non-convex continuous relaxation of the ℓ0 pseudo-norm is used, to reconstruct the external stimulus equally well. At a fixed sparsity level, both ℓ0- and ℓ1-based regularization develop units with receptive field (RF) shapes similar to biological neurons in V1 (and a subset of neurons in V2), but ℓ0-based regularization achieves approximately five times better reconstruction of the stimulus. Our results, in conjunction with recent metabolic findings, indicate that for V1 to operate efficiently it should follow a coding regime whose regularization is closer to the ℓ0 pseudo-norm than to the ℓ1 norm, and suggest a similar mode of operation for the sensory cortex in general.
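The thresholding functions being compared are simple pointwise maps, and the bias the abstract refers to is easy to see. A sketch (threshold value illustrative): soft thresholding, the proximal operator of the ℓ1 norm, shrinks every surviving coefficient toward zero, while hard thresholding, associated with the ℓ0 pseudo-norm, leaves survivors untouched.

```python
import numpy as np

def soft_threshold(u, lam):
    """Prox of lam*||.||_1: zeroes small entries AND shrinks the rest by lam."""
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

def hard_threshold(u, lam):
    """l0-style thresholding: zeroes small entries, keeps the rest unbiased."""
    return np.where(np.abs(u) > lam, u, 0.0)

u = np.array([-3.0, -0.5, 0.2, 1.5])
print(soft_threshold(u, 1.0))   # values [-2., -0., 0., 0.5]: survivors shrunk
print(hard_threshold(u, 1.0))   # values [-3.,  0., 0., 1.5]: survivors intact
```

This shrinkage bias is one reason an ℓ1-based code needs a larger dictionary to reach the same reconstruction error as ℓ0-like relaxations.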
Affiliation(s)
- Ilias Rentzeperis
- Université Paris-Saclay, CNRS, CentraleSupélec, Laboratoire des Signaux et Systèmes, Paris, France
- Luca Calatroni
- CNRS, UCA, INRIA, Laboratoire d’Informatique, Signaux et Systèmes de Sophia Antipolis, Sophia Antipolis, France
- Laurent U. Perrinet
- Aix Marseille Univ, CNRS, INT, Institut de Neurosciences de la Timone, Marseille, France
- Dario Prandi
- Université Paris-Saclay, CNRS, CentraleSupélec, Laboratoire des Signaux et Systèmes, Paris, France
4
Alreja A, Nemenman I, Rozell CJ. Constrained brain volume in an efficient coding model explains the fraction of excitatory and inhibitory neurons in sensory cortices. PLoS Comput Biol 2022; 18:e1009642. PMID: 35061666. PMCID: PMC8809590. DOI: 10.1371/journal.pcbi.1009642.
Abstract
The number of neurons in mammalian cortex varies by multiple orders of magnitude across different species. In contrast, the ratio of excitatory to inhibitory neurons (E:I ratio) varies in a much smaller range, from 3:1 to 9:1, and remains roughly constant for different sensory areas within a species. Despite this structure being important for understanding the function of neural circuits, the reason for this consistency is not yet understood. While recent models of vision based on the efficient coding hypothesis show that increasing the number of both excitatory and inhibitory cells improves stimulus representation, the two cannot increase simultaneously due to constraints on brain volume. In this work, we implement an efficient coding model of vision under a constraint on the volume (using number of neurons as a surrogate) while varying the E:I ratio. We show that the performance of the model is optimal at biologically observed E:I ratios under several metrics. We argue that this happens due to trade-offs between the computational accuracy and the representation capacity for natural stimuli. Further, we make experimentally testable predictions that (1) the optimal E:I ratio should be higher for species with a higher sparsity in the neural activity and (2) the character of inhibitory synaptic distributions and firing rates should change depending on E:I ratio. Our findings, which are supported by our new preliminary analyses of publicly available data, provide the first quantitative and testable hypothesis based on optimal coding models for the distribution of excitatory and inhibitory neural types in the mammalian sensory cortices. Neurons in the brain come in two main types: excitatory and inhibitory. The interplay between them shapes neural computation. Despite brain sizes varying by several orders of magnitude across species, the ratio of excitatory and inhibitory sub-populations (E:I ratio) remains relatively constant, and we don't know why.
Simulations of theoretical models of the brain can help answer such questions, especially when experiments are prohibitive or impossible. Here we place one such theoretical model of sensory coding ('sparse coding', which minimizes the number of simultaneously active neurons) under a biophysical 'volume' constraint that fixes the total number of neurons available. We vary the E:I ratio in the model (which cannot be done in experiments) and reveal an optimal E:I ratio at which the representation of the sensory stimulus and the energy consumption within the circuit are concurrently optimal. We also show that varying the population sparsity changes the optimal E:I ratio, spanning the relatively narrow range observed in biology. Crucially, this minimally parameterized theoretical model makes predictions about structure (recurrent connectivity) and activity (population sparsity) in neural circuits with different E:I ratios (i.e., different species), of which we verify the latter in a first-of-its-kind inter-species comparison using newly available public data.
Affiliation(s)
- Arish Alreja
- Neuroscience Institute, Center for the Neural Basis of Cognition and Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Ilya Nemenman
- Department of Physics, Department of Biology and Initiative in Theory and Modeling of Living Systems, Emory University, Atlanta, Georgia, United States of America
- Christopher J. Rozell
- School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, Georgia, United States of America
5
Barrett DG, Denève S, Machens CK. Optimal compensation for neuron loss. eLife 2016; 5:e12454. PMID: 27935480. PMCID: PMC5283835. DOI: 10.7554/elife.12454.
Abstract
The brain has an impressive ability to withstand neural damage. Diseases that kill neurons can go unnoticed for years, and incomplete brain lesions or silencing of neurons often fail to produce any behavioral effect. How does the brain compensate for such damage, and what are the limits of this compensation? We propose that neural circuits instantly compensate for neuron loss, thereby preserving their function as much as possible. We show that this compensation can explain changes in tuning curves induced by neuron silencing across a variety of systems, including the primary visual cortex. We find that compensatory mechanisms can be implemented through the dynamics of networks with a tight balance of excitation and inhibition, without requiring synaptic plasticity. The limits of this compensatory mechanism are reached when excitation and inhibition become unbalanced, thereby demarcating a recovery boundary, where signal representation fails and where diseases may become symptomatic.
Affiliation(s)
- David G. T. Barrett
- Laboratoire de Neurosciences Cognitives, École Normale Supérieure, Paris, France; Champalimaud Neuroscience Programme, Champalimaud Centre for the Unknown, Lisbon, Portugal
- Sophie Denève
- Laboratoire de Neurosciences Cognitives, École Normale Supérieure, Paris, France
- Christian K Machens
- Champalimaud Neuroscience Programme, Champalimaud Centre for the Unknown, Lisbon, Portugal
6
Zhu M, Rozell CJ. Modeling Inhibitory Interneurons in Efficient Sensory Coding Models. PLoS Comput Biol 2015; 11:e1004353. PMID: 26172289. PMCID: PMC4501572. DOI: 10.1371/journal.pcbi.1004353.
Abstract
There is still much unknown regarding the computational role of inhibitory cells in the sensory cortex. While modeling studies could potentially shed light on the critical role played by inhibition in cortical computation, there is a gap between the simplicity of many models of sensory coding and the biological complexity of the inhibitory subpopulation. In particular, many models do not respect that inhibition must be implemented in a separate subpopulation, with those inhibitory interneurons having a diversity of tuning properties and characteristic E/I cell ratios. In this study we demonstrate a computational framework for implementing inhibition in dynamical systems models that better respects these biophysical observations about inhibitory interneurons. The main approach leverages recent work on decomposing matrices into low-rank and sparse components via convex optimization, and explicitly exploits the fact that models and input statistics often have low-dimensional structure that enables efficient implementations. While this approach is applicable to a wide range of sensory coding models (including a family of models based on Bayesian inference in a linear generative model), for concreteness we demonstrate the approach on a network implementing sparse coding. We show that the resulting implementation stays faithful to the original coding goals while using inhibitory interneurons that are much more biophysically plausible. Cortical function is a result of coordinated interactions between excitatory and inhibitory neural populations. In previous theoretical models of sensory systems, inhibitory neurons are often ignored or modeled too simplistically to contribute to understanding their role in cortical computation. In biophysical reality, inhibition is implemented with interneurons that have different characteristics from the population of excitatory cells.
In this study, we propose a computational approach for including inhibition in theoretical models of neural coding in a way that respects several of these important characteristics, such as the relative number of inhibitory cells and the diversity of their response properties. The main idea is that the significant structure of the sensory world is reflected in highly structured models of sensory coding, which can then be exploited in implementing the model using modern computational techniques. We demonstrate this approach on one specific model of sensory coding (called “sparse coding”) that has been successful at modeling other aspects of sensory cortex.
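The low-dimensional structure being exploited can be seen in a toy calculation. In a sparse coding network with m-dimensional inputs and n > m coding units, the required lateral interaction matrix G = ΦᵀΦ has rank at most m, so it can be routed exactly through m interneuron-like units. The sketch below (dimensions illustrative, and using a plain eigendecomposition rather than the paper's convex low-rank-plus-sparse decomposition) checks this.

```python
import numpy as np

rng = np.random.default_rng(2)

# Sparse coding with m-pixel inputs and n >> m excitatory units calls for
# recurrent inhibition G = Phi^T Phi, an n x n matrix. Since rank(G) <= m,
# it factors exactly through m hidden units: G = U @ U.T with U of width m.
m, n = 16, 64
Phi = rng.standard_normal((m, n))
G = Phi.T @ Phi

# Eigendecomposition gives the factorization; keep the m nonzero modes
# (np.linalg.eigh returns eigenvalues in ascending order).
w, V = np.linalg.eigh(G)
U = V[:, -m:] * np.sqrt(np.maximum(w[-m:], 0.0))
print(np.allclose(G, U @ U.T))   # True: dense n x n inhibition becomes 2*n*m synapses
```

The paper's interneuron populations play roughly the role of the columns of U here, with additional constraints (sign, diversity, E/I ratio) that a raw eigendecomposition does not respect.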
Affiliation(s)
- Mengchen Zhu
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Christopher J. Rozell
- School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, Georgia, United States of America
7
Shapero S, Zhu M, Hasler J, Rozell C. Optimal Sparse Approximation with Integrate-and-Fire Neurons. Int J Neural Syst 2014; 24:1440001. DOI: 10.1142/s0129065714400012.
Abstract
Sparse approximation is a hypothesized coding strategy in which a population of sensory neurons (e.g., in V1) encodes a stimulus using as few active neurons as possible. We present the Spiking LCA (locally competitive algorithm), a rate-encoded spiking neural network (SNN) of integrate-and-fire neurons that calculates sparse approximations. The Spiking LCA is designed to be equivalent to the nonspiking LCA, an analog dynamical system that converges exponentially on ℓ1-norm sparse approximations. We show that the firing rate of the Spiking LCA converges on the same solution as the analog LCA, with an error inversely proportional to the sampling time. We simulate in NEURON a network of 128 neuron pairs encoding 8 × 8 pixel image patches, demonstrating that the network converges to nearly optimal encodings within 20 ms of biological time. We also show that when more biophysically realistic parameters are used in the neurons, the gain function encourages additional ℓ0-norm sparsity in the encoding, relative both to ideal neurons and to digital solvers.
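The analog LCA that the spiking network is built to match has simple dynamics. Below is a minimal sketch (dimensions, penalty λ, and step size are illustrative, not the paper's settings); at the fixed point, the output satisfies the ℓ1 optimality condition that every dictionary element's correlation with the residual is bounded by λ.

```python
import numpy as np

rng = np.random.default_rng(3)

# Analog LCA for  min_a 0.5*||s - Phi a||^2 + lam*||a||_1 :
#   du/dt = Phi^T s - u - (Phi^T Phi - I) a,   with  a = soft(u, lam)
m, n, lam, dt = 32, 64, 0.1, 0.05
Phi = rng.standard_normal((m, n))
Phi /= np.linalg.norm(Phi, axis=0)        # unit-norm dictionary elements
s = rng.standard_normal(m)                # input stimulus

def soft(u, lam):
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

G = Phi.T @ Phi - np.eye(n)               # lateral competition weights
b = Phi.T @ s                             # feedforward drive
u = np.zeros(n)                           # internal state
for _ in range(10000):
    a = soft(u, lam)
    u += dt * (b - u - G @ a)
a = soft(u, lam)

# KKT check for the l1 problem: |Phi_i^T (s - Phi a)| <= lam for every unit.
print(np.max(np.abs(Phi.T @ (s - Phi @ a))))
```

The spiking version replaces the analog state u with membrane potentials of integrate-and-fire neurons, so that time-averaged firing rates converge to the same a.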
Affiliation(s)
- Samuel Shapero
- Electronic Systems Laboratory, Georgia Tech Research Institute, 400 10th St NW, Atlanta, Georgia 30318, United States of America
- Mengchen Zhu
- Biomedical Engineering, Georgia Institute of Technology, 313 Ferst Drive, Atlanta, Georgia 30332, United States of America
- Jennifer Hasler
- Electrical and Computer Engineering, Georgia Institute of Technology, 777 Atlantic Dr NW, Atlanta, Georgia 30332, United States of America
- Christopher Rozell
- Electrical and Computer Engineering, Georgia Institute of Technology, 777 Atlantic Dr NW, Atlanta, Georgia 30332, United States of America
8
Hebbian-based mean shift for learning the diverse shapes of V1 simple cell receptive fields. Chinese Science Bulletin 2014. DOI: 10.1007/s11434-013-0041-4.
9
Spratling MW. Classification using sparse representations: a biologically plausible approach. Biol Cybern 2014; 108:61-73. PMID: 24306061. DOI: 10.1007/s00422-013-0579-x.
Abstract
Representing signals as linear combinations of basis vectors sparsely selected from an overcomplete dictionary has proven to be advantageous for many applications in pattern recognition, machine learning, signal processing, and computer vision. While this approach was originally inspired by insights into cortical information processing, biologically plausible approaches have been limited to exploring the functionality of early sensory processing in the brain, while more practical applications have employed non-biologically plausible sparse coding algorithms. Here, a biologically plausible algorithm is proposed that can be applied to practical problems. This algorithm is evaluated using standard benchmark tasks in the domain of pattern classification, and its performance is compared to a wide range of alternative algorithms that are widely used in signal and image processing. The results show that for the classification tasks performed here, the proposed method is competitive with the best of the alternative algorithms that have been evaluated. This demonstrates that classification using sparse representations can be performed in a neurally plausible manner, and hence, that this mechanism of classification might be exploited by the brain.
Affiliation(s)
- M W Spratling
- Department of Informatics, King's College London, Strand, London, WC2R 2LS, UK
10
Zhu M, Rozell CJ. Visual nonclassical receptive field effects emerge from sparse coding in a dynamical system. PLoS Comput Biol 2013; 9:e1003191. PMID: 24009491. PMCID: PMC3757072. DOI: 10.1371/journal.pcbi.1003191.
Abstract
Extensive electrophysiology studies have shown that many V1 simple cells have nonlinear response properties to stimuli within their classical receptive field (CRF) and receive contextual influence from stimuli outside the CRF modulating the cell's response. Models seeking to explain these non-classical receptive field (nCRF) effects in terms of circuit mechanisms, input-output descriptions, or individual visual tasks provide limited insight into the functional significance of these response properties, because they do not connect the full range of nCRF effects to optimal sensory coding strategies. The (population) sparse coding hypothesis conjectures an optimal sensory coding approach where a neural population uses as few active units as possible to represent a stimulus. We demonstrate that a wide variety of nCRF effects are emergent properties of a single sparse coding model implemented in a neurally plausible network structure (requiring no parameter tuning to produce different effects). Specifically, we replicate a wide variety of nCRF electrophysiology experiments (e.g., end-stopping, surround suppression, contrast invariance of orientation tuning, cross-orientation suppression, etc.) on a dynamical system implementing sparse coding, showing that this model produces individual units that reproduce the canonical nCRF effects. Furthermore, when the population diversity of an nCRF effect has also been reported in the literature, we show that this model produces many of the same population characteristics. These results show that the sparse coding hypothesis, when coupled with a biophysically plausible implementation, can provide a unified high-level functional interpretation to many response properties that have generally been viewed through distinct mechanistic or phenomenological models. 
Simple cells in the primary visual cortex (V1) demonstrate many response properties that are either nonlinear or involve response modulations (i.e., stimuli that do not cause a response in isolation alter the cell's response to other stimuli). These non-classical receptive field (nCRF) effects are generally modeled individually, and their collective role in biological vision is not well understood. Previous work has shown that the classical receptive field (CRF) properties of V1 cells (i.e., the spatial structure of the visual field responsive to stimuli) can be explained by the sparse coding hypothesis, an optimal coding model conjecturing that a neural population should use the fewest simultaneously active cells to represent each stimulus. In this paper, we have performed extensive simulated physiology experiments to show that many nCRF response properties are simply emergent effects of a dynamical system implementing this same sparse coding model. These results suggest that rather than representing disparate information processing operations themselves, these nCRF effects could be consequences of an optimal sensory coding strategy that attempts to represent each stimulus most efficiently. This interpretation provides a potentially unifying high-level functional interpretation for many response properties that have generally been viewed through distinct models.
Affiliation(s)
- Mengchen Zhu
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Christopher J. Rozell
- School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, Georgia, United States of America