1. Classification and Enumeration of Linearly Separable Boolean Functions Based on Optimal Separation System. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10781-1]

2. Schrimpf M, Blank IA, Tuckute G, Kauf C, Hosseini EA, Kanwisher N, Tenenbaum JB, Fedorenko E. The neural architecture of language: Integrative modeling converges on predictive processing. Proc Natl Acad Sci U S A 2021; 118:e2105646118. [PMID: 34737231] [PMCID: PMC8694052] [DOI: 10.1073/pnas.2105646118]
Abstract
The neuroscience of perception has recently been revolutionized with an integrative modeling approach in which computation, brain function, and behavior are linked across many datasets and many computational models. By revealing trends across models, this approach yields novel insights into cognitive and neural mechanisms in the target domain. We here present a systematic study taking this approach to higher-level cognition: human language processing, our species' signature cognitive skill. We find that the most powerful "transformer" models predict nearly 100% of explainable variance in neural responses to sentences and generalize across different datasets and imaging modalities (functional MRI and electrocorticography). Models' neural fits ("brain score") and fits to behavioral responses are both strongly correlated with model accuracy on the next-word prediction task (but not other language tasks). Model architecture appears to substantially contribute to neural fit. These results provide computationally explicit evidence that predictive processing fundamentally shapes the language comprehension mechanisms in the human brain.
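As a rough illustration of the mapping behind such a "brain score", the sketch below fits a ridge regression from model activations to recorded responses and reports the held-out Pearson correlation. The data are synthetic, and the single train/test split, ridge penalty, and scoring rule are assumptions; the published benchmarks involve additional steps (repeated cross-validation, noise-ceiling normalization) that are not reproduced here.
```python
# Minimal sketch of a cross-validated "brain score": predict neural responses
# from model activations with ridge regression, then correlate on held-out data.
# Synthetic data throughout; this is an illustrative analogy, not the paper's pipeline.
import numpy as np

rng = np.random.default_rng(0)
n_sentences, n_features, n_voxels = 200, 50, 30

X = rng.standard_normal((n_sentences, n_features))                   # model activations
W_true = rng.standard_normal((n_features, n_voxels))
Y = X @ W_true + 0.5 * rng.standard_normal((n_sentences, n_voxels))  # "neural" responses

def ridge_fit(X, Y, lam=1.0):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

# Single train/test split for brevity.
train, test = np.arange(150), np.arange(150, 200)
B = ridge_fit(X[train], Y[train])
Y_pred = X[test] @ B

def pearson_per_column(A, B):
    A = A - A.mean(0); B = B - B.mean(0)
    return (A * B).sum(0) / (np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0))

# Brain score: mean held-out correlation between predicted and measured responses.
brain_score = pearson_per_column(Y_pred, Y[test]).mean()
print(f"brain score (mean held-out r): {brain_score:.3f}")
```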
Affiliations
- Martin Schrimpf: Department of Brain and Cognitive Sciences; McGovern Institute for Brain Research; Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139
- Idan Asher Blank: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139; Department of Psychology, University of California, Los Angeles, CA 90095
- Greta Tuckute: Department of Brain and Cognitive Sciences; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
- Carina Kauf: Department of Brain and Cognitive Sciences; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
- Eghbal A. Hosseini: Department of Brain and Cognitive Sciences; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
- Nancy Kanwisher: Department of Brain and Cognitive Sciences; McGovern Institute for Brain Research; Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139
- Joshua B. Tenenbaum: Department of Brain and Cognitive Sciences; Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139
- Evelina Fedorenko: Department of Brain and Cognitive Sciences; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139

3. Rao Y, Zhang X. Characterization of Linearly Separable Boolean Functions: A Graph-Theoretic Perspective. IEEE Trans Neural Netw Learn Syst 2017; 28:1542-1549. [PMID: 27076471] [DOI: 10.1109/tnnls.2016.2542205]
Abstract
In this paper, we present a novel approach for studying Boolean functions from a graph-theoretic perspective. In particular, we first transform a Boolean function f of n variables into an induced subgraph Hf of the n-dimensional hypercube, and then we derive properties of linearly separable Boolean functions from the structure of Hf. We define a new class of graphs, called hyperstars, and prove that the induced subgraph Hf of any linearly separable Boolean function f is a hyperstar. The notion of a hyperstar allows us to uncover a number of fundamental properties of linearly separable Boolean functions.
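To make the construction concrete, the following sketch builds the induced subgraph Hf for a small example: its vertices are the inputs on which f is true, and its edges join vertices at Hamming distance 1. The 3-input threshold function used here is an arbitrary illustrative choice.
```python
# Sketch: build the induced subgraph H_f of the n-cube for a Boolean function f.
# Vertices of H_f are the true points of f; edges join points at Hamming distance 1.
from itertools import product

def induced_subgraph(f, n):
    vertices = [x for x in product((0, 1), repeat=n) if f(x)]
    edges = [(u, v) for i, u in enumerate(vertices) for v in vertices[i + 1:]
             if sum(a != b for a, b in zip(u, v)) == 1]
    return vertices, edges

# Example: a linearly separable function (3-input majority, i.e. threshold 2).
f = lambda x: sum(x) >= 2
V, E = induced_subgraph(f, 3)
print("vertices:", V)
print("edges:", E)
```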

4.
Abstract
Vector symbolic architectures (VSAs) are high-dimensional vector representations of objects (e.g., words, image parts), relations (e.g., sentence structures), and sequences for use with machine learning algorithms. They consist of a vector addition operator for representing a collection of unordered objects, a binding operator for associating groups of objects, and a methodology for encoding complex structures. We first develop constraints that machine learning imposes on VSAs; for example, similar structures must be represented by similar vectors. The constraints suggest that current VSAs should represent phrases ("The smart Brazilian girl") by binding sums of terms, in addition to simply binding the terms directly. We show that matrix multiplication can be used as the binding operator for a VSA, and that matrix elements can be chosen at random. A consequence for living systems is that binding is mathematically possible without the need to specify, in advance, precise neuron-to-neuron connection properties for large numbers of synapses. A VSA that incorporates these ideas, Matrix Binding of Additive Terms (MBAT), is described that satisfies all constraints. With respect to machine learning, for some types of problems appropriate VSA representations permit us to prove learnability rather than relying on simulations. We also propose dividing machine (and neural) learning and representation into three stages, with differing roles for learning in each stage. For neural modeling, we give representational reasons for nervous systems to have many recurrent connections, as well as for the importance of phrases in language processing. Sizing simulations and analyses suggest that VSAs in general, and MBAT in particular, are ready for real-world applications.
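A minimal numerical sketch of the binding idea: terms are random high-dimensional vectors, a phrase is encoded by applying a fixed random matrix to the sum of its term vectors, and similar phrases remain similar after binding. The dimensionality, Gaussian random entries, and tiny vocabulary are assumptions for illustration only.
```python
# Sketch of matrix binding of additive terms (MBAT-style):
# phrase vector = fixed random binding matrix applied to the sum of term vectors.
import numpy as np

rng = np.random.default_rng(1)
d = 1000                                        # assumed vector dimensionality
vocab = {w: rng.standard_normal(d) for w in
         ["the", "smart", "brazilian", "girl", "dog"]}
M = rng.standard_normal((d, d)) / np.sqrt(d)    # random binding matrix

def encode(words):
    return M @ sum(vocab[w] for w in words)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

p1 = encode(["the", "smart", "brazilian", "girl"])
p2 = encode(["the", "smart", "brazilian", "dog"])   # one word changed
p3 = encode(["the", "dog"])

print("similar phrases:  ", round(cosine(p1, p2), 3))
print("different phrases:", round(cosine(p1, p3), 3))
```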

5. Klampfl S, Maass W. A theoretical basis for emergent pattern discrimination in neural systems through slow feature extraction. Neural Comput 2010; 22:2979-3035. [PMID: 20858129] [DOI: 10.1162/neco_a_00050]
Abstract
Neurons in the brain are able to detect and discriminate salient spatiotemporal patterns in the firing activity of presynaptic neurons. How they can learn to achieve this, especially without the help of a supervisor, remains an open question. We show that a well-known unsupervised learning algorithm for linear neurons, slow feature analysis (SFA), is able to acquire the discrimination capability of one of the best algorithms for supervised linear discrimination learning, the Fisher linear discriminant (FLD), given suitable input statistics. We demonstrate the power of this principle by showing that it enables readout neurons from simulated cortical microcircuits to learn without any supervision to discriminate between spoken digits and to detect repeated firing patterns that are embedded into a stream of noise spike trains with the same firing statistics. Both these computer simulations and our theoretical analysis show that slow feature extraction enables neurons to extract and collect information that is spread out over a trajectory of firing states that lasts several hundred ms. In addition, it enables neurons to learn without supervision to keep track of time (relative to a stimulus onset, or the initiation of a motor response). Hence, these results elucidate how the brain could compute with trajectories of firing states rather than only with fixed point attractors. They also provide a theoretical basis for understanding recent experimental results on the emergence of view- and position-invariant classification of visual objects in inferior temporal cortex.
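A compact sketch of the linear SFA principle invoked above: after whitening, the slowest feature is the direction in which successive-time differences have minimal variance (the eigenvector of the difference covariance with the smallest eigenvalue). The toy signal below is an assumption; the paper applies the principle to trajectories of network firing states.
```python
# Minimal linear slow feature analysis (SFA): whiten the signal, then take the
# direction in which the time-derivative has minimal variance.
import numpy as np

rng = np.random.default_rng(2)
T = 2000
t = np.linspace(0, 4 * np.pi, T)
slow = np.sin(t)                              # slowly varying latent signal
fast = rng.standard_normal(T)                 # fast noise
mix = np.array([[1.0, 0.5], [0.3, 1.0]])
X = np.column_stack([slow, fast]) @ mix.T     # observed mixture, shape (T, 2)

# Whitening
X = X - X.mean(0)
cov = X.T @ X / T
evals, evecs = np.linalg.eigh(cov)
W_whiten = evecs @ np.diag(1.0 / np.sqrt(evals)) @ evecs.T
Z = X @ W_whiten

# Slowness: minimize the variance of temporal differences in the whitened space.
dZ = np.diff(Z, axis=0)
dcov = dZ.T @ dZ / (T - 1)
dvals, dvecs = np.linalg.eigh(dcov)
w_slow = dvecs[:, 0]                          # eigenvector with the smallest eigenvalue
y = Z @ w_slow                                # extracted slow feature

print("correlation with true slow signal:", round(abs(np.corrcoef(y, slow)[0, 1]), 3))
```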
Affiliations
- Stefan Klampfl: Institute for Theoretical Computer Science, Graz University of Technology, A-8010 Graz, Austria

6. Intelligent Noise Removal from EMG Signal Using Focused Time-Lagged Recurrent Neural Network. Appl Comput Intell Soft Comput 2009. [DOI: 10.1155/2009/129761]
Abstract
Electromyography (EMG) signals are used in clinical/biomedical applications and in modern human-computer interaction. EMG signals pick up noise from several sources, including passage through tissue, inherent noise in electronic equipment, and ambient interference. An artificial neural network (ANN) approach is studied for reducing noise in EMG signals. In this paper, it is shown that a Focused Time-Lagged Recurrent Neural Network (FTLRNN) can effectively remove noise from the EMG signal. After extensive computer simulations, the authors developed an optimal FTLRNN model that removes the noise from the EMG signal. Results show that the proposed optimal FTLRNN model achieves a mean square error (MSE) as low as 0.000067 and 0.000048 and a correlation coefficient as high as 0.99950 and 0.99939 for the noise signal and the EMG signal, respectively, when validated on the test dataset. The output of the estimated FTLRNN model closely follows the real signal. The network is robust: it tolerates noise variances from 0.1 to 0.4 for uniform noise and up to 0.30 for Gaussian noise. Training is independent of the specific partitioning of the dataset, and the proposed FTLRNN model clearly outperforms the best multilayer perceptron (MLP) and radial basis function (RBF) models. A simple single-hidden-layer FTLRNN can thus be employed to remove noise from EMG signals.

7. Itskov V, Abbott LF. Pattern capacity of a perceptron for sparse discrimination. Phys Rev Lett 2008; 101:018101. [PMID: 18764154] [DOI: 10.1103/physrevlett.101.018101]
Abstract
We evaluate the capacity and performance of a perceptron discriminator operating in a highly sparse regime where classic perceptron results do not apply. The perceptron is constructed to respond to a specified set of q stimuli, with only statistical information provided about other stimuli to which it is not supposed to respond. We compute the probability of both false-positive and false-negative errors and determine the capacity of the system for not responding to nonselected stimuli and for responding to selected stimuli in the presence of noise. If q is a sublinear function of N, the number of inputs to the perceptron, these capacities are exponential in N/q.
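The setting can be simulated directly. In the sketch below, the perceptron's weight vector is simply the sum of the q selected sparse patterns, a threshold is placed between the typical selected and background responses, and false-positive and false-negative rates are estimated by Monte Carlo. This weight construction and threshold rule are illustrative assumptions, not the paper's analytical treatment.
```python
# Monte Carlo sketch of a sparse-discrimination perceptron: respond to q stored
# sparse patterns, stay silent for random sparse patterns. The weight construction
# (sum of stored patterns) and the threshold choice are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)
N, q, f = 1000, 20, 0.05          # inputs, selected stimuli, coding sparsity

def sparse_pattern():
    return (rng.random(N) < f).astype(float)

selected = np.array([sparse_pattern() for _ in range(q)])
w = selected.sum(axis=0)          # simple Hebbian-style weight vector

background = np.array([sparse_pattern() for _ in range(5000)])
sel_resp = selected @ w
bg_resp = background @ w
theta = 0.5 * (sel_resp.mean() + bg_resp.mean())   # midpoint threshold

false_negative = np.mean(sel_resp <= theta)
false_positive = np.mean(bg_resp > theta)
print(f"false-negative rate: {false_negative:.3f}")
print(f"false-positive rate: {false_positive:.4f}")
```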
Affiliations
- Vladimir Itskov: Department of Neuroscience, Department of Physiology and Cellular Biophysics, Columbia University Medical Center, New York, New York 10032-2695, USA

8. Ojha PC. Enumeration of linear threshold functions from the lattice of hyperplane intersections. IEEE Trans Neural Netw 2000; 11:839-50. [PMID: 18249812] [DOI: 10.1109/72.857765]
Abstract
We present a method for enumerating linear threshold functions of n-dimensional binary inputs. Our starting point is the geometric lattice L_n of hyperplane intersections in the dual (weight) space. We show how the hyperoctahedral group O(n+1), the symmetry group of the (n+1)-dimensional hypercube, can be used to construct a symmetry-adapted poset of hyperplane intersections Δ_n, which is much more compact and tractable than L_n. A generalized zeta function and its inverse, the generalized Möbius function, are defined on Δ_n. Symmetry-adapted posets of hyperplane intersections for three-, four-, and five-dimensional inputs are constructed, and the number of linear threshold functions is computed from the generalized Möbius function. Finally, we show how equivalence classes of linear threshold functions are enumerated by unfolding the symmetry-adapted poset of hyperplane intersections into a symmetry-adapted face poset. It is hoped that our construction will lead to ways of placing asymptotic bounds on the number of equivalence classes of linear threshold functions.
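For very small n the counts can be verified by brute force instead of the lattice construction described above: enumerate all 2^(2^n) Boolean functions and test each for linear separability with a small linear-programming feasibility check. The sketch below (an exhaustive check, not the paper's Möbius-function method) reproduces the known counts 4, 14, and 104 for n = 1, 2, 3.
```python
# Brute-force count of linear threshold functions for small n, using an LP
# feasibility test for linear separability. Known counts: n=1 -> 4, n=2 -> 14, n=3 -> 104.
from itertools import product
import numpy as np
from scipy.optimize import linprog

def is_threshold_function(truth_table, points):
    # Feasible iff some (w, b) gives w.x + b >= 1 on true points and <= -1 on false points.
    A_ub, b_ub = [], []
    for x, y in zip(points, truth_table):
        row = list(x) + [1.0]
        if y:
            A_ub.append([-v for v in row]); b_ub.append(-1.0)
        else:
            A_ub.append(row); b_ub.append(-1.0)
    n_vars = len(points[0]) + 1
    res = linprog(c=np.zeros(n_vars), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(None, None)] * n_vars, method="highs")
    return res.status == 0

for n in (1, 2, 3):
    points = list(product((0, 1), repeat=n))
    count = sum(is_threshold_function(tt, points)
                for tt in product((0, 1), repeat=len(points)))
    print(f"n = {n}: {count} linear threshold functions")
```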
Affiliations
- P. C. Ojha: School of Information and Software Engineering, University of Ulster at Jordanstown, Newtownabbey BT37 0QB, UK

9. Berman P, Dasgupta B, Sontag E. Algorithmic Issues in Reverse Engineering of Protein and Gene Networks via the Modular Response Analysis Method. Ann N Y Acad Sci 2007; 1115:132-41. [DOI: 10.1196/annals.1407.001]

10.

11. Senn W, Fusi S. Learning Only When Necessary: Better Memories of Correlated Patterns in Networks with Bounded Synapses. Neural Comput 2005; 17:2106-38. [PMID: 16105220] [DOI: 10.1162/0899766054615644]
Abstract
Learning in a neuronal network is often thought of as a linear superposition of synaptic modifications induced by individual stimuli. However, since biological synapses are naturally bounded, a linear superposition would cause fast forgetting of previously acquired memories. Here we show that this forgetting can be avoided by introducing additional constraints on the synaptic and neural dynamics. We consider Hebbian plasticity of excitatory synapses. A synapse is modified only if the postsynaptic response does not match the desired output. With this learning rule, the original memory performances with unbounded weights are regained, provided that (1) there is some global inhibition, (2) the learning rate is small, and (3) the neurons can discriminate small differences in the total synaptic input (e.g., by making the neuronal threshold small compared to the total postsynaptic input). We prove in the form of a generalized perceptron convergence theorem that under these constraints, a neuron learns to classify any linearly separable set of patterns, including a wide class of highly correlated random patterns. During the learning process, excitation becomes roughly balanced by inhibition, and the neuron classifies the patterns on the basis of small differences around this balance. The fact that synapses saturate has the additional benefit that nonlinearly separable patterns, such as similar patterns with contradicting outputs, eventually generate a subthreshold response, and therefore silence neurons that cannot provide any information.
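A rough sketch of the learning scheme: excitatory weights are clipped to a bounded range, a synapse is modified only when the response does not match the desired output, and a fixed global inhibition is subtracted from the total drive. All parameter values (bounds, learning rate, inhibition level) are illustrative assumptions rather than the paper's settings.
```python
# Sketch of error-gated Hebbian learning with bounded excitatory synapses and a
# fixed global inhibition (all parameter values are illustrative assumptions).
import numpy as np

rng = np.random.default_rng(4)
N, P = 200, 40
lr, w_max, theta = 0.01, 1.0, 0.0
patterns = (rng.random((P, N)) < 0.2).astype(float)   # sparse binary input patterns
labels = rng.integers(0, 2, size=P).astype(float)     # desired binary outputs

w = np.full(N, 0.5)                                    # bounded excitatory weights
inhibition = 0.2 * patterns.sum(axis=1).mean()         # simple global inhibition

for epoch in range(500):
    for x, y in zip(patterns, labels):
        response = 1.0 if w @ x - inhibition > theta else 0.0
        if response != y:                              # modify only on mismatch
            w = np.clip(w + lr * (y - response) * x, 0.0, w_max)

accuracy = ((patterns @ w - inhibition > theta).astype(float) == labels).mean()
print(f"training accuracy after learning: {accuracy:.2f}")
```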
Affiliations
- Walter Senn: Department of Physiology, University of Bern, CH-30 Bern, Switzerland

12. Huerta R, Nowotny T, García-Sanchez M, Abarbanel HDI, Rabinovich MI. Learning Classification in the Olfactory System of Insects. Neural Comput 2004; 16:1601-40. [PMID: 15228747] [DOI: 10.1162/089976604774201613]
Abstract
We propose a theoretical framework for odor classification in the olfactory system of insects. The classification task is accomplished in two steps. The first is a transformation from the antennal lobe to the intrinsic Kenyon cells in the mushroom body. This transformation into a higher-dimensional space is an injective function and can be implemented without any type of learning at the synaptic connections. In the second step, the encoded odors in the intrinsic Kenyon cells are linearly classified in the mushroom body lobes. The neurons that perform this linear classification are equivalent to hyperplanes whose connections are tuned by local Hebbian learning and by competition due to mutual inhibition. We calculate the range of values of activity and size of the network required to achieve efficient classification within this scheme in insect olfaction. We are able to demonstrate that biologically plausible control mechanisms can accomplish efficient classification of odors.
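The two-step scheme lends itself to a compact simulation: a fixed random projection expands antennal-lobe activity into a much larger, sparsely active Kenyon-cell layer, and a linear readout is then trained on the expanded code. Layer sizes, sparseness levels, and the plain perceptron update used in place of Hebbian learning with mutual inhibition are assumptions for illustration.
```python
# Sketch of the two-stage insect-olfaction classifier: fixed random expansion
# (antennal lobe -> Kenyon cells), then a trained linear readout. Layer sizes,
# sparseness, and the perceptron-style update are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(5)
n_al, n_kc, n_odors = 50, 2000, 60

odors = (rng.random((n_odors, n_al)) < 0.3).astype(float)     # antennal-lobe codes
labels = rng.integers(0, 2, size=n_odors)                      # two odor classes

# Step 1: fixed random connectivity and a high threshold give a sparse KC code.
C = (rng.random((n_kc, n_al)) < 0.1).astype(float)
drive = odors @ C.T
kc_threshold = np.quantile(drive, 0.9)                         # roughly 10% of KC responses active
kc = (drive > kc_threshold).astype(float)

# Step 2: train a linear readout on the expanded code (perceptron updates).
w, b = np.zeros(n_kc), 0.0
for _ in range(50):
    for x, y in zip(kc, labels):
        pred = 1 if x @ w + b > 0 else 0
        w += 0.1 * (y - pred) * x
        b += 0.1 * (y - pred)

accuracy = ((kc @ w + b > 0).astype(int) == labels).mean()
print(f"training accuracy on expanded code: {accuracy:.2f}")
```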
Affiliations
- Ramón Huerta: Institute for Nonlinear Science, University of California San Diego, La Jolla, CA 92093-0402, USA

13. Síma J, Orponen P. General-Purpose Computation with Neural Networks: A Survey of Complexity Theoretic Results. Neural Comput 2003; 15:2727-78. [PMID: 14629867] [DOI: 10.1162/089976603322518731]
Abstract
We survey and summarize the literature on the computational aspects of neural network models by presenting a detailed taxonomy of the various models according to their complexity theoretic characteristics. The criteria of classification include the architecture of the network (feedforward versus recurrent), time model (discrete versus continuous), state type (binary versus analog), weight constraints (symmetric versus asymmetric), network size (finite nets versus infinite families), and computation type (deterministic versus probabilistic), among others. The underlying results concerning the computational power and complexity issues of perceptron, radial basis function, winner-take-all, and spiking neural networks are briefly surveyed, with pointers to the relevant literature. In our survey, we focus mainly on the digital computation whose inputs and outputs are binary in nature, although their values are quite often encoded as analog neuron states. We omit the important learning issues.
Affiliations
- Jirí Síma: Institute of Computer Science, Academy of Sciences of the Czech Republic, P.O. Box 5, Prague 8, Czech Republic

14.
Abstract
We describe and discuss the properties of a binary neural network that can serve as a dynamic neural filter (DNF), which maps regions of input space into spatiotemporal sequences of neuronal activity. Both deterministic and stochastic dynamics are studied, allowing the investigation of the stability of spatiotemporal sequences under noisy conditions. We define a measure of the coding capacity of a DNF and develop an algorithm for constructing a DNF that can serve as a source of given codes. On the basis of this algorithm, we suggest using a minimal DNF capable of generating observed sequences as a measure of complexity of spatiotemporal data. This measure is applied to experimental observations in the locust olfactory system, whose reverberating local field potential provides a natural temporal scale allowing the use of a binary DNF. For random synaptic matrices, a DNF can generate very large cycles, thus becoming an efficient tool for producing spatiotemporal codes. The latter can be stabilized by applying to the parameters of the DNF a learning algorithm with suitable margins.
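The basic object can be sketched in a few lines: a binary recurrent network whose state is updated by thresholding the recurrent drive plus a constant input, so that each input vector selects its own spatiotemporal sequence of activity states. Network size and the random Gaussian synaptic matrix are illustrative assumptions; the coding-capacity measure and the construction algorithm discussed above are not reproduced here.
```python
# Sketch of a dynamic-neural-filter-style binary network: a fixed input biases a
# recurrent threshold network, which then traces out a spatiotemporal sequence.
import numpy as np

rng = np.random.default_rng(6)
n = 8
W = rng.standard_normal((n, n))          # random recurrent synaptic matrix
theta = np.zeros(n)                      # unit thresholds

def run(input_vec, steps=10):
    s = np.zeros(n)                      # initial state
    states = []
    for _ in range(steps):
        s = (W @ s + input_vec - theta > 0).astype(float)
        states.append(s.copy())
    return np.array(states)

seq_a = run(rng.standard_normal(n))
seq_b = run(rng.standard_normal(n))

print("sequence for input A:\n", seq_a.astype(int))
print("sequences for A and B differ:", not np.array_equal(seq_a, seq_b))
```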
Affiliations
- Brigitte Quenet: Laboratoire d'Electronique, Ecole Superieure de Physique et Chimie Industrielles, Paris 75005, France

15. Caticha N, Tejada JEP, Lancet D, Domany E. Computational capacity of an odorant discriminator: the linear separability of curves. Neural Comput 2002; 14:2201-20. [PMID: 12184848] [DOI: 10.1162/089976602320264051]
Abstract
We introduce and study an artificial neural network inspired by the probabilistic receptor affinity distribution model of olfaction. Our system consists of N sensory neurons whose outputs converge on a single processing linear threshold element. The system's aim is to model discrimination of a single target odorant from a large number p of background odorants within a range of odorant concentrations. We show that this is possible provided p does not exceed a critical value p_c, and we calculate the critical capacity α_c = p_c/N. The critical capacity depends on the range of concentrations in which the discrimination is to be accomplished. If the olfactory bulb may be thought of as a collection of such processing elements, each responsible for the discrimination of a single odorant, our study provides a quantitative analysis of the potential computational properties of the olfactory bulb. The mathematical formulation of the problem we consider is one of determining the capacity for linear separability of continuous curves, embedded in a large-dimensional space. This is accomplished here by a numerical study, using a method that signals whether the discrimination task is realizable, together with a finite-size scaling analysis.
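The kind of realizability test referred to at the end of the abstract can be illustrated with a small feasibility check: sample each odorant's response curve at several concentrations and ask a linear program whether a single threshold element can separate the target samples from all background samples. The saturating receptor-response model and all sizes below are assumptions made purely for illustration.
```python
# Sketch: test whether a single linear threshold element can separate a target
# odorant's concentration curve from background odorants' curves, via an LP
# feasibility check. The receptor response model is an illustrative assumption.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(7)
N, p, n_conc = 100, 20, 8                 # receptors, background odorants, samples per curve
concentrations = np.logspace(-2, 1, n_conc)

def response_curve(affinities):
    # Saturating receptor responses across the concentration range (assumed model).
    return np.array([affinities * c / (1.0 + affinities * c) for c in concentrations])

target = response_curve(rng.random(N))
background = [response_curve(rng.random(N)) for _ in range(p)]

# LP feasibility: w.x + b >= 1 on target samples, <= -1 on background samples.
A_ub, b_ub = [], []
for x in target:
    A_ub.append(np.append(-x, -1.0)); b_ub.append(-1.0)
for curve in background:
    for x in curve:
        A_ub.append(np.append(x, 1.0)); b_ub.append(-1.0)

res = linprog(c=np.zeros(N + 1), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * (N + 1), method="highs")
print("discrimination realizable:", res.status == 0)
```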
Affiliations
- N. Caticha: Instituto de Física, Universidade de São Paulo, CEP 05315-970, São Paulo, SP, Brazil

16.
Abstract
Local receptive field neurons comprise such well-known and widely used unit types as radial basis function (RBF) neurons and neurons with center-surround receptive field. We study the Vapnik-Chervonenkis (VC) dimension of feedforward neural networks with one hidden layer of these units. For several variants of local receptive field neurons, we show that the VC dimension of these networks is superlinear. In particular, we establish the bound Ω(W log k) for any reasonably sized network with W parameters and k hidden nodes. This bound is shown to hold for discrete center-surround receptive field neurons, which are physiologically relevant models of cells in the mammalian visual system, for neurons computing a difference of Gaussians, which are popular in computational vision, and for standard RBF neurons, a major alternative to sigmoidal neurons in artificial neural networks. The result for RBF neural networks is of particular interest since it answers a question that has been open for several years. The results also give rise to lower bounds for networks with fixed input dimension. Regarding constants, all bounds are larger than those known thus far for similar architectures with sigmoidal neurons. The superlinear lower bounds contrast with linear upper bounds for single local receptive field neurons also derived here.
Affiliations
- Michael Schmitt: Lehrstuhl Mathematik und Informatik, Fakultät für Mathematik, Ruhr-Universität Bochum, D-44780 Bochum, Germany

17.
Abstract
In a great variety of neuron models, neural inputs are combined using the summing operation. We introduce the concept of multiplicative neural networks that contain units that multiply their inputs instead of summing them and thus allow inputs to interact nonlinearly. The class of multiplicative neural networks comprises such widely known and well-studied network types as higher-order networks and product unit networks. We investigate the complexity of computing and learning for multiplicative neural networks. In particular, we derive upper and lower bounds on the Vapnik-Chervonenkis (VC) dimension and the pseudo-dimension for various types of networks with multiplicative units. As the most general case, we consider feedforward networks consisting of product and sigmoidal units, showing that their pseudo-dimension is bounded from above by a polynomial with the same order of magnitude as the currently best-known bound for purely sigmoidal networks. Moreover, we show that this bound holds even when the unit type, product or sigmoidal, may be learned. Crucial for these results are calculations of solution set components bounds for new network classes. As to lower bounds, we construct product unit networks of fixed depth with super-linear VC dimension. For sigmoidal networks of higher order, we establish polynomial bounds that, in contrast to previous results, do not involve any restriction of the network order. We further consider various classes of higher-order units, also known as sigma-pi units, that are characterized by connectivity constraints. In terms of these, we derive some asymptotically tight bounds. Multiplication plays an important role in both neural modeling of biological behavior and computing and learning with artificial neural networks. We briefly survey research in biology and in applications where multiplication is considered an essential computational element. The results we present here provide new tools for assessing the impact of multiplication on the computational power and the learning capabilities of neural networks.
Affiliations
- Michael Schmitt: Lehrstuhl Mathematik und Informatik, Fakultät für Mathematik, Ruhr-Universität Bochum, D-44780 Bochum, Germany

18. Poirazi P, Mel BW. Choice and value flexibility jointly contribute to the capacity of a subsampled quadratic classifier. Neural Comput 2000; 12:1189-205. [PMID: 10905813] [DOI: 10.1162/089976600300015556]
Abstract
Biophysical modeling studies have suggested that neurons with active dendrites can be viewed as linear units augmented by product terms that arise from interactions between synaptic inputs within the same dendritic subregions. However, the degree to which local nonlinear synaptic interactions could augment the memory capacity of a neuron is not known in a quantitative sense. To approach this question, we have studied the family of subsampled quadratic (SQ) classifiers: linear classifiers augmented by the best k terms from the set of K = (d^2 + d)/2 second-order product terms available in d dimensions. We developed an expression for the total parameter entropy, whose form shows that the capacity of an SQ classifier does not reside solely in its conventional weight values, that is, the explicit memory used to store constant, linear, and higher-order coefficients. Rather, we identify a second type of parameter flexibility that jointly contributes to an SQ classifier's capacity: the choice as to which product terms are included in the model and which are not. We validate the form of the entropy expression using empirical studies of relative capacity within families of geometrically isomorphic SQ classifiers. Our results have direct implications for neurobiological (and other hardware) learning systems, where in the limit of high-dimensional input spaces and low-resolution synaptic weight values, this relatively little explored form of choice flexibility could constitute a major source of trainable model capacity.
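As a small illustration of the model class, the sketch below augments a linear classifier with a randomly chosen subset of k of the K = (d^2 + d)/2 product terms and fits the coefficients by least squares; which terms are included is exactly the "choice" flexibility discussed above. The data, target function, and fitting procedure are illustrative assumptions.
```python
# Sketch of a subsampled quadratic (SQ) classifier: a linear classifier augmented
# with a chosen subset of k of the K = (d^2 + d)/2 second-order product terms.
# Here the subset is chosen at random and the coefficients are fit by least squares.
import numpy as np
from itertools import combinations_with_replacement

rng = np.random.default_rng(8)
d, n_samples, k = 10, 200, 15

pairs = list(combinations_with_replacement(range(d), 2))   # all K = (d^2 + d)/2 products
chosen = rng.choice(len(pairs), size=k, replace=False)     # the "choice" part of the capacity

def features(X):
    quad = np.column_stack([X[:, i] * X[:, j] for i, j in (pairs[c] for c in chosen)])
    return np.column_stack([np.ones(len(X)), X, quad])     # constant + linear + chosen quadratic

X = rng.standard_normal((n_samples, d))
y = np.sign(X[:, 0] * X[:, 1] + 0.5 * X[:, 2])             # a target containing a product term

Phi = features(X)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)                # least-squares fit of all coefficients
accuracy = (np.sign(Phi @ w) == y).mean()
print(f"training accuracy with {k} of {len(pairs)} product terms: {accuracy:.2f}")
```
Rerunning with different random subsets shows that performance depends strongly on whether the informative product term happens to be among the k chosen terms, which is the flexibility the entropy argument quantifies.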
Affiliations
- P. Poirazi: Department of Biomedical Engineering, University of Southern California, Los Angeles 90089, USA

19.
Abstract
Compartmental simulations of an anatomically characterized cortical pyramidal cell were carried out to study the integrative behavior of a complex dendritic tree. Previous theoretical (Feldman and Ballard 1982; Durbin and Rumelhart 1989; Mel 1990; Mel and Koch 1990; Poggio and Girosi 1990) and compartmental modeling (Koch et al. 1983; Shepherd et al. 1985; Koch and Poggio 1987; Rall and Segev 1987; Shepherd and Brayton 1987; Shepherd et al. 1989; Brown et al. 1991) work had suggested that multiplicative interactions among groups of neighboring synapses could greatly enhance the processing power of a neuron relative to a unit with only a single global firing threshold. This issue was investigated here, with a particular focus on the role of voltage-dependent N-methyl-D-aspartate (NMDA) channels in the generation of cell responses. First, it was found that when a large proportion of the excitatory synaptic input to dendritic spines is carried by NMDA channels, the pyramidal cell responds preferentially to spatially clustered, rather than random, distributions of activated synapses. Second, based on this mechanism, the NMDA-rich neuron is shown to be capable of solving a nonlinear pattern discrimination task. We propose that manipulation of the spatial ordering of afferent synaptic connections onto the dendritic arbor is a possible biological strategy for pattern information storage during learning.
Affiliations
- Bartlett W. Mel: Computation and Neural Systems Program, Division of Biology, 216-76, California Institute of Technology, Pasadena, CA 91125, USA

20.
Affiliations
- J. W. Clark: McDonnell Center for the Space Sciences, Washington University, St Louis, Missouri 63130

21. Rendell L, Seshu R. Learning hard concepts through constructive induction: framework and rationale. Comput Intell 1990. [DOI: 10.1111/j.1467-8640.1990.tb00298.x]

22. Durbin R, Rumelhart DE. Product Units: A Computationally Powerful and Biologically Plausible Extension to Backpropagation Networks. Neural Comput 1989. [DOI: 10.1162/neco.1989.1.1.133]
Abstract
We introduce a new form of computational unit for feedforward learning networks of the backpropagation type. Instead of calculating a weighted sum this unit calculates a weighted product, where each input is raised to a power determined by a variable weight. Such a unit can learn an arbitrary polynomial term, which would then feed into higher level standard summing units. We show how learning operates with product units, provide examples to show their efficiency for various types of problems, and argue that they naturally extend the family of theoretical feedforward net structures. There is a plausible neurobiological interpretation for one interesting configuration of product and summing units.
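In the terms used above, a product unit computes the weighted product y = prod_i x_i ** w_i with learnable exponents w_i, which for positive inputs can be evaluated as exp(sum_i w_i * log x_i); its output can then feed a standard summing unit. A minimal sketch (positive inputs assumed):
```python
# A product unit computes a weighted product: y = prod_i x_i ** w_i, evaluated
# here as exp(sum_i w_i * log(x_i)). Positive inputs are assumed in this sketch.
import numpy as np

def product_unit(x, w):
    return np.exp(np.sum(w * np.log(x)))

x = np.array([2.0, 3.0, 0.5])
w = np.array([1.0, 2.0, -1.0])         # learnable exponents
print(product_unit(x, w))              # 2 * 3**2 / 0.5 = 36.0
```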
Affiliations
- Richard Durbin: Department of Psychology, Stanford University, Stanford, CA 94305, USA

23.

24. Models of Certain Nonlinear Systems. Neural Netw 1968. [DOI: 10.1007/978-3-642-87596-0_10]