1. Gao T, Pan Q, Zhou J, Wang H, Tao L, Kwan HK. A Novel Attention-Guided Generative Adversarial Network for Whisper-to-Normal Speech Conversion. Cognit Comput 2023. [DOI: 10.1007/s12559-023-10108-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
2. Peng Y, Liu X, Wang C, Xiao T, Li T. Fusing Attention Features and Contextual Information for Scene Recognition. INT J PATTERN RECOGN 2022. [DOI: 10.1142/s0218001422500148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Aiming to obtain more discriminative features in scene images and overcome the impacts of intra-class differences and inter-class similarities, the paper proposes a scene recognition method that combines attention and context information. First, we introduce the attention mechanism and build a multi-scale attention model. Discriminative information considers salient objects and regions by means of channel attention and spatial attention. Besides, the central loss function joint supervision strategy is introduced to further reduce the misjudgment of intra-class differences. Second, a model based on multi-level context information is proposed to describe the positional relationship between objects, which can effectively alleviate the influence of the similarity of objects between classes. Finally, the two models are merged to give full play to the compatibility of features, so that the final feature representation not only focuses on the effective discriminant information, but also manifests the relative position relationship between significant objects. Extensive experiments have proved that the method in this paper effectively solves the problem of insufficient feature representation in scene recognition tasks, and improves the accuracy of scene recognition.
Affiliation(s)
- Yuqing Peng
  - School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, P. R. China
- Xianzi Liu
  - China Shenhua International Engineering Co., Ltd., Beijing 100007, P. R. China
- Chenxi Wang
  - School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, P. R. China
- Tengfei Xiao
  - School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, P. R. China
- Tiejun Li
  - School of Mechanical Engineering, Hebei University of Technology, Tianjin 300401, P. R. China
3. Romeo A, Supèr H. Spiking model of fixational eye movements and figure-ground segmentation. NETWORK (BRISTOL, ENGLAND) 2022; 33:143-166. [PMID: 35613078 DOI: 10.1080/0954898x.2022.2073393] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/01/2021] [Revised: 03/25/2022] [Accepted: 04/28/2022] [Indexed: 06/15/2023]
Abstract
We present a model connecting eye movements and cortical state. Its structure includes simulated retinal images, motion detection, feature detectors and layers of spiking neurons. The designed scheme shows how the effect of micro-saccadic scale eye movements can lead to successful figure segregation in a figure-ground paradigm, by inducing changes in the neural dynamics through the time evolution of the inhibition range.
Affiliation(s)
- August Romeo
  - Vision and Control of Action Group, Department of Cognition, Development and Educational Psychology, University of Barcelona, Barcelona, Spain
- Hans Supèr
  - Vision and Control of Action Group, Department of Cognition, Development and Educational Psychology, University of Barcelona, Barcelona, Spain
  - Institute of Neurosciences of the University of Barcelona (UBNeuro), Barcelona, Spain
  - Institut de Recerca Sant Joan de Déu (IRSJD), Barcelona, Spain
  - Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
4. Conti F, Van Gorder RA. The role of network structure and time delay in a metapopulation Wilson-Cowan model. J Theor Biol 2019; 477:1-13. [PMID: 31181240 DOI: 10.1016/j.jtbi.2019.05.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 12/01/2018] [Revised: 04/23/2019] [Accepted: 05/16/2019] [Indexed: 01/11/2023]
Abstract
We study the dynamics of a network Wilson-Cowan model (a system of connected Wilson-Cowan oscillators) for interacting excitatory and inhibitory neuron populations with time delays. Each node in this model corresponds to a population of neurons, including excitatory and inhibitory subpopulations, and hence it can be viewed as a metapopulation model. It is known that information transfer within each cortical area is not instantaneous, and therefore we consider a system of delay differential equations with two different kinds of discrete delays. We account for the time delay in information propagation between individual excitatory and inhibitory subpopulations at each node via intra-node time delays, and we account for time delay in information propagation between neuron populations at different nodes with inter-node time delays. The biologically relevant resting state solutions are oscillatory (stable limit cycles). After determining the influence of the coupling parameters between nodes, the intra-node delays, and the inter-node delays on the dynamics of the two coupled Wilson-Cowan oscillators, we then explore a variety of larger networks of 16 and 100 nodes, in order to determine how the network topology will influence time-delayed Wilson-Cowan dynamics. We find that network structure can regularize or deregularize the dynamics, with networks of higher mean degree permitting stable limit cycles and networks with smaller mean degree yielding less regular dynamics (which may range from chaotic solutions, to solutions for which limit cycles collapse into steady states, which are biologically undesirable compared with the preferred stable limit cycles). Furthermore, heterogeneity in the degree distribution of the network (resulting from networks with nodes of varying degree) can result in asynchronous dynamics, even if at each node the local dynamics are that of a limit cycle, in contrast to the synchronization of dynamics between nodes seen when the degree of all nodes is equal. This suggests that homogeneous and well-connected networks permit robust limit cycles under time-delayed Wilson-Cowan dynamics, whereas heterogeneous or poorly connected networks may fail to provide such desirable dynamics, a phenomenon akin to structural loss of neuron connections in neurodegenerative diseases.
Affiliation(s)
- Federica Conti
  - Mathematical Institute, University of Oxford, Andrew Wiles Building, Radcliffe Observatory Quarter, Woodstock Road, Oxford, OX2 6GG, United Kingdom; Institut de Neurosciences de la Timone, Aix-Marseille Université, CNRS, Faculté de Médecine, 27 boulevard Jean Moulin, Marseille 13005, France
- Robert A Van Gorder
  - Mathematical Institute, University of Oxford, Andrew Wiles Building, Radcliffe Observatory Quarter, Woodstock Road, Oxford, OX2 6GG, United Kingdom; Department of Mathematics and Statistics, University of Otago, P.O. Box 56, Dunedin 9054, New Zealand
5. Neural dynamics of spreading attentional labels in mental contour tracing. Neural Netw 2019; 119:113-138. [PMID: 31404805 DOI: 10.1016/j.neunet.2019.07.016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/11/2018] [Revised: 07/12/2019] [Accepted: 07/21/2019] [Indexed: 11/22/2022]
Abstract
Behavioral and neural data suggest that visual attention spreads along contour segments to bind them into a unified object representation. Such attentional labeling segregates the target contour from distractors in a process known as mental contour tracing. A recurrent competitive map is developed to simulate the dynamics of mental contour tracing. In the model, local excitation opposes global inhibition and enables enhanced activity to propagate on the path offered by the contour. The extent of local excitatory interactions is modulated by the output of the multi-scale contour detection network, which constrains the speed of activity spreading in a scale-dependent manner. Furthermore, an L-junction detection network enables tracing to switch direction at the L-junctions, but not at the X- or T-junctions, thereby preventing spillover to a distractor contour. Computer simulations reveal that the model exhibits a monotonic increase in tracing time as a function of the distance to be traced. Also, the speed of tracing increases with decreasing proximity to the distractor contour and with the reduced curvature of the contours. The proposed model demonstrated how an elaborated version of the winner-takes-all network can implement a complex cognitive operation such as contour tracing.
6. Wang D, Chen J. Supervised Speech Separation Based on Deep Learning: An Overview. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING 2018; 26:1702-1726. [PMID: 31223631 PMCID: PMC6586438 DOI: 10.1109/taslp.2018.2842159] [Citation(s) in RCA: 121] [Impact Index Per Article: 17.3] [Indexed: 05/12/2023]
Abstract
Speech separation is the task of separating target speech from background interference. Traditionally, speech separation is studied as a signal processing problem. A more recent approach formulates speech separation as a supervised learning problem, where the discriminative patterns of speech, speakers, and background noise are learned from training data. Over the past decade, many supervised separation algorithms have been put forward. In particular, the recent introduction of deep learning to supervised speech separation has dramatically accelerated progress and boosted separation performance. This paper provides a comprehensive overview of the research on deep learning based supervised speech separation in the last several years. We first introduce the background of speech separation and the formulation of supervised separation. Then, we discuss three main components of supervised separation: learning machines, training targets, and acoustic features. Much of the overview is on separation algorithms where we review monaural methods, including speech enhancement (speech-nonspeech separation), speaker separation (multitalker separation), and speech dereverberation, as well as multimicrophone techniques. The important issue of generalization, unique to supervised learning, is discussed. This overview provides a historical perspective on how advances are made. In addition, we discuss a number of conceptual issues, including what constitutes the target source.
Affiliation(s)
- DeLiang Wang
  - Department of Computer Science and Engineering and the Center for Cognitive and Brain Sciences, The Ohio State University, Columbus, OH 43210 USA, and also with the Center of Intelligent Acoustics and Immersive Communications, Northwestern Polytechnical University, Xi'an 710072, China
- Jitong Chen
  - Department of Computer Science and Engineering, The Ohio State University, Columbus, OH 43210 USA. He is now with Silicon Valley AI Lab, Baidu Research, Sunnyvale, CA 94089 USA
7. Lin X, Zhou S, Tang H, Qi Y, Xie X. A Novel Fractional-Order Chaotic Phase Synchronization Model for Visual Selection and Shifting. ENTROPY (BASEL, SWITZERLAND) 2018; 20:E251. [PMID: 33265342 PMCID: PMC7512766 DOI: 10.3390/e20040251] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 03/03/2018] [Revised: 04/01/2018] [Accepted: 04/02/2018] [Indexed: 01/30/2023]
Abstract
Visual information processing is one of the fields of cognitive informatics. In this paper, a two-layer fractional-order chaotic network, which can simulate the mechanism of visual selection and shifting, is established. Unlike other object selection models, the proposed model introduces control units to select objects. The first chaotic network layer of the model is used to implement image segmentation. A control layer is added as the second layer, consisting of a central neuron, which controls object selection and shifting. To implement visual selection and shifting, a strategy is proposed in which the different subnets corresponding to the objects in the first layer synchronize with the central neuron at different times. The central unit acting as the central nervous system synchronizes with different subnets (hybrid systems), implementing the mechanism of visual selection and shifting in the human system. The proposed model corresponds better with the human visual system than the typical model of visual information encoding and transmission and provides new possibilities for further analysis of the mechanisms of the human cognitive system. The reasonability of the proposed model is verified by experiments using artificial and natural images.
Affiliation(s)
- Xiaoran Lin
  - College of Computer Science, Chongqing University, Chongqing 400044, China
  - Chongqing/MII Key Lab. of Computer Network and Communication Technology, Chongqing 400044, China
- Shangbo Zhou
  - College of Computer Science, Chongqing University, Chongqing 400044, China
  - Chongqing/MII Key Lab. of Computer Network and Communication Technology, Chongqing 400044, China
- Hongbin Tang
  - College of Computer Science, Chongqing University, Chongqing 400044, China
  - College of Mathematics and Information Engineering, Chongqing University of Education, Chongqing 400065, China
- Ying Qi
  - College of Computer Science, Chongqing University, Chongqing 400044, China
  - Chongqing/MII Key Lab. of Computer Network and Communication Technology, Chongqing 400044, China
- Xianzhong Xie
  - Chongqing/MII Key Lab. of Computer Network and Communication Technology, Chongqing 400044, China
8. Romeo A, Supèr H. Global oscillation regime change by gated inhibition. Neural Netw 2016; 82:76-83. [PMID: 27479874 DOI: 10.1016/j.neunet.2016.06.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2015] [Revised: 06/06/2016] [Accepted: 06/26/2016] [Indexed: 10/21/2022]
Abstract
The role of sensory inputs in the modelling of synchrony regimes is exhibited by means of networks of spiking cells where the relative strength of the inhibitory interaction is controlled by the activation of a linear unit working as a gating variable. Adaptation to stimulus size is determined by the value of a changing length scale, modelled by the time-varying radius of a circular receptive field. In this set-up, 'consolidation' time intervals relevant to attentional effects are shown to depend on the dynamics governing the evolution of the introduced length scale.
Affiliation(s)
- August Romeo
  - Department of Cognition, Development and Educational Psychology, Faculty of Psychology, University of Barcelona, Spain
- Hans Supèr
  - Department of Cognition, Development and Educational Psychology, Faculty of Psychology, University of Barcelona, Spain; Institute of Neurosciences, Faculty of Psychology, University of Barcelona, Spain; Catalan Institution for Research and Advanced Studies (ICREA), Spain
9. Zhan K, Teng J, Shi J, Li Q, Wang M. Feature-Linking Model for Image Enhancement. Neural Comput 2016; 28:1072-100. [DOI: 10.1162/neco_a_00832] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Indexed: 11/04/2022]
Abstract
Inspired by gamma-band oscillations and other neurobiological discoveries, neural networks research shifts the emphasis toward temporal coding, which uses explicit times at which spikes occur as an essential dimension in neural representations. We present a feature-linking model (FLM) that uses the timing of spikes to encode information. The first spiking time of FLM is applied to image enhancement, and the processing mechanisms are consistent with the human visual system. The enhancement algorithm achieves boosting the details while preserving the information of the input image. Experiments are conducted to demonstrate the effectiveness of the proposed method. Results show that the proposed method is effective.
Affiliation(s)
- Kun Zhan
  - School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu 730000, China
- Jicai Teng
  - School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu 730000, China
- Jinhui Shi
  - School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu 730000, China
- Qiaoqiao Li
  - School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu 730000, China
- Mingying Wang
  - School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu 730000, China
10. Benicasa AX, Quiles MG, Silva TC, Zhao L, Romero RA. An object-based visual selection framework. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2015.10.111] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/30/2022]
11. Yuan Y, Mou L, Lu X. Scene recognition by manifold regularized deep learning architecture. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2015; 26:2222-2233. [PMID: 25622326 DOI: 10.1109/tnnls.2014.2359471] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Indexed: 06/04/2023]
Abstract
Scene recognition is an important problem in the field of computer vision, because it helps to narrow the gap between the computer and the human beings on scene understanding. Semantic modeling is a popular technique used to fill the semantic gap in scene recognition. However, most of the semantic modeling approaches learn shallow, one-layer representations for scene recognition, while ignoring the structural information related between images, often resulting in poor performance. Modeled after our own human visual system, as it is intended to inherit humanlike judgment, a manifold regularized deep architecture is proposed for scene recognition. The proposed deep architecture exploits the structural information of the data, making for a mapping between visible layer and hidden layer. By the proposed approach, a deep architecture could be designed to learn the high-level features for scene recognition in an unsupervised fashion. Experiments on standard data sets show that our method outperforms the state-of-the-art used for scene recognition.
12.
13. Ferrari FAS, Viana RL, Lopes SR, Stoop R. Phase synchronization of coupled bursting neurons and the generalized Kuramoto model. Neural Netw 2015; 66:107-18. [PMID: 25828961 DOI: 10.1016/j.neunet.2015.03.003] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 10/28/2014] [Revised: 02/24/2015] [Accepted: 03/03/2015] [Indexed: 11/30/2022]
Abstract
Bursting neurons fire rapid sequences of action potential spikes followed by a quiescent period. The basic dynamical mechanism of bursting is the slow currents that modulate a fast spiking activity caused by rapid ionic currents. Minimal models of bursting neurons must include both effects. We considered one of these models and its relation with a generalized Kuramoto model, thanks to the definition of a geometrical phase for bursting and a corresponding frequency. We considered neuronal networks with different connection topologies and investigated the transition from a non-synchronized to a partially phase-synchronized state as the coupling strength is varied. The numerically determined critical coupling strength value for this transition to occur is compared with theoretical results valid for the generalized Kuramoto model.
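The transition this abstract describes, from a non-synchronized to a partially phase-synchronized state as coupling grows, can be illustrated with a minimal sketch. It assumes the classical all-to-all Kuramoto model, not the generalized, network-coupled version or the bursting-neuron model studied in the paper. The function names, the Gaussian frequency distribution, and every parameter value below are illustrative assumptions.

```python
import cmath
import math
import random


def order_parameter(phases):
    """Return (r, psi): the Kuramoto order parameter and the mean phase."""
    z = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(z), cmath.phase(z)


def simulate_kuramoto(coupling, n=100, steps=2000, dt=0.01, seed=0):
    """Euler-integrate the classical all-to-all Kuramoto model.

    Uses the mean-field form d(theta_i)/dt = omega_i + K * r * sin(psi - theta_i),
    which is equivalent to the pairwise sum (K/N) * sum_j sin(theta_j - theta_i).
    Returns the order parameter r at the end of the run.
    """
    rng = random.Random(seed)
    # Assumed: natural frequencies drawn from a Gaussian with std 0.5.
    omegas = [rng.gauss(0.0, 0.5) for _ in range(n)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    for _ in range(steps):
        r, psi = order_parameter(phases)
        phases = [
            p + dt * (w + coupling * r * math.sin(psi - p))
            for p, w in zip(phases, omegas)
        ]
    return order_parameter(phases)[0]


if __name__ == "__main__":
    # Sweep the coupling strength and watch r grow past the critical value.
    for k in (0.0, 0.5, 1.0, 2.0, 4.0):
        print(f"K = {k:.1f}  r = {simulate_kuramoto(k):.3f}")
```

Sweeping `coupling` and recording `r` gives a numerical estimate of the critical coupling strength, which is the kind of quantity the abstract compares against theory.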
Affiliation(s)
- F A S Ferrari
  - Department of Physics, Federal University of Paraná, 81531-990 Curitiba, Paraná, Brazil
- R L Viana
  - Department of Physics, Federal University of Paraná, 81531-990 Curitiba, Paraná, Brazil
- S R Lopes
  - Department of Physics, Federal University of Paraná, 81531-990 Curitiba, Paraná, Brazil
- R Stoop
  - Institute of Neuroinformatics, University of Zürich and Eidgenössische Technische Hochschule Zürich, 8057 Zürich, Switzerland
14. Ursino M, Cuppini C, Magosso E. Neurocomputational approaches to modelling multisensory integration in the brain: A review. Neural Netw 2014; 60:141-65. [DOI: 10.1016/j.neunet.2014.08.003] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Received: 01/21/2014] [Revised: 08/05/2014] [Accepted: 08/07/2014] [Indexed: 10/24/2022]
15. Supèr H, Romeo A. Approximate emergent synchrony in spatially coupled spiking neurons with discrete interaction. Neural Comput 2014; 26:2419-40. [PMID: 25149703 DOI: 10.1162/neco_a_00658] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/04/2022]
Abstract
Models for perceptual grouping and contour integration are presented. Connection weights depend on distances and angle differences, while neurons evolve following a spiking dynamics (Izhikevich's model in most of the considered cases). Although the studied synapses depend on discrete three-valued functions, simulations display the emergence of approximate synchrony, making these cognitive tasks possible. Noise effects are examined, and the possibility of achieving similar results with a different neuron model is discussed.
Affiliation(s)
- Hans Supèr
  - Department of Basic Psychology, Faculty of Psychology and Institute for Brain, Cognition, and Behavior, University of Barcelona, Barcelona 08035, Spain, and Catalan Institute for Research and Advanced Studies, Barcelona 08010, Spain
16. Cona F, Ursino M. A multi-layer neural-mass model for learning sequences using theta/gamma oscillations. Int J Neural Syst 2013; 23:1250036. [DOI: 10.1142/s0129065712500360] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 11/18/2022]
Abstract
A neural mass model for the memorization of sequences is presented. It exploits three layers of cortical columns that generate a theta/gamma rhythm. The first layer implements an auto-associative memory working in the theta range; the second segments objects in the gamma range; finally, the feedback interactions between the third and the second layers realize a hetero-associative memory for learning a sequence. After training with Hebbian and anti-Hebbian rules, the network recovers sequences and accounts for the phase-precession phenomenon.
Affiliation(s)
- Filippo Cona
  - Department of Electronics, Computer Sciences and Systems, University of Bologna, Via Venezia, 52, Cesena (FC), 47521, Italy
- Mauro Ursino
  - Department of Electronics, Computer Sciences and Systems, University of Bologna, Via Venezia, 52, Cesena (FC), 47521, Italy
17.
18. Improving Wishart Classification of Polarimetric SAR Data Using the Hopfield Neural Network Optimization Approach. REMOTE SENSING 2012. [DOI: 10.3390/rs4113571] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 11/16/2022]
19. Herrera PJ, Pajares G, Guijarro M, Ruz JJ, Cruz JM. A stereovision matching strategy for images captured with fish-eye lenses in forest environments. SENSORS 2012; 11:1756-83. [PMID: 22319380 PMCID: PMC3274010 DOI: 10.3390/s110201756] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 12/21/2010] [Revised: 01/12/2011] [Accepted: 01/27/2011] [Indexed: 11/16/2022]
Abstract
We present a novel strategy for computing disparity maps from hemispherical stereo images obtained with fish-eye lenses in forest environments. At a first segmentation stage, the method identifies textures of interest to be either matched or discarded. This is achieved by applying a pattern recognition strategy based on the combination of two classifiers: Fuzzy Clustering and Bayesian. At a second stage, a stereovision matching process is performed based on the application of four stereovision matching constraints: epipolar, similarity, uniqueness and smoothness. The epipolar constraint guides the process. The similarity and uniqueness are mapped through a decision making strategy based on a weighted fuzzy similarity approach, obtaining a disparity map. This map is later filtered through the Hopfield Neural Network framework by considering the smoothness constraint. The combination of the segmentation and stereovision matching approaches makes the main contribution. The method is compared against the usage of simple features and combined similarity matching strategies.
Affiliation(s)
- Pedro Javier Herrera
  - Department of Computer Architecture and Automatic Control, Faculty of Computer Science, Complutense University, 28040 Madrid, Spain
  - Author to whom correspondence should be addressed; Tel.: +34-913947546; Fax: +34-913947547
- Gonzalo Pajares
  - Department of Software Engineering and Artificial Intelligence, Faculty of Computer Science, Complutense University, 28040 Madrid, Spain
- María Guijarro
  - Department of Software Engineering and Artificial Intelligence, Faculty of Computer Science, Complutense University, 28040 Madrid, Spain
- José J. Ruz
  - Department of Computer Architecture and Automatic Control, Faculty of Computer Science, Complutense University, 28040 Madrid, Spain
- Jesús M. Cruz
  - Department of Computer Architecture and Automatic Control, Faculty of Computer Science, Complutense University, 28040 Madrid, Spain
20. Wang R, Zhang Z, Qu J, Cao J. Phase Synchronization Motion and Neural Coding in Dynamic Transmission of Neural Information. IEEE Trans Neural Netw 2011; 22:1097-106. [PMID: 21652286 DOI: 10.1109/tnn.2011.2119377] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Indexed: 11/06/2022]
Affiliation(s)
- Rubin Wang
  - Institute for Cognitive Neurodynamics, School of Science, School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China
21. Qu J, Wang R, Du Y. An improved selective attention model considering orientation preferences. Neural Comput Appl 2011. [DOI: 10.1007/s00521-011-0679-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/18/2022]
22. Selecting salient objects in real scenes: An oscillatory correlation model. Neural Netw 2011; 24:54-64. [DOI: 10.1016/j.neunet.2010.09.002] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Received: 04/20/2010] [Revised: 09/06/2010] [Accepted: 09/07/2010] [Indexed: 11/21/2022]
23. Pajares G, Guijarro M, Ribeiro A. A Hopfield Neural Network for combining classifiers applied to textured images. Neural Netw 2010; 23:144-53. [DOI: 10.1016/j.neunet.2009.07.019] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Received: 06/29/2008] [Revised: 11/14/2008] [Accepted: 07/15/2009] [Indexed: 12/01/2022]
24.
Abstract
Distributed synchronization is known to occur at several scales in the brain, and has been suggested as playing a key functional role in perceptual grouping. State-of-the-art visual grouping algorithms, however, seem to give comparatively little attention to neural synchronization analogies. Based on the framework of concurrent synchronization of dynamical systems, simple networks of neural oscillators coupled with diffusive connections are proposed to solve visual grouping problems. The key idea is to embed the desired grouping properties in the choice of the diffusive couplings, so that synchronization of oscillators within each group indicates perceptual grouping of the underlying stimulative atoms, while desynchronization between groups corresponds to group segregation. Compared with state-of-the-art approaches, the same algorithm is shown to achieve promising results on several classical visual grouping problems, including point clustering, contour integration, and image segmentation.
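The key idea in this abstract, embedding the grouping in the diffusive couplings so that synchronized units form a group, can be sketched in a few lines. This is a simplified illustration that assumes scalar states with linear diffusive coupling (a consensus-style stand-in) rather than the neural oscillators used in the paper. The Gaussian coupling rule, the function names, and all parameter values are assumptions made for the example.

```python
import math


def diffusive_grouping(points, sigma=0.5, gain=1.0, steps=2000, dt=0.01):
    """Integrate dx_i/dt = sum_j k_ij * (x_j - x_i) with Euler steps.

    The coupling k_ij decays with the distance between 1-D points i and j,
    so nearby points are strongly coupled and far-apart points are
    effectively uncoupled (assumed Gaussian similarity rule).
    """
    n = len(points)
    k = [
        [
            0.0 if i == j
            else gain * math.exp(-((points[i] - points[j]) ** 2) / sigma ** 2)
            for j in range(n)
        ]
        for i in range(n)
    ]
    # Deterministic, distinct initial states (arbitrary choice).
    x = [math.cos(i) for i in range(n)]
    for _ in range(steps):
        x = [
            x[i] + dt * sum(k[i][j] * (x[j] - x[i]) for j in range(n))
            for i in range(n)
        ]
    return x


def label_groups(states, tol=1e-3):
    """Give the same label to states that ended up within tol of each other."""
    reps, labels = [], []
    for s in states:
        for idx, rep in enumerate(reps):
            if abs(s - rep) < tol:
                labels.append(idx)
                break
        else:
            reps.append(s)
            labels.append(len(reps) - 1)
    return labels


if __name__ == "__main__":
    pts = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]  # two well-separated clusters
    print(label_groups(diffusive_grouping(pts)))
```

States inside a cluster converge to a common value, while the two clusters stay apart because their mutual coupling is negligible. In the paper's setting the same role is played by synchronization within groups and desynchronization between groups.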
Affiliation(s)
- Guoshen Yu
  - Electrical and Computer Engineering Department, University of Minnesota, Twin Cities, MN 55455 USA
25. Zhan K, Zhang H, Ma Y. New spiking cortical model for invariant texture retrieval and image processing. IEEE Trans Neural Netw 2009; 20:1980-6. [PMID: 19906586 DOI: 10.1109/tnn.2009.2030585] [Citation(s) in RCA: 118] [Impact Index Per Article: 7.4] [Indexed: 11/08/2022]
Abstract
Based on the studies of existing local-connected neural network models, in this brief, we present a new spiking cortical neural networks model and find that time matrix of the model can be recognized as a human subjective sense of stimulus intensity. The series of output pulse images of a proposed model represents the segment, edge, and texture features of the original image, and can be calculated based on several efficient measures and forms a sequence as the feature of the original image. We characterize texture images by the sequence for an invariant texture retrieval. The experimental results show that the retrieval scheme is effective in extracting the rotation and scale invariant features. The new model can also obtain good results when it is used in other image processing applications.
Affiliation(s)
- Kun Zhan
  - School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu, China
26. A neural model of selective attention and object segmentation in the visual scene: an approach based on partial synchronization and star-like architecture of connections. Neural Netw 2009; 22:707-19. [PMID: 19616919 DOI: 10.1016/j.neunet.2009.06.047] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 05/05/2009] [Revised: 05/30/2009] [Accepted: 06/25/2009] [Indexed: 11/22/2022]
Abstract
A brain-inspired computational system is presented that allows sequential selection and processing of objects from a visual scene. The system is comprised of three modules. The selective attention module is designed as a network of spiking neurons of the Hodgkin-Huxley type with star-like connections between the central unit and peripheral elements. The attention focus is represented by those peripheral neurons that generate spikes synchronously with the central neuron while the activity of other peripheral neurons is suppressed. Such dynamics corresponds to the partial synchronization mode. It is shown that peripheral neurons with higher firing rates are preferentially drawn into partial synchronization. We show that local excitatory connections facilitate synchronization, while local inhibitory connections help distinguishing between two groups of peripheral neurons with similar intrinsic frequencies. The module automatically scans a visual scene and sequentially selects regions of interest for detailed processing and object segmentation. The contour extraction module implements standard image processing algorithms for contour extraction. The module computes raw contours of objects accompanied by noise and some spurious inclusions. At the next stage, the object segmentation module designed as a network of phase oscillators is used for precise determination of object boundaries and noise suppression. This module has a star-like architecture of connections. The segmented object is represented by a group of peripheral oscillators working in the regime of partial synchronization with the central oscillator. The functioning of each module is illustrated by an example of processing of the visual scene taken from a visual stream of a robot camera.
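The partial synchronization regime described in this abstract can be illustrated with a much simpler toy model. The sketch below is an assumption-laden illustration, not the authors' implementation: it uses Kuramoto-type phase oscillators in which a central unit drives each peripheral unit one way (feedback to the centre is omitted), and the function name `phase_drift_star` and all parameter values are hypothetical. Under these assumptions a peripheral oscillator locks to the central one only when its frequency detuning is smaller than the coupling strength, so part of the periphery synchronizes while the rest drifts.

```python
import math

def phase_drift_star(omega_center, omega_peripheral, coupling, dt=0.01, steps=20000):
    """Return, per peripheral oscillator, how much its phase difference to the
    central oscillator changes over the second half of the simulation.

    Toy model (an assumption, not the cited paper's equations):
        d(theta_i)/dt = omega_i + coupling * sin(theta_c - theta_i)
        d(theta_c)/dt = omega_center
    integrated with the explicit Euler method.
    """
    theta_c = 0.0
    thetas = [0.0] * len(omega_peripheral)
    half = steps // 2
    diff_at_half = None
    for step in range(steps):
        if step == half:
            # Remember the phase differences at the midpoint of the run.
            diff_at_half = [t - theta_c for t in thetas]
        thetas = [
            t + dt * (w + coupling * math.sin(theta_c - t))
            for t, w in zip(thetas, omega_peripheral)
        ]
        theta_c += dt * omega_center
    # Near-zero drift means phase locking; large drift means no synchronization.
    return [abs((t - theta_c) - d) for t, d in zip(thetas, diff_at_half)]

drift = phase_drift_star(1.0, [1.1, 1.2, 2.0, 3.0], coupling=0.5)
locked = [d < 1e-3 for d in drift]
```

With these illustrative values the two peripheral units whose detuning is below the coupling strength end up in `locked`, while the other two keep drifting, which is the qualitative picture of a synchronized focus group and a non-synchronized remainder.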
|
27
|
Chaotic phase synchronization and desynchronization in an oscillator network for object selection. Neural Netw 2009; 22:728-37. [PMID: 19595565 DOI: 10.1016/j.neunet.2009.06.027] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 05/05/2009] [Revised: 06/04/2009] [Accepted: 06/25/2009] [Indexed: 11/23/2022]
Abstract
Object selection refers to the mechanism of extracting objects of interest while ignoring other objects and the background in a given visual scene. It is a fundamental issue for many computer vision and image analysis techniques, and it remains a challenging task for artificial visual systems. Chaotic phase synchronization takes place in cases involving almost identical dynamical systems; it means that the phase difference between the systems stays bounded over time, while their amplitudes remain chaotic and may be uncorrelated. Rather than complete synchronization, phase synchronization is believed to be a mechanism for neural integration in the brain. In this paper, an object selection model is proposed. Oscillators in the network representing the salient object in a given scene are phase synchronized, while no phase synchronization occurs for background objects. In this way, the salient object can be extracted. In this model, a shift mechanism is also introduced to change attention from one object to another. Computer simulations show that the model produces some results similar to those observed in natural vision systems.
|
28
|
Singer W. Distributed processing and temporal codes in neuronal networks. Cogn Neurodyn 2009; 3:189-96. [PMID: 19562517 PMCID: PMC2727167 DOI: 10.1007/s11571-009-9087-z] [Citation(s) in RCA: 109] [Impact Index Per Article: 6.8] [Received: 04/20/2009] [Revised: 06/01/2009] [Accepted: 06/01/2009] [Indexed: 01/13/2023] Open
Abstract
The cerebral cortex presents itself as a distributed dynamical system with the characteristics of a small world network. The neuronal correlates of cognitive and executive processes often appear to consist of the coordinated activity of large assemblies of widely distributed neurons. These features require mechanisms for the selective routing of signals across densely interconnected networks, the flexible and context dependent binding of neuronal groups into functionally coherent assemblies and the task and attention dependent integration of subsystems. In order to implement these mechanisms, it is proposed that neuronal responses should convey two orthogonal messages in parallel. They should indicate (1) the presence of the feature to which they are tuned and (2) with which other neurons (specific target cells or members of a coherent assembly) they are communicating. The first message is encoded in the discharge frequency of the neurons (rate code) and it is proposed that the second message is contained in the precise timing relationships between individual spikes of distributed neurons (temporal code). It is further proposed that these precise timing relations are established either by the timing of external events (stimulus locking) or by internal timing mechanisms. The latter are assumed to consist of an oscillatory modulation of neuronal responses in different frequency bands that cover a broad frequency range from <2 Hz (delta) to >40 Hz (gamma) and ripples. These oscillations limit the communication of cells to short temporal windows whereby the duration of these windows decreases with oscillation frequency. Thus, by varying the phase relationship between oscillating groups, networks of functionally cooperating neurons can be flexibly configurated within hard wired networks. 
Moreover, by synchronizing the spikes emitted by neuronal populations, the saliency of their responses can be enhanced due to the coincidence sensitivity of receiving neurons in very much the same way as can be achieved by increasing the discharge rate. Experimental evidence will be reviewed in support of the coexistence of rate and temporal codes. Evidence will also be provided that disturbances of temporal coding mechanisms are likely to be one of the pathophysiological mechanisms in schizophrenia.
Affiliation(s)
- Wolf Singer
- Max Planck Institute for Brain Research, Frankfurt/M., Germany
|
29
|
Quiles MG, Zhao L, Breve FA, Romero RA. A network of integrate and fire neurons for visual selection. Neurocomputing 2009. [DOI: 10.1016/j.neucom.2008.10.024] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 10/21/2022]
|
30
|
Ursino M, Magosso E, Cuppini C. Recognition of Abstract Objects Via Neural Oscillators: Interaction Among Topological Organization, Associative Memory and Gamma Band Synchronization. IEEE Trans Neural Netw 2009; 20:316-35. [PMID: 19171515 DOI: 10.1109/tnn.2008.2006326] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Indexed: 11/06/2022]
Affiliation(s)
- Mauro Ursino
- Department of Electronics, Computer Science and Systems, University of Bologna, I-40136 Bologna, Italy.
|
31
|
He H, Chen S. IMORL: incremental multiple-object recognition and localization. IEEE Trans Neural Netw 2008; 19:1727-38. [PMID: 18842477 DOI: 10.1109/tnn.2008.2001774] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.2] [Indexed: 11/07/2022]
Abstract
This paper proposes an incremental multiple-object recognition and localization (IMORL) method. The objective of IMORL is to adaptively learn multiple interesting objects in an image. Unlike the conventional multiple-object learning algorithms, the proposed method can automatically and adaptively learn from continuous video streams over the entire learning life. This kind of incremental learning capability enables the proposed approach to accumulate experience and use such knowledge to benefit future learning and the decision making process. Furthermore, IMORL can effectively handle variations in the number of instances in each data chunk over the learning life. Another important aspect analyzed in this paper is the concept drifting issue. In multiple-object learning scenarios, it is a common phenomenon that new interesting objects may be introduced during the learning life. To handle this situation, IMORL uses an adaptive learning principle to autonomously adjust to such new information. The proposed approach is independent of the base learning models, such as decision tree, neural networks, support vector machines, and others, which provide the flexibility of using this method as a general learning methodology in multiple-object learning scenarios. In this paper, we use a neural network with a multilayer perceptron (MLP) structure as the base learning model and test the performance of this method in various video stream data sets. Simulation results show the effectiveness of this method.
Affiliation(s)
- Haibo He
- Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030, USA.
|
32
|
Zhao L, Cupertino TH, Bertini Jr. JR. Chaotic synchronization in general network topology for scene segmentation. Neurocomputing 2008. [DOI: 10.1016/j.neucom.2008.02.024] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Indexed: 11/28/2022]
|
33
|
|
34
|
Lamela H, Ruiz-Llata M. Image identification system based on an optical broadcast neural network and a pulse coupled neural network preprocessor stage. Appl Opt 2008; 47:B52-B63. [PMID: 18382551 DOI: 10.1364/ao.47.000b52] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/26/2023]
Abstract
We describe the concept of a vision system based on an optoelectronic hardware neural processor. The proposed system is composed of a pulse coupled neural network (PCNN) preprocessor stage that converts an input image into a temporal pulsed pattern. These pulses are inputs to the optical broadcast neural network (OBNN) processor, which classifies the input pattern among a set of reference patterns based on a pattern matching strategy. The PCNN provides immunity to the scale, rotation, and translation of objects in the image. The OBNN provides a highly parallel, high-speed hardware neural processor.
Affiliation(s)
- Horacio Lamela
- Grupo de Optoelectrónica y Tecnología Láser, Universidad Carlos III de Madrid, Madrid, Spain.
|
35
|
Rao AR, Cecchi GA, Peck CC, Kozloski JR. Unsupervised segmentation with dynamical units. IEEE Trans Neural Netw 2008; 19:168-82. [PMID: 18269948 DOI: 10.1109/tnn.2007.905852] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/06/2022]
Abstract
In this paper, we present a novel network to separate mixtures of inputs that have been previously learned. A significant capability of the network is that it segments the components of each input object that most contribute to its classification. The network consists of amplitude-phase units that can synchronize their dynamics, so that separation is determined by the amplitude of units in an output layer, and segmentation by phase similarity between input and output layer units. Learning is unsupervised and based on a Hebbian update, and the architecture is very simple. Moreover, efficient segmentation can be achieved even when there is considerable superposition of the inputs. The network dynamics are derived from an objective function that rewards sparse coding in the generalized amplitude-phase variables. We argue that this objective function can provide a possible formal interpretation of the binding problem and that the implementation of the network architecture and dynamics is biologically plausible.
|
36
|
An oscillatory correlation model of auditory streaming. Cogn Neurodyn 2008; 2:7-19. [PMID: 19003469 DOI: 10.1007/s11571-007-9035-8] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 09/24/2007] [Accepted: 12/06/2007] [Indexed: 10/22/2022] Open
Abstract
We present a neurocomputational model for auditory streaming, which is a prominent phenomenon of auditory scene analysis. The proposed model represents auditory scene analysis by oscillatory correlation, where a perceptual stream corresponds to a synchronized assembly of neural oscillators and different streams correspond to desynchronized oscillator assemblies. The underlying neural architecture is a two-dimensional network of relaxation oscillators with lateral excitation and global inhibition, where one dimension represents time and another dimension frequency. By employing dynamic connections along the frequency dimension and a random element in global inhibition, the proposed model produces a temporal coherence boundary and a fissure boundary that closely match those from the psychophysical data of auditory streaming. Several issues are discussed, including how to represent physical time and how to relate shifting synchronization to auditory attention.
|
37
|
Massively distributed digital implementation of an integrate-and-fire LEGION network for visual scene segmentation. Neurocomputing 2007. [DOI: 10.1016/j.neucom.2006.11.009] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 11/17/2022]
|
38
|
Abstract
This paper outlines an optimization relaxation approach based on the analog Hopfield neural network (HNN) for solving the image change detection problem between two images. A difference image is obtained by subtracting the two images pixel by pixel. The network topology is built so that each pixel in the difference image is a node in the network. Each node is characterized by its state, which determines whether a pixel has changed. An energy function is derived so that the network converges to stable states. The analog Hopfield model allows each node to take on analog state values. Unlike most widely used approaches, where binary labels (changed/unchanged) are assigned to each pixel, the analog property provides the strength of the change. The main contribution of this paper is the customization of the analog Hopfield neural network to derive an automatic image change detection approach. When a pixel is being processed, some existing image change detection procedures consider only interpixel relations in its neighborhood. The main drawback of such approaches is that the pixel is labeled as changed or unchanged according to the information supplied by its neighbors, while its own information is ignored. The Hopfield model overcomes this drawback and, for each pixel, allows a tradeoff between the influence of its neighborhood and its own criterion. This tradeoff is encoded in the energy function to be minimized. The performance of the proposed method is illustrated by comparative analysis against some existing image change detection methods.
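The tradeoff between a pixel's own evidence and the influence of its neighborhood can be sketched with a small relaxation loop. This is a minimal illustration under stated assumptions, not the energy function derived in the paper: pixel values are assumed to lie in [0, 1], the update rule is a simple tanh relaxation over 4-connected neighbors, and the function name `relax_change_map` and the weights `alpha`, `beta`, and `gain` are hypothetical.

```python
import math

def relax_change_map(img_a, img_b, alpha=1.0, beta=0.5, gain=2.0, iters=20):
    """Return an analog change map with one state in [-1, 1] per pixel.

    Positive states mean "changed", negative mean "unchanged", and the
    magnitude indicates the strength of the decision. Illustrative update
    rule (an assumption, not the cited paper's equations):
        state = tanh(gain * (alpha * own_evidence + beta * mean(neighbor states)))
    """
    rows, cols = len(img_a), len(img_a[0])
    # Own evidence: absolute pixel difference in [0, 1] mapped to [-1, 1].
    evidence = [
        [2.0 * abs(img_a[r][c] - img_b[r][c]) - 1.0 for c in range(cols)]
        for r in range(rows)
    ]
    state = [row[:] for row in evidence]
    for _ in range(iters):
        new_state = []
        for r in range(rows):
            new_row = []
            for c in range(cols):
                # States of the 4-connected neighbors that lie inside the image.
                neigh = [
                    state[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols
                ]
                neigh_mean = sum(neigh) / len(neigh) if neigh else 0.0
                drive = alpha * evidence[r][c] + beta * neigh_mean
                new_row.append(math.tanh(gain * drive))
            new_state.append(new_row)
        state = new_state  # synchronous update of all nodes
    return state

# The centre pixel has ambiguous evidence (difference 0.5); all of its
# neighbors clearly changed, so the neighborhood term decides its label.
before = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
after = [[1.0, 1.0, 1.0], [1.0, 0.5, 1.0], [1.0, 1.0, 1.0]]
change_map = relax_change_map(before, after)
```

Raising `alpha` relative to `beta` makes each node trust its own difference value more, while raising `beta` lets the neighborhood dominate, which mirrors the tradeoff the abstract describes.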
Affiliation(s)
- Gonzalo Pajares
- Departamento de Sistemas Informáticos y Programación, Facultad de Informática, Universidad Complutense, Madrid 28040, Spain.
|