1
Wendt G, Faul F. Binocular luster elicited by isoluminant chromatic stimuli relies on mechanisms similar to those in the achromatic case. J Vis 2024; 24:7. [PMID: 38536184 PMCID: PMC10985784 DOI: 10.1167/jov.24.3.7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Accepted: 02/05/2024] [Indexed: 04/04/2024] Open
Abstract
The phenomenon of binocular luster can be evoked by simple dichoptic center-surround stimuli showing a luminance contrast difference between the eyes. Previous findings support the idea that this phenomenon is mediated by a low-level conflict mechanism that integrates the monocular signals from different types of contrast detector cells. Also, isoluminant stimuli with different chromatic contrasts between eyes can trigger sensations of luster. Here, we investigate whether the lustrous impression in such purely chromatic stimuli depends on interocular contrast differences and in particular on interocular contrast polarity pairings in a similar way as in the achromatic case. In our experiments, we measured the magnitude of the lustrous response using a series of isoluminant dichoptic center-ring-surround stimuli with varying ring width whose chromatic properties were varied along the red-green and blue-yellow cardinal directions. The trends in the data were very similar to those of our former study with achromatic stimuli, indicating similar mechanisms in both cases. The empirical luster data could also be predicted fairly well by a chromatic version of our interocular conflict model (with overall R² values between 0.577 and 0.639), for which two different receptive field models were used, simulating the behavior of color-sensitive double-opponent cells in V1.
2
Lindeberg T. Covariance properties under natural image transformations for the generalised Gaussian derivative model for visual receptive fields. Front Comput Neurosci 2023; 17:1189949. [PMID: 37398936 PMCID: PMC10311448 DOI: 10.3389/fncom.2023.1189949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 05/23/2023] [Indexed: 07/04/2023] Open
Abstract
The property of covariance, also referred to as equivariance, means that an image operator is well-behaved under image transformations, in the sense that the result of applying the image operator to a transformed input image gives essentially a similar result as applying the same image transformation to the output of applying the image operator to the original image. This paper presents a theory of geometric covariance properties in vision, developed for a generalised Gaussian derivative model of receptive fields in the primary visual cortex and the lateral geniculate nucleus, which, in turn, enable geometric invariance properties at higher levels in the visual hierarchy. It is shown how the studied generalised Gaussian derivative model for visual receptive fields obeys true covariance properties under spatial scaling transformations, spatial affine transformations, Galilean transformations and temporal scaling transformations. These covariance properties imply that a vision system, based on image and video measurements in terms of the receptive fields according to the generalised Gaussian derivative model, can, to first order of approximation, handle the image and video deformations between multiple views of objects delimited by smooth surfaces, as well as between multiple views of spatio-temporal events, under varying relative motions between the objects and events in the world and the observer. We conclude by describing implications of the presented theory for biological vision, regarding connections between the variabilities of the shapes of biological visual receptive fields and the variabilities of spatial and spatio-temporal image structures under natural image transformations. 
Specifically, we formulate experimentally testable biological hypotheses as well as needs for measuring population statistics of receptive field characteristics, originating from predictions from the presented theory, concerning the extent to which the shapes of the biological receptive fields in the primary visual cortex span the variabilities of spatial and spatio-temporal image structures induced by natural image transformations, based on geometric covariance properties.
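As a hedged illustration of what covariance means in the simplest case, consider pure spatial scaling of an ordinary Gaussian scale-space representation. The notation below (scale parameter $s$ as the variance of the Gaussian kernel $g$, scaling factor $S$) is standard scale-space notation chosen for this sketch and is not quoted from the paper, whose generalised model also covers affine, Galilean and temporal scaling transformations.

```latex
% Gaussian scale-space representation of an image f at scale s (variance of g):
%   L(x; s) = (g(\cdot; s) * f)(x)
% Spatial scaling of the image domain by a factor S > 0:
\begin{align}
  f'(x') &= f(x), \qquad x' = S\,x, \\
  L'(x';\, s') &= L(x;\, s) \qquad \text{provided that } s' = S^{2}\, s .
\end{align}
% Smoothing the rescaled image at the matched scale s' gives the same value
% as smoothing the original image at scale s and then rescaling the result.
```

In words, the smoothing operator commutes with the scaling transformation as long as the scale parameter is transformed accordingly, which is the sense in which the result is "essentially similar" under the transformation.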
Affiliation(s)
- Tony Lindeberg
- Computational Brain Science Lab, Division of Computational Science and Technology, KTH Royal Institute of Technology, Stockholm, Sweden
3
Bryer AJ, Rey JS, Perilla JR. Performance efficient macromolecular mechanics via sub-nanometer shape based coarse graining. Nat Commun 2023; 14:2014. [PMID: 37037809 PMCID: PMC10086035 DOI: 10.1038/s41467-023-37801-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/26/2022] [Accepted: 03/30/2023] [Indexed: 04/12/2023] Open
Abstract
Dimensionality reduction via coarse grain modeling is a valuable tool in biomolecular research. For large assemblies, ultra coarse models are often knowledge-based, relying on a priori information to parameterize models thus hindering general predictive capability. Here, we present substantial advances to the shape based coarse graining (SBCG) method, which we refer to as SBCG2. SBCG2 utilizes a revitalized formulation of the topology representing network which makes high-granularity modeling possible, preserving atomistic details that maintain assembly characteristics. Further, we present a method of granularity selection based on charge density Fourier Shell Correlation and have additionally developed a refinement method to optimize, adjust and validate high-granularity models. We demonstrate our approach with the conical HIV-1 capsid and heteromultimeric cofilin-2 bound actin filaments. Our approach is available in the Visual Molecular Dynamics (VMD) software suite, and employs a CHARMM-compatible Hamiltonian that enables high-performance simulation in the GPU-resident NAMD3 molecular dynamics engine.
Affiliation(s)
- Alexander J Bryer
- Department of Chemistry and Biochemistry, University of Delaware, Newark, DE, 19716, USA
- Juan S Rey
- Department of Chemistry and Biochemistry, University of Delaware, Newark, DE, 19716, USA
- Juan R Perilla
- Department of Chemistry and Biochemistry, University of Delaware, Newark, DE, 19716, USA.
4
Lindeberg T. A time-causal and time-recursive scale-covariant scale-space representation of temporal signals and past time. Biological Cybernetics 2023; 117:21-59. [PMID: 36689001 PMCID: PMC10160219 DOI: 10.1007/s00422-022-00953-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Accepted: 11/21/2022] [Indexed: 05/05/2023]
Abstract
This article presents an overview of a theory for performing temporal smoothing on temporal signals in such a way that: (i) temporally smoothed signals at coarser temporal scales are guaranteed to constitute simplifications of corresponding temporally smoothed signals at any finer temporal scale (including the original signal) and (ii) the temporal smoothing process is both time-causal and time-recursive, in the sense that it does not require access to future information and can be performed with no other temporal memory buffer of the past than the resulting smoothed temporal scale-space representations themselves. For specific subsets of parameter settings for the classes of linear and shift-invariant temporal smoothing operators that obey this property, it is shown how temporal scale covariance can be additionally obtained, guaranteeing that if the temporal input signal is rescaled by a uniform temporal scaling factor, then also the resulting temporal scale-space representations of the rescaled temporal signal will constitute mere rescalings of the temporal scale-space representations of the original input signal, complemented by a shift along the temporal scale dimension. The resulting time-causal limit kernel that obeys this property constitutes a canonical temporal kernel for processing temporal signals in real-time scenarios when the regular Gaussian kernel cannot be used, because of its non-causal access to information from the future, and we cannot additionally require the temporal smoothing process to comprise a complementary memory of the past beyond the information contained in the temporal smoothing process itself, which in this way also serves as a multi-scale temporal memory of the past. We describe how the time-causal limit kernel relates to previously used temporal models, such as Koenderink's scale-time kernels and the ex-Gaussian kernel. 
We do also give an overview of how the time-causal limit kernel can be used for modelling the temporal processing in models for spatio-temporal and spectro-temporal receptive fields, and how it more generally has a high potential for modelling neural temporal response functions in a purely time-causal and time-recursive way, that can also handle phenomena at multiple temporal scales in a theoretically well-founded manner. We detail how this theory can be efficiently implemented for discrete data, in terms of a set of recursive filters coupled in cascade. Hence, the theory is generally applicable for both: (i) modelling continuous temporal phenomena over multiple temporal scales and (ii) digital processing of measured temporal signals in real time. We conclude by stating implications of the theory for modelling temporal phenomena in biological, perceptual, neural and memory processes by mathematical models, as well as implications regarding the philosophy of time and perceptual agents. Specifically, we propose that for A-type theories of time, as well as for perceptual agents, the notion of a non-infinitesimal inner temporal scale of the temporal receptive fields has to be included in representations of the present, where the inherent nonzero temporal delay of such time-causal receptive fields implies a need for incorporating predictions from the actual time-delayed present in the layers of a perceptual hierarchy, to make it possible for a representation of the perceptual present to constitute a representation of the environment with timing properties closer to the actual present.
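The idea of implementing temporal smoothing as a set of recursive filters coupled in cascade can be illustrated with a minimal sketch. It assumes each stage is a first-order recursive filter of the form out[t] = out[t-1] + (in[t] - out[t-1]) / (1 + mu). The function names and the time constants in `mus` are hypothetical, illustrative choices; they are not the specific geometric distribution of time constants that defines the time-causal limit kernel.

```python
import numpy as np

def first_order_recursive(signal, mu):
    """One time-causal, time-recursive smoothing stage with time constant mu.

    The only memory of the past is the previous output value `prev`.
    """
    out = np.empty(len(signal), dtype=float)
    prev = 0.0
    for t, x in enumerate(signal):
        prev = prev + (x - prev) / (1.0 + mu)
        out[t] = prev
    return out

def cascade(signal, mus):
    """Couple several first-order recursive stages in cascade."""
    out = np.asarray(signal, dtype=float)
    for mu in mus:
        out = first_order_recursive(out, mu)
    return out

# Impulse response of a three-stage cascade (illustrative time constants).
impulse = np.zeros(200)
impulse[0] = 1.0
mus = [0.5, 1.0, 2.0]
kernel = cascade(impulse, mus)

# The composed kernel is non-negative, has unit sum, and its mean temporal
# delay equals the sum of the stage time constants.
mean_delay = float(np.sum(np.arange(len(kernel)) * kernel))
```

Each stage only reads the current input and its own previous output, so the cascade needs no buffer of past samples beyond the smoothed values themselves. The nonzero `mean_delay` is the kind of inherent temporal delay the abstract refers to.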
Affiliation(s)
- Tony Lindeberg
- Computational Brain Science Lab, Division of Computational Science and Technology, KTH Royal Institute of Technology, 100 44, Stockholm, Sweden.
5
Méndez CA, Celeghin A, Diano M, Orsenigo D, Ocak B, Tamietto M. A deep neural network model of the primate superior colliculus for emotion recognition. Philos Trans R Soc Lond B Biol Sci 2022; 377:20210512. [PMID: 36126660 PMCID: PMC9489290 DOI: 10.1098/rstb.2021.0512] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/03/2022] [Accepted: 07/18/2022] [Indexed: 12/01/2022] Open
Abstract
Although sensory processing is pivotal to nearly every theory of emotion, the evaluation of the visual input as 'emotional' (e.g. a smile as signalling happiness) has been traditionally assumed to take place in supramodal 'limbic' brain regions. Accordingly, subcortical structures of ancient evolutionary origin that receive direct input from the retina, such as the superior colliculus (SC), are traditionally conceptualized as passive relay centres. However, mounting evidence suggests that the SC is endowed with the necessary infrastructure and computational capabilities for the innate recognition and initial categorization of emotionally salient features from retinal information. Here, we built a neurobiologically inspired convolutional deep neural network (DNN) model that approximates physiological, anatomical and connectional properties of the retino-collicular circuit. This enabled us to characterize and isolate the initial computations and discriminations that the DNN model of the SC can perform on facial expressions, based uniquely on the information it directly receives from the virtual retina. Trained to discriminate facial expressions of basic emotions, our model matches human error patterns and above chance, yet suboptimal, classification accuracy analogous to that reported in patients with V1 damage, who rely on retino-collicular pathways for non-conscious vision of emotional attributes. When presented with gratings of different spatial frequencies and orientations never 'seen' before, the SC model exhibits spontaneous tuning to low spatial frequencies and reduced orientation discrimination, as can be expected from the prevalence of the magnocellular (M) over parvocellular (P) projections. 
Likewise, face manipulation that biases processing towards the M or P pathway affects expression recognition in the SC model accordingly, an effect that dovetails with variations of activity in the human SC purposely measured with ultra-high field functional magnetic resonance imaging. Lastly, the DNN generates saliency maps and extracts visual features, demonstrating that certain face parts, like the mouth or the eyes, provide higher discriminative information than other parts as a function of emotional expressions like happiness and sadness. The present findings support the contention that the SC possesses the necessary infrastructure to analyse the visual features that define facial emotional stimuli also without additional processing stages in the visual cortex or in 'limbic' areas. This article is part of the theme issue 'Cracking the laugh code: laughter through the lens of biology, psychology and neuroscience'.
Affiliation(s)
- Carlos Andrés Méndez
- Department of Psychology, University of Torino, Via Verdi 10, Torino 10124, Italy
- Alessia Celeghin
- Department of Psychology, University of Torino, Via Verdi 10, Torino 10124, Italy
- Matteo Diano
- Department of Psychology, University of Torino, Via Verdi 10, Torino 10124, Italy
- Davide Orsenigo
- Department of Psychology, University of Torino, Via Verdi 10, Torino 10124, Italy
- Brian Ocak
- Department of Psychology, University of Torino, Via Verdi 10, Torino 10124, Italy
- Section of Cognitive Neurophysiology and Imaging, National Institute of Mental Health, 49 Convent Drive, Bethesda, MD 20892, USA
- Marco Tamietto
- Department of Psychology, University of Torino, Via Verdi 10, Torino 10124, Italy
- Department of Medical and Clinical Psychology, and CoRPS - Center of Research on Psychology in Somatic diseases, Tilburg University, PO Box 90153, 5000 LE Tilburg, The Netherlands
6
Wei T, Tian Y, Wang Y, Liang Y, Chen CW. Optimized separable convolution: Yet another efficient convolution operator. AI Open 2022. [DOI: 10.1016/j.aiopen.2022.10.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022] Open
7
Raj R, Dahlen D, Duyck K, Yu CR. Maximal Dependence Capturing as a Principle of Sensory Processing. Front Comput Neurosci 2022; 16:857653. [PMID: 35399919 PMCID: PMC8989953 DOI: 10.3389/fncom.2022.857653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2022] [Accepted: 02/15/2022] [Indexed: 11/13/2022] Open
Abstract
Sensory inputs conveying information about the environment are often noisy and incomplete, yet the brain can achieve remarkable consistency in recognizing objects. Presumably, transforming the varying input patterns into invariant object representations is pivotal for this cognitive robustness. In the classic hierarchical representation framework, early stages of sensory processing utilize independent components of environmental stimuli to ensure efficient information transmission. Representations in subsequent stages are based on increasingly complex receptive fields along a hierarchical network. This framework accurately captures the input structures; however, it is challenging to achieve invariance in representing different appearances of objects. Here we assess theoretical and experimental inconsistencies of the current framework. In its place, we propose that individual neurons encode objects by following the principle of maximal dependence capturing (MDC), which compels each neuron to capture the structural components that contain maximal information about specific objects. We implement the proposition in a computational framework incorporating dimension expansion and sparse coding, which achieves consistent representations of object identities under occlusion, corruption, or high noise conditions. The framework neither requires learning the corrupted forms nor comprises deep network layers. Moreover, it explains various receptive field properties of neurons. Thus, MDC provides a unifying principle for sensory processing.
Affiliation(s)
- Rishabh Raj
- Stowers Institute for Medical Research, Kansas City, MO, United States
- Dar Dahlen
- Stowers Institute for Medical Research, Kansas City, MO, United States
- Kyle Duyck
- Stowers Institute for Medical Research, Kansas City, MO, United States
- C. Ron Yu
- Stowers Institute for Medical Research, Kansas City, MO, United States
- Department of Anatomy and Cell Biology, University of Kansas Medical Center, Kansas City, KS, United States
8
Barbieri D. Reconstructing Group Wavelet Transform From Feature Maps With a Reproducing Kernel Iteration. Front Comput Neurosci 2022; 16:775241. [PMID: 35370587 PMCID: PMC8965351 DOI: 10.3389/fncom.2022.775241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2021] [Accepted: 01/28/2022] [Indexed: 11/13/2022] Open
Abstract
In this article, we consider the problem of reconstructing an image that is downsampled in the space of its SE(2) wavelet transform, which is motivated by classical models of simple cell receptive fields and feature preference maps in the primary visual cortex. We prove that, whenever the problem is solvable, the reconstruction can be obtained by an elementary project and replace iterative scheme based on the reproducing kernel arising from the group structure, and show numerical results on real images.
Affiliation(s)
- Davide Barbieri
- Departamento de Matemáticas, Universidad Autónoma de Madrid, Madrid, Spain
9
Unsupervised anomaly detection in multivariate time series with online evolving spiking neural networks. Mach Learn 2022. [DOI: 10.1007/s10994-022-06129-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 10/18/2022]
Abstract
With the increasing demand for digital products, processes and services, the research area of automatic detection of signal outliers in streaming data has gained a lot of attention. The range of possible applications for this kind of algorithm is wide, from the monitoring of digital machinery and predictive maintenance to the analysis of big data from healthcare sensors. In this paper we present a method for detecting anomalies in streaming multivariate time series by using an adapted evolving Spiking Neural Network. As the main components of this work we contribute (1) an alternative rank-order-based learning algorithm which uses the precise times of the incoming spikes for adjusting the synaptic weights, (2) an adapted, real-time-capable and efficient encoding technique for multivariate data based on multi-dimensional Gaussian Receptive Fields and (3) a continuous outlier scoring function for an improved interpretability of the classifications. Spiking neural networks are extremely efficient when it comes to processing time-dependent information. We demonstrate the effectiveness of our model on a synthetic dataset based on the Numenta Anomaly Benchmark with various anomaly types. We compare our algorithm to other streaming anomaly detection algorithms and show that it performs better in detecting anomalies while demanding fewer computational resources for processing high-dimensional data.
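To make the idea of Gaussian receptive field encoding concrete, here is a minimal one-dimensional sketch of the classic population-coding scheme, in which a scalar value excites a bank of overlapping Gaussian fields and stronger excitation maps to an earlier spike. This is an assumption-laden illustration, not the adapted multi-dimensional technique of the paper; the function name `grf_encode` and the parameters `n_fields`, `t_max` and `width_scale` are hypothetical.

```python
import numpy as np

def grf_encode(x, x_min, x_max, n_fields=6, t_max=1.0, width_scale=1.5):
    """Encode scalar x as one spike time per Gaussian receptive field.

    Fields are centred evenly on [x_min, x_max]; the excitation of each
    field lies in (0, 1], and higher excitation gives an earlier spike.
    """
    centers = np.linspace(x_min, x_max, n_fields)
    sigma = (x_max - x_min) / (width_scale * (n_fields - 1))
    excitation = np.exp(-0.5 * ((x - centers) / sigma) ** 2)
    spike_times = t_max * (1.0 - excitation)
    return centers, spike_times

# Encode the value 0.35 on the range [0, 1] with six fields.
centers, times = grf_encode(0.35, 0.0, 1.0)
first = int(np.argmin(times))  # index of the field that fires first
```

The field whose centre lies closest to the input fires first, so the rank order of the spikes carries the value, which is the kind of information a rank-order-based learning rule can exploit.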
10
Liu X, Robinson PA. Analytic Model for Feature Maps in the Primary Visual Cortex. Front Comput Neurosci 2022; 16:659316. [PMID: 35185503 PMCID: PMC8854373 DOI: 10.3389/fncom.2022.659316] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/27/2021] [Accepted: 01/05/2022] [Indexed: 11/29/2022] Open
Abstract
A compact analytic model is proposed to describe the combined orientation preference (OP) and ocular dominance (OD) features of simple cells and their mutual constraints on the spatial layout of the combined OP-OD map in the primary visual cortex (V1). This model consists of three parts: (i) an anisotropic Laplacian (AL) operator that represents the local neural sensitivity to the orientation of visual inputs; (ii) a receptive field (RF) operator that models the anisotropic spatial projection from nearby neurons to a given V1 cell over scales of a few tenths of a millimeter and combines with the AL operator to give an overall OP operator; and (iii) a map that describes how the parameters of these operators vary approximately periodically across V1. The parameters of the proposed model maximize the neural response at a given OP with an OP tuning curve fitted to experimental results. It is found that the anisotropy of the AL operator does not significantly affect OP selectivity, which is dominated by the RF anisotropy, consistent with Hubel and Wiesel's original conclusions that the orientation tuning width of a V1 simple cell is inversely related to the elongation of its RF. A simplified and idealized OP-OD map is then constructed to describe the approximately periodic local OP-OD structure of V1 in a compact form. It is shown explicitly that the OP map can be approximated by retaining its dominant spatial Fourier coefficients, which are shown to suffice to reconstruct its basic spatial structure. Moreover, this representation is a suitable form to analyze observed OP maps compactly and to be used in neural field theory (NFT) for analyzing activity modulated by the OP-OD structure of V1. Application to independently simulated V1 OP structure shows that observed irregularities in the map correspond to a spread of dominant coefficients in a circle in Fourier space.
In addition, there is a strong bias toward two perpendicular directions when only a small patch of local map is included. The bias is decreased as the amount of V1 included in the Fourier transform is increased.
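The statement that a map can be approximated by retaining its dominant spatial Fourier coefficients can be sketched as follows. The example assumes the common convention of representing an orientation map as a complex field z = exp(2iθ), and it uses a small synthetic field built from three plane waves; the helper name `keep_dominant` and all numerical values are hypothetical and not taken from the paper.

```python
import numpy as np

def keep_dominant(field, k):
    """Reconstruct a 2D field from its k largest-magnitude Fourier coefficients."""
    coeffs = np.fft.fft2(field)
    mags = np.abs(coeffs).ravel()
    keep = np.argsort(mags)[-k:]          # indices of the k dominant coefficients
    mask = np.zeros(mags.shape, dtype=bool)
    mask[keep] = True
    truncated = np.where(mask.reshape(coeffs.shape), coeffs, 0)
    return np.fft.ifft2(truncated)

# Synthetic periodic field: two strong plane waves plus one weak component.
n = 32
y, x = np.mgrid[0:n, 0:n]
field = (np.exp(2j * np.pi * 3 * x / n)
         + np.exp(2j * np.pi * 3 * y / n)
         + 0.05 * np.exp(2j * np.pi * (5 * x + 7 * y) / n))

approx2 = keep_dominant(field, 2)   # basic structure only
approx3 = keep_dominant(field, 3)   # all components retained
err2 = float(np.max(np.abs(approx2 - field)))
err3 = float(np.max(np.abs(approx3 - field)))

# Under the z = exp(2i*theta) convention, a preferred orientation would be
# read out from the phase as theta = 0.5 * angle(z).
theta = 0.5 * np.angle(approx2)
```

With two coefficients the residual equals the amplitude of the discarded weak component, while three coefficients reproduce this synthetic field up to rounding error.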
Affiliation(s)
- Xiaochen Liu
- School of Physics, The University of Sydney, Sydney, NSW, Australia
- Center for Integrative Brain Function, The University of Sydney, Sydney, NSW, Australia
- *Correspondence: Xiaochen Liu
- Peter A. Robinson
- School of Physics, The University of Sydney, Sydney, NSW, Australia
- Center for Integrative Brain Function, The University of Sydney, Sydney, NSW, Australia
11
Abstract
The mouse has dichromatic color vision based on two different types of opsins: short (S)- and middle (M)-wavelength-sensitive opsins with peak sensitivity to ultraviolet (UV; 360 nm) and green light (508 nm), respectively. In the mouse retina, cone photoreceptors that predominantly express the S-opsin are more sensitive to contrasts and denser towards the ventral retina, preferentially sampling the upper part of the visual field. In contrast, the expression of the M-opsin gradually increases towards the dorsal retina that encodes the lower visual field. Such a distinctive retinal organization is assumed to arise from a selective pressure in evolution to efficiently encode the natural scenes. However, natural image statistics of UV light remain largely unexplored. Here we developed a multi-spectral camera to acquire high-quality UV and green images of the same natural scenes, and examined the optimality of the mouse retina to the image statistics. We found that the local contrast and the spatial correlation were both higher in UV than in green for images above the horizon, but lower in UV than in green for those below the horizon. This suggests that the dorsoventral functional division of the mouse retina is not optimal for maximizing the bandwidth of information transmission. Factors besides the coding efficiency, such as visual behavioral requirements, will thus need to be considered to fully explain the characteristic organization of the mouse retina.
Affiliation(s)
- Luca Abballe
- Department of Biomedical Engineering, Sapienza University of Rome, Rome, Italy
- Hiroki Asari
- European Molecular Biology Laboratory, Epigenetics and Neurobiology Unit, EMBL Rome, Monterotondo, Rome, Italy
12
Multi-Frequency Image Completion via a Biologically-Inspired Sub-Riemannian Model with Frequency and Phase. J Imaging 2021; 7:jimaging7120271. [PMID: 34940739 PMCID: PMC8704454 DOI: 10.3390/jimaging7120271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2021] [Revised: 11/24/2021] [Accepted: 12/01/2021] [Indexed: 11/16/2022] Open
Abstract
We present a novel cortically-inspired image completion algorithm. It uses five-dimensional sub-Riemannian cortical geometry, modeling the orientation, spatial frequency and phase-selective behavior of the cells in the visual cortex. The algorithm extracts the orientation, frequency and phase information existing in a given two-dimensional corrupted input image via a Gabor transform and represents those values in terms of cortical cell output responses in the model geometry. Then, it performs completion via a diffusion concentrated in a neighborhood along the neural connections within the model geometry. The diffusion models the activity propagation integrating orientation, frequency and phase features along the neural connections. Finally, the algorithm transforms the diffused and completed output responses back to the two-dimensional image plane.
13
Kristensen DG, Sandberg K. Population receptive fields of human primary visual cortex organised as DC-balanced bandpass filters. Sci Rep 2021; 11:22423. [PMID: 34789812 PMCID: PMC8599479 DOI: 10.1038/s41598-021-01891-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2021] [Accepted: 10/29/2021] [Indexed: 11/22/2022] Open
Abstract
The response to visual stimulation of population receptive fields (pRF) in the human visual cortex has been modelled with a Difference of Gaussians model, yet many aspects of their organisation remain poorly understood. Here, we examined the mathematical basis and signal-processing properties of this model and argue that the DC-balanced Difference of Gaussians (DoG) holds a number of advantages over a DC-biased DoG. Through functional magnetic resonance imaging (fMRI) pRF mapping, we compared performance of DC-balanced and DC-biased models in human primary visual cortex and found that when model complexity is taken into account, the DC-balanced model is preferred. Finally, we present evidence indicating that the BOLD signal DC offset contains information related to the processing of visual stimuli. Taken together, the results indicate that V1 pRFs are at least frequently organised in the exact constellation that allows them to function as bandpass filters, which makes the separation of stimulus contrast and luminance possible. We further speculate that if the DoG models stimulus contrast, the DC offset may reflect stimulus luminance. These findings suggest that it may be possible to separate contrast and luminance processing in fMRI experiments and this could lead to new insights on the haemodynamic response.
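The distinction between a DC-balanced and a DC-biased Difference of Gaussians can be shown with a small numerical sketch. It assumes both Gaussians are normalised to unit sum and that the surround is scaled by a single weight; this parameterisation, the function names, and the chosen sizes and widths are illustrative assumptions, not the model fitted in the paper.

```python
import numpy as np

def gaussian_2d(size, sigma):
    """Unit-sum 2D Gaussian sampled on a size x size grid."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()

def dog(size, sigma_center, sigma_surround, surround_weight=1.0):
    """Centre Gaussian minus a weighted surround Gaussian."""
    return (gaussian_2d(size, sigma_center)
            - surround_weight * gaussian_2d(size, sigma_surround))

balanced = dog(65, 2.0, 6.0, surround_weight=1.0)  # centre and surround cancel
biased = dog(65, 2.0, 6.0, surround_weight=0.5)    # net positive weight remains

# Response to a spatially uniform stimulus of luminance 10:
uniform = np.full((65, 65), 10.0)
resp_balanced = float(np.sum(balanced * uniform))  # close to 0
resp_biased = float(np.sum(biased * uniform))      # close to 5
```

Because the balanced kernel sums to zero, it gives no response to uniform luminance and passes only spatial contrast, which is the bandpass behaviour the abstract describes. The biased kernel retains a luminance-dependent offset.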
Affiliation(s)
- Daniel Gramm Kristensen
- Department of Clinical Medicine, Center of Functionally Integrative Neuroscience, Aarhus University Hospital, Aarhus University, Nørrebrogade 44, Building 1A, 8000, Aarhus C, Denmark.
- Kristian Sandberg
- Department of Clinical Medicine, Center of Functionally Integrative Neuroscience, Aarhus University Hospital, Aarhus University, Nørrebrogade 44, Building 1A, 8000, Aarhus C, Denmark
14
Fan FL, Xiong J, Li M, Wang G. On Interpretability of Artificial Neural Networks: A Survey. IEEE Transactions on Radiation and Plasma Medical Sciences 2021; 5:741-760. [PMID: 35573928 PMCID: PMC9105427 DOI: 10.1109/trpms.2021.3066428] [Citation(s) in RCA: 55] [Impact Index Per Article: 18.3] [Indexed: 07/30/2023]
Abstract
Deep learning as represented by the artificial deep neural networks (DNNs) has achieved great success recently in many important areas that deal with text, images, videos, graphs, and so on. However, the black-box nature of DNNs has become one of the primary obstacles for their wide adoption in mission-critical applications such as medical diagnosis and therapy. Because of the huge potentials of deep learning, increasing the interpretability of deep neural networks has recently attracted much research attention. In this paper, we propose a simple but comprehensive taxonomy for interpretability, systematically review recent studies in improving interpretability of neural networks, describe applications of interpretability in medicine, and discuss possible future research directions of interpretability, such as in relation to fuzzy logic and brain science.
Affiliation(s)
- Feng-Lei Fan
- Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA
- Jinjun Xiong
- IBM Thomas J. Watson Research Center, Yorktown Heights, NY, 10598, USA
- Mengzhou Li
- Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA
- Ge Wang
- Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA
15
Koenderink J. The structure of images: 1984-2021. Biological Cybernetics 2021; 115:117-120. [PMID: 33774717 DOI: 10.1007/s00422-021-00870-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2021] [Accepted: 03/13/2021] [Indexed: 06/12/2023]
Abstract
I present a personal account of the origin, development and future of a concept that appeared in this journal in 1984. The title was The Structure of Images. It became known as "scale space."
16
A Cortical-Inspired Sub-Riemannian Model for Poggendorff-Type Visual Illusions. J Imaging 2021; 7:jimaging7030041. [PMID: 34460697 PMCID: PMC8321287 DOI: 10.3390/jimaging7030041] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/24/2020] [Revised: 01/27/2021] [Accepted: 02/11/2021] [Indexed: 11/20/2022] Open
Abstract
We consider Wilson-Cowan-type models for the mathematical description of orientation-dependent Poggendorff-like illusions. Our modelling improves two previously proposed cortical-inspired approaches, embedding the sub-Riemannian heat kernel into the neuronal interaction term, in agreement with the intrinsically anisotropic functional architecture of V1 based on both local and lateral connections. For the numerical realisation of both models, we consider standard gradient descent algorithms combined with Fourier-based approaches for the efficient computation of the sub-Laplacian evolution. Our numerical results show that the use of the sub-Riemannian kernel allows us to reproduce numerically visual misperceptions and inpainting-type biases in a stronger way in comparison with the previous approaches.
|
17
|
Lindeberg T. Normative theory of visual receptive fields. Heliyon 2021; 7:e05897. [PMID: 33521348 PMCID: PMC7820928 DOI: 10.1016/j.heliyon.2021.e05897] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/15/2020] [Revised: 12/28/2020] [Accepted: 12/31/2020] [Indexed: 11/19/2022] Open
Abstract
This article gives an overview of a normative theory of visual receptive fields. We describe how idealized functional models of early spatial, spatio-chromatic and spatio-temporal receptive fields can be derived in a principled way, based on a set of axioms that reflect structural properties of the environment in combination with assumptions about the internal structure of a vision system to guarantee consistent handling of image representations over multiple spatial and temporal scales. Interestingly, this theory leads to predictions about visual receptive field shapes with qualitatively very good similarities to biological receptive fields measured in the retina, the LGN and the primary visual cortex (V1) of mammals.
Affiliation(s)
- Tony Lindeberg
  - Computational Brain Science Lab, Division of Computational Science and Technology, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden
|
18
|
Zhuo Z, Huang J, Lu K, Pan D, Feng S. A size-invariant convolutional network with dense connectivity applied to retinal vessel segmentation measured by a unique index. Computer Methods and Programs in Biomedicine 2020; 196:105508. [PMID: 32563893 DOI: 10.1016/j.cmpb.2020.105508] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/15/2019] [Revised: 03/15/2020] [Accepted: 04/12/2020] [Indexed: 06/11/2023]
Abstract
BACKGROUND AND OBJECTIVES: Retinal vessel segmentation (RVS) helps in diagnosing diseases such as hypertension, cardiovascular diseases, and others. Convolutional neural networks are widely used in RVS tasks. However, how to comprehensively evaluate the segmentation results and how to improve the networks' learning ability are two great challenges. METHODS: In this paper, we proposed an ingenious index, the fusion score (FS), which provides an overall measure for the binary segmentation images. The FS converts multiple metrics into a single target, and therefore facilitates the selection of the optimal threshold and the comparison of models. In addition, we combined size-invariant feature maps with dense connectivity to improve the traditional CNN's learning ability. Therefore, a size-invariant convolutional network with dense connectivity is designed for RVS. The size-invariant technique helps the deep layers create feature maps with high resolution. The dense connectivity technique is utilized to integrate those hierarchical features and reuse feature maps to enhance the network's learning ability. Finally, an optimized threshold is applied to the output image to obtain a binary image. RESULTS: The results of experiments conducted on two shared retinal image databases, DRIVE and STARE, demonstrate that our approach outperforms other techniques when evaluated in terms of F1-score, Matthews correlation coefficient (MCC), G-mean and FS. In addition, cross training reveals that our method is more robust with respect to the training set. Segmenting a 565 × 584 image takes only 39 ms with a single GPU (graphics processing unit). CONCLUSIONS: Compared with the traditional metrics, the FS is a better indicator for measuring the results of RVS tasks. The experimental results revealed that the proposed method is more suitable for real-world applications.
Affiliation(s)
- Zhongshuo Zhuo
  - School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou 510006, China.
- Jianping Huang
  - School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou 510006, China.
- Ke Lu
  - University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China.
- Daru Pan
  - School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou 510006, China.
- Shouting Feng
  - School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou 510006, China.
|
19
|
Parzhin Y, Kosenko V, Podorozhniak A, Malyeyeva O, Timofeyev V. Detector neural network vs connectionist ANNs. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.07.025] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/17/2022]
|
20
|
Baspinar E, Sarti A, Citti G. A sub-Riemannian model of the visual cortex with frequency and phase. Journal of Mathematical Neuroscience 2020; 10:11. [PMID: 32728818 PMCID: PMC7391467 DOI: 10.1186/s13408-020-00089-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/24/2019] [Accepted: 07/19/2020] [Indexed: 06/11/2023]
Abstract
In this paper, we present a novel model of the primary visual cortex (V1) based on the orientation-, frequency-, and phase-selective behavior of V1 simple cells. We start from the first-level mechanisms of visual perception, the receptive profiles. The model interprets V1 as a fiber bundle over the two-dimensional retinal plane by introducing orientation, frequency, and phase as intrinsic variables. Each receptive profile on the fiber is mathematically interpreted as a rotated, frequency-modulated, and phase-shifted Gabor function. We start from the Gabor function and show that it induces in a natural way the model geometry and the associated horizontal connectivity that models the neural connectivity patterns in V1. We provide an image enhancement algorithm employing the model framework. The algorithm is capable of exploiting not only orientation but also the frequency and phase information intrinsically present in a two-dimensional input image. We provide the experimental results corresponding to the enhancement algorithm.
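As a rough illustration of the receptive profiles mentioned in this abstract, the sketch below evaluates a rotated, frequency-modulated, phase-shifted Gabor function. It is not the authors' formulation: the real-valued cosine carrier, the isotropic Gaussian envelope, and the names `gabor`, `gabor_kernel`, `theta`, `frequency`, `phase`, and `sigma` are assumptions chosen for illustration.

```python
import math


def gabor(x, y, theta=0.0, frequency=0.1, phase=0.0, sigma=4.0):
    """Real-valued Gabor profile at (x, y).

    Assumptions (not from the paper): isotropic Gaussian envelope of width
    `sigma`, cosine carrier with spatial `frequency` (cycles per pixel),
    carrier direction rotated by `theta` radians, carrier offset `phase`.
    """
    # Coordinate along the rotated carrier direction.
    x_rot = x * math.cos(theta) + y * math.sin(theta)
    envelope = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
    return envelope * math.cos(2.0 * math.pi * frequency * x_rot + phase)


def gabor_kernel(size, **params):
    """Sample the Gabor profile on a size x size grid centered at the origin."""
    half = size // 2
    return [[gabor(c - half, r - half, **params) for c in range(size)]
            for r in range(size)]


if __name__ == "__main__":
    # One profile per (orientation, frequency, phase) triple, i.e. one point
    # on the fiber above a retinal location in the abstract's terminology.
    kernel = gabor_kernel(9, theta=math.pi / 4, frequency=0.2, phase=0.0)
    print(len(kernel), len(kernel[0]))
```

Varying `theta`, `frequency`, and `phase` while keeping the grid fixed gives a small filter bank of the kind the abstract describes as intrinsic variables over each retinal position.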
Affiliation(s)
- E. Baspinar
  - MathNeuro Team, INRIA Sophia Antipolis, Valbonne, France
- G. Citti
  - MathNeuro Team, INRIA Sophia Antipolis, Valbonne, France
  - Department of Mathematics, University of Bologna, Bologna, Italy
|
21
|
Waniek N. Transition Scale-Spaces: A Computational Theory for the Discretized Entorhinal Cortex. Neural Comput 2020; 32:330-394. [DOI: 10.1162/neco_a_01255] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/04/2022]
Abstract
Although hippocampal grid cells are thought to be crucial for spatial navigation, their computational purpose remains disputed. Recently, they were proposed to represent spatial transitions and convey this knowledge downstream to place cells. However, a single scale of transitions is insufficient to plan long goal-directed sequences in behaviorally acceptable time. Here, a scale-space data structure is suggested to optimally accelerate retrievals from transition systems, called transition scale-space (TSS). Remaining exclusively on an algorithmic level, the scale increment is proved to be ideally [Formula: see text] for biologically plausible receptive fields. It is then argued that temporal buffering is necessary to learn the scale-space online. Next, two modes for retrieval of sequences from the TSS are presented: top down and bottom up. The two modes are evaluated in symbolic simulations (i.e., without biologically plausible spiking neurons). Additionally, a TSS is used for short-cut discovery in a simulated Morris water maze. Finally, the results are discussed in depth with respect to biological plausibility, and several testable predictions are derived. Moreover, relations to other grid cell models, multiresolution path planning, and scale-space theory are highlighted. Summarized, reward-free transition encoding is shown here, in a theoretical model, to be compatible with the observed discretization along the dorso-ventral axis of the medial entorhinal cortex. Because the theoretical model generalizes beyond navigation, the TSS is suggested to be a general-purpose cortical data structure for fast retrieval of sequences and relational knowledge. Source code for all simulations presented in this paper can be found at https://github.com/rochus/transitionscalespace .
Affiliation(s)
- Nicolai Waniek
  - Bosch Center for Artificial Intelligence, Robert Bosch GmbH, 71272 Renningen, Germany
|
22
|
Patterns of individual differences in fiber tract integrity of the face processing brain network support neurofunctional models. Neuroimage 2020; 204:116229. [DOI: 10.1016/j.neuroimage.2019.116229] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/24/2019] [Revised: 08/19/2019] [Accepted: 09/25/2019] [Indexed: 12/22/2022] Open
|
23
|
Nakamura D, Satoh S. Simple speed estimators reproduce MT responses and identify strength of visual illusion. Neural Comput Appl 2019. [DOI: 10.1007/s00521-017-3211-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/18/2022]
|
24
|
Xu X, Chen Q, Xu R. The Study of Spatial Frequency Channels for Human Visual System. Int J Pattern Recogn 2019. [DOI: 10.1142/s0218001419550073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Similar to the auditory perception of sound, color perception in the human visual system exhibits a multi-frequency channel property. To study the multi-frequency channel mechanism by which the human visual system processes color information, the paper proposes a psychophysical experiment that measures contrast sensitivities for 17 color samples at 16 spatial frequencies in the CIELAB opponent color space. Correlation analysis of the psychophysical data shows clear linear correlations between observations at different spatial frequencies and across observers, which indicates that a linear model can describe how the human visual system processes spatial frequency information. Solving the model on the color-sample data shows that 9 spatial frequency tuning curves can exist in the human visual system, with each of the lightness, R–G and Y–B color channels represented by 3 tuning curves that reflect the “center-around” form of the human visual receptive field. It is concluded that there are 9 spatial frequency channels in the human visual system. The low-frequency tuning curve, with a narrow frequency bandwidth, shows the characteristics of a lower-level receptive field; the medium-frequency tuning curve shows a low-pass property for changes in medium-frequency colors; and the high-frequency tuning curve, with a wide frequency bandwidth, has a feedback effect on the low- and medium-frequency channels and shows the characteristics of a higher-level receptive field, which represents the discrimination of details.
Affiliation(s)
- Xiangyang Xu
  - School of Communication, Shenzhen Polytechnic, Guangdong Shenzhen 518055, P. R. China
- Qiao Chen
  - School of Communication, Shenzhen Polytechnic, Guangdong Shenzhen 518055, P. R. China
- Ruixin Xu
  - School of Communication, Shenzhen Polytechnic, Guangdong Shenzhen 518055, P. R. China
|
25
|
Griffin LD. The Atlas Structure of Images. IEEE Transactions on Pattern Analysis and Machine Intelligence 2019; 41:234-245. [PMID: 29990035 DOI: 10.1109/tpami.2017.2777856] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/08/2023]
Abstract
Many operations of vision require image regions to be isolated and inter-related. This is challenging when they are different in detail and extent. Practical methods of Computer Vision approach this through the tools of downsampling, pyramids, cropping and patches. In this paper we develop an ideal geometric structure for this, compatible with the existing scale space model of image measurement. Its elements are apertures which view the image like fuzzy-edged portholes of frosted glass. We establish containment and cause/effect relations between apertures, and show that these link them into cross-scale atlases. Atlases formed of Gaussian apertures are shown to be a continuous version of the image pyramid used in Computer Vision, and allow various types of image description to naturally be expressed within their framework. We show that views through Gaussian apertures are approximately equivalent to the jets of derivative of Gaussian filter responses that form part of standard Scale Space theory. This supports a view of the simple cells of mammalian V1 as implementing a system of local views of the retinal image of varying extent and resolution. As a worked example we develop a keypoint descriptor scheme that outperforms previous schemes that do not make use of learning.
|
26
|
Han D, Li M, Mei M, Sun X. The functional and structural characteristics of the emotion network in alexithymia. Neuropsychiatr Dis Treat 2018; 14:991-998. [PMID: 29695908 PMCID: PMC5905825 DOI: 10.2147/ndt.s154601] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/26/2022] Open
Abstract
BACKGROUND: Alexithymia is a multifaceted personality trait characterized by emotional dysfunction. METHODS: In this study, the functional and structural features of the emotion network in alexithymia were investigated using resting-state functional MRI (rsfMRI), voxel-based morphometry (VBM), functional connectivity (FC) analysis, and diffusion tensor imaging (DTI). Alexithymic and non-alexithymic students were recruited from the local university. The intrinsic neural activity and gray matter density of the brain regions in the emotion network were measured using rsfMRI and VBM; the FC and structural connectivity of the brain regions in the emotion network were measured using FC analysis and DTI. RESULTS: Altered intrinsic neural activity in V1, rostral dorsal anterior cingulate cortex, and left amygdala, as well as weak FC between V1 and the left superior temporal gyrus and between V1 and the left paracentral lobule, were identified in alexithymic subjects. However, no alteration of the structure or structural connectivity of the emotion network was identified. CONCLUSION: The results indicated that the development of alexithymia might have been caused only by slight alteration of neural activity. Furthermore, the results suggest that noninvasive treatment technologies for improving brain activity are suitable for alexithymic individuals.
Affiliation(s)
- Dai Han
  - Institutes of Psychological Sciences, Hangzhou Normal University, Hangzhou, Zhejiang, China
  - Children and Adolescents Mental Health Joint Clinic, The Affiliated Hospital of Hangzhou Normal University, Hangzhou, Zhejiang, China
  - Zhejiang Key Laboratory for Research in Assessment of Cognitive Impairments, Hangzhou, Zhejiang, China
- Mei Li
  - Mental Health Education and Counseling Center, Hangzhou Normal University, Hangzhou, Zhejiang, China
- Minjun Mei
  - Mental Health Education and Counseling Center, Hangzhou Normal University, Hangzhou, Zhejiang, China
- Xiaofei Sun
  - Mental Health Education and Counseling Center, Hangzhou Normal University, Hangzhou, Zhejiang, China
|
27
|
Towards building a more complex view of the lateral geniculate nucleus: Recent advances in understanding its role. Prog Neurobiol 2017. [DOI: 10.1016/j.pneurobio.2017.06.002] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Indexed: 11/22/2022]
|
28
|
Olveres J, Nava R, Escalante-Ramírez B, Vallejo E, Kybic J. Left ventricle Hermite-based segmentation. Comput Biol Med 2017; 87:236-249. [PMID: 28618336 DOI: 10.1016/j.compbiomed.2017.05.025] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/04/2016] [Revised: 05/26/2017] [Accepted: 05/27/2017] [Indexed: 10/19/2022]
Abstract
In recent years, computed tomography (CT) has become a standard technique in cardiac imaging because it provides detailed information that may facilitate the diagnosis of the conditions that interfere with correct heart function. However, CT-based cardiac diagnosis requires manual segmentation of heart cavities, which is a difficult and time-consuming task. Thus, in this paper, we propose a novel technique to segment endocardium and epicardium boundaries based on a 2D approach. The proposal computes relevant information of the left ventricle and its adjacent structures using the Hermite transform. The novelty of the work is that the information is combined with active shape models and level sets to improve the segmentation. Our database consists of mid-third slices selected from 28 volumes manually segmented by expert physicians. The segmentation is assessed using Dice coefficient and Hausdorff distance. In addition, we introduce a novel metric called Ray Feature error to evaluate our method. The results show that the proposal accurately discriminates cardiac tissue. Thus, it may be a useful tool for supporting heart disease diagnosis and tailoring treatments.
Affiliation(s)
- Jimena Olveres
  - Facultad de Ingeniería, Universidad Nacional Autónoma de México, Mexico.
- Rodrigo Nava
  - Faculty of Electrical Engineering, Czech Technical University in Prague, Czech Republic
- Jan Kybic
  - Faculty of Electrical Engineering, Czech Technical University in Prague, Czech Republic
|
29
|
|
30
|
Realistic Image Rendition Using a Variable Exponent Functional Model for Retinex. Sensors 2016; 16:s16060832. [PMID: 27338379 PMCID: PMC4934258 DOI: 10.3390/s16060832] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 02/17/2016] [Revised: 05/11/2016] [Accepted: 05/16/2016] [Indexed: 01/22/2023]
Abstract
The goal of realistic image rendition is to recover the acquired image under imperfect illuminant conditions, where non-uniform illumination may degrade image quality with high contrast and low SNR. In this paper, the assumption regarding illumination is modified and a variable exponent functional model for Retinex is proposed to remove non-uniform illumination and reduce halo artifacts. The theoretical derivation is provided and experimental results are presented to illustrate the effectiveness of the proposed model.
|
31
|
Pei ZJ, Gao GX, Hao B, Qiao QL, Ai HJ. A cascade model of information processing and encoding for retinal prosthesis. Neural Regen Res 2016; 11:646-51. [PMID: 27212929 PMCID: PMC4870925 DOI: 10.4103/1673-5374.180752] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022] Open
Abstract
Retinal prosthesis offers a potential treatment for individuals suffering from photoreceptor degeneration diseases. Establishing biological retinal models and simulating how the biological retina converts incoming light signals into spike trains that can be properly decoded by the brain is a key issue. Some retinal models have been presented, ranging from structural models inspired by the layered architecture to functional models originating from a set of specific physiological phenomena. However, most of these focus on stimulus image compression, edge detection and reconstruction, but do not generate spike trains corresponding to the visual image. In this study, based on state-of-the-art retinal physiological mechanisms, including effective visual information extraction, static nonlinear rectification of biological systems and neuronal Poisson coding, a cascade model of the retina was brought forward that includes the outer plexiform layer for information processing and the inner plexiform layer for information encoding, and that integrates both the anatomic connections and the functional computations of the retina. Using MATLAB software, spike trains corresponding to the stimulus image were numerically computed in four steps: linear spatiotemporal filtering, static nonlinear rectification, radial sampling and then Poisson spike generation. The simulated results suggested that such a cascade model could recreate the visual information processing and encoding functionalities of the retina, which is helpful in developing an artificial retina for the retinally blind.
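The four steps named in this abstract can be sketched roughly as follows. This is an illustrative Python outline under stated assumptions, not the authors' MATLAB model: the linear stage is reduced to a purely spatial difference-of-Gaussians filter on a static image (no temporal component), the nonlinearity is plain half-wave rectification with an arbitrary gain, and all function names and parameter values (`dog_filter`, `rectify`, `radial_sample_points`, `poisson_spikes`, `sigma_c`, `sigma_s`, `gain`) are hypothetical.

```python
import math
import random


def dog_filter(image, row, col, sigma_c=1.0, sigma_s=2.0, radius=4):
    """Linear stage (assumed spatial only): center-surround
    difference-of-Gaussians response at pixel (row, col)."""
    total = 0.0
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            r, c = row + dr, col + dc
            if 0 <= r < len(image) and 0 <= c < len(image[0]):
                d2 = dr * dr + dc * dc
                center = math.exp(-d2 / (2 * sigma_c ** 2)) / (2 * math.pi * sigma_c ** 2)
                surround = math.exp(-d2 / (2 * sigma_s ** 2)) / (2 * math.pi * sigma_s ** 2)
                total += (center - surround) * image[r][c]
    return total


def rectify(x, gain=50.0):
    """Static nonlinearity: half-wave rectification, scaled to a rate in Hz."""
    return gain * max(0.0, x)


def radial_sample_points(height, width, n_rings=3, n_per_ring=8):
    """Radial sampling: the image center plus points on concentric rings."""
    cy, cx = (height - 1) / 2, (width - 1) / 2
    points = [(round(cy), round(cx))]
    max_r = min(cy, cx)
    for ring in range(1, n_rings + 1):
        r = max_r * ring / n_rings
        for k in range(n_per_ring):
            a = 2 * math.pi * k / n_per_ring
            points.append((round(cy + r * math.sin(a)), round(cx + r * math.cos(a))))
    return points


def poisson_spikes(rate_hz, duration_s=1.0, dt=0.001, rng=random):
    """Poisson spike generation, approximated by one Bernoulli draw per bin."""
    p = min(1.0, rate_hz * dt)
    return [t * dt for t in range(int(duration_s / dt)) if rng.random() < p]


if __name__ == "__main__":
    # Toy stimulus: a bright 3 x 3 square on a dark 15 x 15 background.
    image = [[0.0] * 15 for _ in range(15)]
    for r in range(6, 9):
        for c in range(6, 9):
            image[r][c] = 1.0
    rng = random.Random(0)
    for (r, c) in radial_sample_points(15, 15):
        rate = rectify(dog_filter(image, r, c))
        spikes = poisson_spikes(rate, rng=rng)
        print((r, c), round(rate, 2), len(spikes))
```

Each sampled location yields one spike train whose rate follows the rectified filter output, which is the general linear-nonlinear-Poisson pattern the abstract's step list suggests.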
Affiliation(s)
- Zhi-Jun Pei
  - Department of Clinical Engineering, Inner Mongolia Autonomous Region People's Hospital, Hohhot, Inner Mongolia Autonomous Region, China
- Guan-Xin Gao
  - Department of Clinical Engineering, Inner Mongolia Autonomous Region People's Hospital, Hohhot, Inner Mongolia Autonomous Region, China
- Bo Hao
  - Department of Clinical Engineering, Inner Mongolia Autonomous Region People's Hospital, Hohhot, Inner Mongolia Autonomous Region, China
- Qing-Li Qiao
  - School of Biomedical Engineering, Tianjin Medical University, Tianjin, China
- Hui-Jian Ai
  - School of Biomedical Engineering, Chongqing Medical University, Chongqing, China
|
32
|
Lindeberg T, Friberg A. Idealized computational models for auditory receptive fields. PLoS One 2015; 10:e0119032. [PMID: 25822973 PMCID: PMC4379182 DOI: 10.1371/journal.pone.0119032] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 04/04/2014] [Accepted: 01/24/2015] [Indexed: 11/19/2022] Open
Abstract
We present a theory by which idealized models of auditory receptive fields can be derived in a principled axiomatic manner, from a set of structural properties to (i) enable invariance of receptive field responses under natural sound transformations and (ii) ensure internal consistency between spectro-temporal receptive fields at different temporal and spectral scales. For defining a time-frequency transformation of a purely temporal sound signal, it is shown that the framework allows for a new way of deriving the Gabor and Gammatone filters as well as a novel family of generalized Gammatone filters, with additional degrees of freedom to obtain different trade-offs between the spectral selectivity and the temporal delay of time-causal temporal window functions. When applied to the definition of a second-layer of receptive fields from a spectrogram, it is shown that the framework leads to two canonical families of spectro-temporal receptive fields, in terms of spectro-temporal derivatives of either spectro-temporal Gaussian kernels for non-causal time or a cascade of time-causal first-order integrators over the temporal domain and a Gaussian filter over the logspectral domain. For each filter family, the spectro-temporal receptive fields can be either separable over the time-frequency domain or be adapted to local glissando transformations that represent variations in logarithmic frequencies over time. Within each domain of either non-causal or time-causal time, these receptive field families are derived by uniqueness from the assumptions. It is demonstrated how the presented framework allows for computation of basic auditory features for audio processing and that it leads to predictions about auditory receptive fields with good qualitative similarity to biological receptive fields measured in the inferior colliculus (ICC) and primary auditory cortex (A1) of mammals.
Affiliation(s)
- Tony Lindeberg
  - Department of Computational Biology, School of Computer Science and Communication, KTH Royal Institute of Technology, Stockholm, Sweden
- Anders Friberg
  - Department of Speech, Music and Hearing, School of Computer Science and Communication, KTH Royal Institute of Technology, Stockholm, Sweden
|