1. Hashim IC, Senden M, Goebel R. PrediRep: Modeling hierarchical predictive coding with an unsupervised deep learning network. Neural Netw 2025; 185:107246. PMID: 39946763. DOI: 10.1016/j.neunet.2025.107246.
Abstract
Hierarchical predictive coding (hPC) provides a compelling framework for understanding how the cortex predicts future sensory inputs by minimizing prediction errors through an internal generative model of the external world. Existing deep learning models inspired by hPC incorporate architectural choices that deviate from core hPC principles, potentially limiting their utility for neuroscientific investigations. We introduce PrediRep (Predicting Representations), a novel deep learning network that adheres more closely to architectural principles of hPC. We validate PrediRep by comparing its functional alignment with hPC to that of existing models after being trained on a next-frame prediction task. Our findings demonstrate that PrediRep, particularly when trained with an all-level loss function (PrediRepAll), exhibits high functional alignment with hPC. In contrast to other contemporary deep learning networks inspired by hPC, it consistently processes input-relevant information at higher hierarchical levels and maintains active representations and accurate predictions across all hierarchical levels. Although PrediRep was designed primarily to serve as a model suitable for neuroscientific research rather than to optimize performance, it nevertheless achieves competitive performance in next-frame prediction while utilizing significantly fewer trainable parameters than alternative models. Our results underscore that even minor architectural deviations from neuroscientific theories like hPC can lead to significant functional discrepancies. By faithfully adhering to hPC principles, PrediRep provides a more accurate tool for in silico exploration of cortical phenomena. PrediRep's lightweight and biologically plausible design makes it well-suited for future studies aiming to investigate the neural underpinnings of predictive coding and to derive empirically testable predictions.
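The "all-level loss" idea, accumulating prediction errors at every level of the hierarchy rather than only at the input, can be illustrated with a minimal sketch. This is a toy two-level linear model with hypothetical dimensions, not the PrediRep architecture itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a frame sequence: 64-dimensional "frames".
frames = rng.standard_normal((100, 64))

# Two-level linear hierarchy: level 1 predicts the next frame,
# level 2 predicts the next level-1 representation.
W_enc1 = rng.standard_normal((32, 64)) * 0.1   # frame -> level-1 code
W_pred1 = rng.standard_normal((64, 32)) * 0.1  # level-1 code -> predicted next frame
W_enc2 = rng.standard_normal((16, 32)) * 0.1   # level-1 code -> level-2 code
W_pred2 = rng.standard_normal((32, 16)) * 0.1  # level-2 code -> predicted next level-1 code

def all_level_loss(x_t, x_next):
    """Sum of squared prediction errors across both levels (the 'all-level' loss)."""
    r1 = W_enc1 @ x_t              # level-1 representation
    r2 = W_enc2 @ r1               # level-2 representation
    e1 = x_next - W_pred1 @ r1     # level-1 error: predicted vs. actual next frame
    r1_next = W_enc1 @ x_next
    e2 = r1_next - W_pred2 @ r2    # level-2 error: predicted vs. actual next level-1 code
    return np.sum(e1**2) + np.sum(e2**2)

loss = np.mean([all_level_loss(frames[t], frames[t + 1]) for t in range(99)])
print(f"mean all-level loss: {loss:.3f}")
```

Training on the sum of both error terms, rather than on the bottom error alone, is what keeps every level carrying active, input-relevant representations.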
Affiliation(s)
- Ibrahim C Hashim
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands; Maastricht Brain Imaging Centre, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands.
- Mario Senden
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands; Maastricht Brain Imaging Centre, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands.
- Rainer Goebel
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands; Maastricht Brain Imaging Centre, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands.
2. Gao X, Ma S, Ni W, Kuang Y, Yu Y, Zhou L, Li Y, Guo C, Xu C, Li L, Huang H, Han J. Design of Multi-Cancer VOCs Profiling Platform via a Deep Learning-Assisted Sensing Library Screening Strategy. Anal Chem 2025; 97:8301-8312. PMID: 40211116. DOI: 10.1021/acs.analchem.4c06468.
Abstract
The efficiency of sensor arrays in the parallel discrimination of multiple analytes is fundamentally determined by the number and performance of their sensor elements. Combinatorial design has greatly accelerated the generation of chemical libraries, offering numerous candidates for building robust sensor arrays. However, screening for elements with superior cross-responsiveness remains challenging and impedes the development of high-performance sensor arrays. Here, we propose a new deep learning-assisted, two-step screening strategy to identify the optimal combination of a minimal set of sensor elements from a purpose-built volatile organic compound (VOC)-targeted sensor library. The 400 sensing elements in the library, constructed by pairing 20 ionizable cationic elements with 20 anionic dyes, were exposed to various VOCs, generating abundant color-variation data. A feedforward neural network-random forest-recursive feature elimination (FRR) algorithm then screened the sensing elements effectively, rapidly yielding 8-element and 10-element arrays for two VOC models, both achieving 100% discrimination accuracy. Furthermore, a smartphone-based point-of-care testing (POCT) platform using image-based deep learning achieved cancer discrimination in a simulated cancer VOC model, demonstrating the rationality and practicality of deep learning for assembling sensor elements into parallel sensing platforms.
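The element-screening step can be illustrated with a minimal scikit-learn sketch: rank candidate sensor elements with a random forest and recursively eliminate the least informative ones until a small array remains. The data below are synthetic placeholders for the colorimetric responses, and the feedforward-neural-network stage of the authors' FRR pipeline is omitted:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: 400 samples x 400 candidate sensing elements, 8 VOC classes.
X, y = make_classification(n_samples=400, n_features=400, n_informative=20,
                           n_classes=8, n_clusters_per_class=1, random_state=0)

rf = RandomForestClassifier(n_estimators=100, random_state=0)
# Recursively drop the least important elements until a 10-element array remains.
selector = RFE(rf, n_features_to_select=10, step=10).fit(X, y)

X_small = X[:, selector.support_]
acc = cross_val_score(rf, X_small, y, cv=5).mean()
print(f"10-element array, cross-validated accuracy: {acc:.2f}")
```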
Affiliation(s)
- Xu Gao
- State Key Laboratory of Natural Medicines, National R&D Center for Chinese Herbal Medicine Processing, College of Engineering, China Pharmaceutical University, Nanjing 211198, China
- Shuoyang Ma
- State Key Laboratory of Natural Medicines, National R&D Center for Chinese Herbal Medicine Processing, College of Engineering, China Pharmaceutical University, Nanjing 211198, China
- Weiwei Ni
- State Key Laboratory of Natural Medicines, National R&D Center for Chinese Herbal Medicine Processing, College of Engineering, China Pharmaceutical University, Nanjing 211198, China
- Yongbin Kuang
- State Key Laboratory of Natural Medicines, National R&D Center for Chinese Herbal Medicine Processing, College of Engineering, China Pharmaceutical University, Nanjing 211198, China
- Yang Yu
- State Key Laboratory of Natural Medicines, National R&D Center for Chinese Herbal Medicine Processing, College of Engineering, China Pharmaceutical University, Nanjing 211198, China
- Lingjia Zhou
- State Key Laboratory of Natural Medicines, National R&D Center for Chinese Herbal Medicine Processing, College of Engineering, China Pharmaceutical University, Nanjing 211198, China
- Yong Li
- College of Life Science and Technology, Ningxia Polytechnic, Yingchuan, Ningxia 750021, China
- Chao Guo
- College of Life Science and Technology, Ningxia Polytechnic, Yingchuan, Ningxia 750021, China
- Chao Xu
- College of Life Science and Technology, Ningxia Polytechnic, Yingchuan, Ningxia 750021, China
- Linxian Li
- Department of Neuroscience, Karolinska Institutet, Stockholm 17177, Sweden
- Hui Huang
- State Key Laboratory of Natural Medicines, National R&D Center for Chinese Herbal Medicine Processing, College of Engineering, China Pharmaceutical University, Nanjing 211198, China
- Department of Neuroscience, Karolinska Institutet, Stockholm 17177, Sweden
- Jinsong Han
- State Key Laboratory of Natural Medicines, National R&D Center for Chinese Herbal Medicine Processing, College of Engineering, China Pharmaceutical University, Nanjing 211198, China
3. Ye Z, Wessel R, Franken TP. Brain-like border ownership signals support prediction of natural videos. iScience 2025; 28:112199. PMID: 40224014. PMCID: PMC11986989. DOI: 10.1016/j.isci.2025.112199.
Abstract
To make sense of visual scenes, the brain must segment foreground from background. This is thought to be facilitated by neurons that signal border ownership (BOS), which indicate which side of a border in their receptive field is owned by an object. How these signals emerge without a teaching signal of what is foreground remains unclear. Here we find that many units in PredNet, a self-supervised deep neural network trained to predict future frames in natural videos, are selective for BOS. They share key properties with BOS neurons in the brain, including robustness to object transformations and hysteresis. Ablation revealed that BOS units contribute more to prediction than other units for videos with moving objects. Our findings suggest that BOS neurons might emerge due to an evolutionary or developmental pressure to predict future input in natural, complex dynamic environments, even without an explicit requirement to segment foreground from background.
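The ablation logic, comparing next-frame prediction error with and without a group of units, is easy to sketch. A toy linear predictor stands in for PredNet here; the unit indices and dimensions are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
frames = rng.standard_normal((200, 50))          # toy "video": 50-pixel frames
H = rng.standard_normal((20, 50)) * 0.1          # hidden units (stand-in for network units)
W_out = rng.standard_normal((50, 20)) * 0.1      # readout predicting the next frame

def prediction_mse(ablate=None):
    """Mean next-frame prediction error, optionally zeroing a set of hidden units."""
    errs = []
    for t in range(len(frames) - 1):
        h = H @ frames[t]
        if ablate is not None:
            h[ablate] = 0.0                      # ablate the chosen units
        errs.append(np.mean((frames[t + 1] - W_out @ h) ** 2))
    return np.mean(errs)

candidate_units = np.arange(5)                   # hypothetical "BOS-selective" unit indices
print("baseline MSE:          ", prediction_mse())
print("MSE with units ablated:", prediction_mse(ablate=candidate_units))
```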
Affiliation(s)
- Zeyuan Ye
- Department of Physics, Washington University in St. Louis, St. Louis, MO 63130, USA
- Ralf Wessel
- Department of Physics, Washington University in St. Louis, St. Louis, MO 63130, USA
- Tom P. Franken
- Department of Neuroscience, Washington University in St. Louis, St. Louis, MO 63110, USA
4. May L, Dauphin A, Gjorgjieva J. Pre-training artificial neural networks with spontaneous retinal activity improves motion prediction in natural scenes. PLoS Comput Biol 2025; 21:e1012830. PMID: 40096645. DOI: 10.1371/journal.pcbi.1012830.
Abstract
The ability to process visual stimuli rich with motion represents an essential skill for animal survival and is largely already present at the onset of vision. Although the exact mechanisms underlying its maturation remain elusive, spontaneous activity patterns in the retina, known as retinal waves, have been shown to contribute to this developmental process. Retinal waves exhibit complex spatio-temporal statistics and contribute to the establishment of circuit connectivity and function in the visual system, including the formation of retinotopic maps and the refinement of receptive fields in downstream areas such as the thalamus and visual cortex. Recent work in mice has shown that retinal waves have statistical features matching those of natural visual stimuli, such as optic flow, suggesting that they could prime the visual system for motion processing upon vision onset. Motivated by these findings, we examined whether artificial neural network (ANN) models trained on natural movies show improved performance if pre-trained with retinal waves. We employed the spatio-temporally complex task of next-frame prediction, in which the ANN was trained to predict the next frame based on preceding input frames of a movie. We found that pre-training ANNs with retinal waves enhances the processing of real-world visual stimuli and accelerates learning. Strikingly, when we merely replaced the initial training epochs on naturalistic stimuli with retinal waves, keeping the total training time the same, we still found that an ANN trained on retinal waves temporarily outperforms one trained solely on natural movies. Similar to observations made in biological systems, we also found that pre-training with spontaneous activity refines the receptive field of ANN neurons. Overall, our work sheds light on the functional role of spatio-temporally patterned spontaneous activity in the processing of motion in natural scenes, suggesting it acts as a training signal to prepare the developing visual system for adult visual processing.
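The training-schedule manipulation, replacing the first epochs of natural-movie training with retinal-wave input while keeping total training time fixed, can be sketched with a toy linear next-frame predictor. The wave and movie arrays below are random placeholders (so no benefit should be expected from them); the point is the schedule:

```python
import numpy as np

rng = np.random.default_rng(0)
waves = rng.standard_normal((500, 64))    # placeholder for retinal-wave frames
movies = rng.standard_normal((500, 64))   # placeholder for natural-movie frames

def train_next_frame(W, data, epochs, lr=1e-3):
    """Gradient descent on next-frame MSE for a linear predictor x_{t+1} ~ W x_t."""
    for _ in range(epochs):
        for t in range(len(data) - 1):
            err = data[t + 1] - W @ data[t]
            W += lr * np.outer(err, data[t])
    return W

total_epochs, pretrain_epochs = 10, 3

# Schedule 1: spontaneous-activity pre-training, then natural movies.
W_pre = train_next_frame(np.zeros((64, 64)), waves, pretrain_epochs)
W_pre = train_next_frame(W_pre, movies, total_epochs - pretrain_epochs)

# Schedule 2: natural movies only, same total number of epochs.
W_base = train_next_frame(np.zeros((64, 64)), movies, total_epochs)

test = rng.standard_normal((100, 64))
mse = lambda W: np.mean([(test[t + 1] - W @ test[t]) ** 2 for t in range(99)])
print(f"pretrained MSE {mse(W_pre):.3f} vs movies-only MSE {mse(W_base):.3f}")
```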
Affiliation(s)
- Lilly May
- School of Life Sciences, Technical University of Munich, Freising, Germany
- Alice Dauphin
- School of Life Sciences, Technical University of Munich, Freising, Germany
- Institute of Machine Learning and Neural Computation, Graz University of Technology, Graz, Austria
5. Greco A, Siegel M. A spatiotemporal style transfer algorithm for dynamic visual stimulus generation. Nat Comput Sci 2025; 5:155-169. PMID: 39706876. PMCID: PMC11860245. DOI: 10.1038/s43588-024-00746-w.
Abstract
Understanding how visual information is encoded in biological and artificial systems often requires the generation of appropriate stimuli to test specific hypotheses, but available methods for video generation are scarce. Here we introduce the spatiotemporal style transfer (STST) algorithm, a dynamic visual stimulus generation framework that allows the manipulation and synthesis of video stimuli for vision research. We show how stimuli can be generated that match the low-level spatiotemporal features of their natural counterparts, but lack their high-level semantic features, providing a useful tool to study object recognition. We used these stimuli to probe PredNet, a predictive coding deep network, and found that its next-frame predictions were not disrupted by the omission of high-level information, with human observers also confirming the preservation of low-level features and lack of high-level information in the generated stimuli. We also introduce a procedure for the independent spatiotemporal factorization of dynamic stimuli. Testing such factorized stimuli on humans and deep vision models suggests a spatial bias in how humans and deep vision models encode dynamic visual information. These results showcase potential applications of the STST algorithm as a versatile tool for dynamic stimulus generation in vision science.
Affiliation(s)
- Antonino Greco
- Department of Neural Dynamics and Magnetoencephalography, Hertie Institute for Clinical Brain Research, University of Tübingen, Tübingen, Germany.
- Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany.
- MEG Center, University of Tübingen, Tübingen, Germany.
- Markus Siegel
- Department of Neural Dynamics and Magnetoencephalography, Hertie Institute for Clinical Brain Research, University of Tübingen, Tübingen, Germany.
- Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany.
- MEG Center, University of Tübingen, Tübingen, Germany.
- German Center for Mental Health (DZPG), Tübingen, Germany.
6. Brucklacher M, Pezzulo G, Mannella F, Galati G, Pennartz CMA. Learning to segment self-generated from externally caused optic flow through sensorimotor mismatch circuits. Neural Netw 2025; 181:106716. PMID: 39383679. DOI: 10.1016/j.neunet.2024.106716.
Abstract
Efficient sensory detection requires the capacity to ignore task-irrelevant information, for example when optic flow patterns created by egomotion need to be disentangled from object perception. To investigate how this is achieved in the visual system, predictive coding with sensorimotor mismatch detection is an attractive starting point. Indeed, experimental evidence for sensorimotor mismatch signals in early visual areas exists, but it is not understood how they are integrated into cortical networks that perform input segmentation and categorization. Our model advances a biologically plausible solution by extending predictive coding models with the ability to distinguish self-generated from externally caused optic flow. We first show that a simple three neuron circuit produces experience-dependent sensorimotor mismatch responses, in agreement with calcium imaging data from mice. This microcircuit is then integrated into a neural network with two generative streams. The motor-to-visual stream consists of parallel microcircuits between motor and visual areas and learns to spatially predict optic flow resulting from self-motion. The second stream bidirectionally connects a motion-selective higher visual area (mHVA) to V1, assigning a crucial role to the abundant feedback connections to V1: the maintenance of a generative model of externally caused optic flow. In the model, area mHVA learns to segment moving objects from the background, and facilitates object categorization. Based on shared neurocomputational principles across species, the model also maps onto primate visual cortex. Our work extends Hebbian predictive coding to sensorimotor settings, in which the agent actively moves - and learns to predict the consequences of its own movements.
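A minimal sketch of the mismatch computation at the heart of such a microcircuit: compare actual optic flow with the flow predicted from the motor command and rectify the difference. The gain is the quantity the model makes experience-dependent; the sign conventions and example values below are illustrative assumptions, not the paper's exact circuit:

```python
import numpy as np

def mismatch_units(visual_flow, motor_prediction, gain=1.0):
    """Two rectified comparisons between actual and self-predicted optic flow.

    halt_mismatch fires when self-motion predicts flow that does not arrive;
    external_mismatch fires when flow exceeds what self-motion predicts, i.e.
    when it is externally caused. The gain on the motor prediction is what a
    circuit like this would learn from visuomotor experience.
    """
    halt_mismatch = np.maximum(0.0, gain * motor_prediction - visual_flow)
    external_mismatch = np.maximum(0.0, visual_flow - gain * motor_prediction)
    return halt_mismatch, external_mismatch

running_speed = np.array([0.0, 1.0, 1.0, 2.0])
optic_flow    = np.array([0.0, 1.0, 0.0, 3.0])  # third entry: flow halted while running
print(mismatch_units(optic_flow, running_speed))
```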
Affiliation(s)
- Matthias Brucklacher
- Cognitive and Systems Neuroscience, University of Amsterdam, 1098XH Amsterdam, Netherlands.
- Giovanni Pezzulo
- Institute of Cognitive Sciences and Technologies, National Research Council, 00196 Rome, Italy
- Francesco Mannella
- Institute of Cognitive Sciences and Technologies, National Research Council, 00196 Rome, Italy
- Gaspare Galati
- Brain Imaging Laboratory, Department of Psychology, Sapienza University, 00185 Rome, Italy
- Cyriel M A Pennartz
- Cognitive and Systems Neuroscience, University of Amsterdam, 1098XH Amsterdam, Netherlands
7. Mao J, Rothkopf CA, Stocker AA. Adaptation optimizes sensory encoding for future stimuli. PLoS Comput Biol 2025; 21:e1012746. PMID: 39823517. PMCID: PMC11771873. DOI: 10.1371/journal.pcbi.1012746.
Abstract
Sensory neurons continually adapt their response characteristics according to recent stimulus history. However, it is unclear how such a reactive process can benefit the organism. Here, we test the hypothesis that adaptation actually acts proactively in the sense that it optimally adjusts sensory encoding for future stimuli. We first quantified human subjects' ability to discriminate visual orientation under different adaptation conditions. Using an information theoretic analysis, we found that adaptation leads to a reallocation of coding resources such that encoding accuracy peaks at the mean orientation of the adaptor while total coding capacity remains constant. We then asked whether this characteristic change in encoding accuracy is predicted by the temporal statistics of natural visual input. Analyzing the retinal input of freely behaving human subjects showed that the distribution of local visual orientations in the retinal input stream indeed peaks at the mean orientation of the preceding input history (i.e., the adaptor). We further tested our hypothesis by analyzing the internal sensory representations of a recurrent neural network trained to predict the next frame of natural scene videos (PredNet). Simulating our human adaptation experiment with PredNet, we found that the network exhibited the same change in encoding accuracy as observed in human subjects. Taken together, our results suggest that adaptation-induced changes in encoding accuracy prepare the visual system for future stimuli.
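One standard way to formalize this reallocation (an illustration using the usual efficient-coding constraint, not necessarily the paper's exact derivation) is to let the Fisher information $J(\theta)$ follow the recent stimulus distribution $p(\theta)$ under a fixed capacity budget:
$$\sqrt{J(\theta)} \propto p(\theta), \qquad \int \sqrt{J(\theta)}\,d\theta = C \;(\text{constant}),$$
so the discrimination threshold $\delta(\theta) \propto 1/\sqrt{J(\theta)}$ is smallest near the adaptor mean, where $p(\theta)$ peaks, while the total capacity $C$ stays unchanged.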
Affiliation(s)
- Jiang Mao
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Alan A Stocker
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
8. Vafaii H, Galor D, Yates JL. Poisson Variational Autoencoder. arXiv 2024; arXiv:2405.14473v2. Preprint. PMID: 39713798. PMCID: PMC11661288.
Abstract
Variational autoencoders (VAEs) employ Bayesian inference to interpret sensory inputs, mirroring processes that occur in primate vision across both ventral [1] and dorsal [2] pathways. Despite their success, traditional VAEs rely on continuous latent variables, a choice that deviates sharply from the discrete nature of biological neurons. Here, we developed the Poisson VAE (𝒫-VAE), a novel architecture that combines principles of predictive coding with a VAE that encodes inputs into discrete spike counts. Combining Poisson-distributed latent variables with predictive coding introduces a metabolic cost term in the model loss function, suggesting a relationship with sparse coding, which we verify empirically. Additionally, we analyze the geometry of learned representations, contrasting the 𝒫-VAE with alternative VAE models. We find that the 𝒫-VAE encodes its inputs in relatively higher dimensions, facilitating linear separability of categories in a downstream classification task with a much better (5×) sample efficiency. Our work provides an interpretable computational framework to study brain-like sensory processing and paves the way for a deeper understanding of perception as an inferential process.
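The metabolic cost term can be traced to the KL divergence between Poisson distributions; writing $\lambda$ for an inferred firing rate and $\lambda_0$ for its prior rate (symbols chosen here for illustration), the standard identity is
$$D_{\mathrm{KL}}\big(\mathrm{Pois}(\lambda)\,\|\,\mathrm{Pois}(\lambda_0)\big) = \lambda \log\frac{\lambda}{\lambda_0} + \lambda_0 - \lambda,$$
which grows with the posterior rate, so the ELBO's KL term effectively penalizes spike counts and links the objective to sparse, metabolically efficient codes.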
9. Nguyen TT, Bezdek MA, Gershman SJ, Bobick AF, Braver TS, Zacks JM. Modeling human activity comprehension at human scale: prediction, segmentation, and categorization. PNAS Nexus 2024; 3:pgae459. PMID: 39445050. PMCID: PMC11497596. DOI: 10.1093/pnasnexus/pgae459.
Abstract
Humans form sequences of event models (representations of the current situation) to predict how activity will unfold. Multiple mechanisms have been proposed for how the cognitive system determines when to segment the stream of behavior and switch from one active event model to another. Here, we constructed a computational model that learns knowledge about event classes (event schemas), by combining recurrent neural networks for short-term dynamics with Bayesian inference over event classes for event-to-event transitions. This architecture represents event schemas and uses them to construct a series of event models. This architecture was trained on one pass through 18 h of naturalistic human activities. Another 3.5 h of activities were used to test each variant for agreement with human segmentation and categorization. The architecture was able to learn to predict human activity, and it developed segmentation and categorization approaching human-like performance. We then compared two variants of this architecture designed to better emulate human event segmentation: one transitioned when the active event model produced high uncertainty in its prediction and the other transitioned when the active event model produced a large prediction error. The two variants learned to segment and categorize events, and the prediction uncertainty variant provided a somewhat closer match to human segmentation and categorization, despite being given no feedback about segmentation or categorization. These results suggest that event model transitioning based on prediction uncertainty or prediction error can reproduce two important features of human event comprehension.
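The two transition rules being compared reduce to a simple gating decision; a minimal sketch with hypothetical thresholds (the paper's models estimate these quantities online within a Bayesian architecture):

```python
import numpy as np

def should_switch(pred_mean, pred_std, observation,
                  error_thresh=2.0, uncertainty_thresh=1.5, rule="uncertainty"):
    """Decide whether to transition to a new event model.

    rule="error":       switch when the active model's prediction error is large.
    rule="uncertainty": switch when the active model is unsure of its own prediction.
    Thresholds here are illustrative assumptions.
    """
    if rule == "error":
        return np.linalg.norm(observation - pred_mean) > error_thresh
    return float(np.mean(pred_std)) > uncertainty_thresh
```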
Affiliation(s)
- Tan T Nguyen
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, MO 63130, USA
- Matthew A Bezdek
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, MO 63130, USA
- Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
- Aaron F Bobick
- Computer Science and Engineering, Washington University in St. Louis, St. Louis, MO 63130, USA
- Todd S Braver
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, MO 63130, USA
- Jeffrey M Zacks
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, MO 63130, USA
10. Ye Z, Wessel R, Franken TP. Brain-like border ownership signals support prediction of natural videos. bioRxiv 2024:2024.08.11.607040. Preprint. PMID: 39185218. PMCID: PMC11343161. DOI: 10.1101/2024.08.11.607040.
Abstract
To make sense of visual scenes, the brain must segment foreground from background. This is thought to be facilitated by neurons in the primate visual system that encode border ownership (BOS), i.e. whether a local border is part of an object on one or the other side of the border. It is unclear how these signals emerge in neural networks without a teaching signal of what is foreground and background. In this study, we investigated whether BOS signals exist in PredNet, a self-supervised artificial neural network trained to predict the next image frame of natural video sequences. We found that a significant number of units in PredNet are selective for BOS. Moreover these units share several other properties with the BOS neurons in the brain, including robustness to scene variations that constitute common object transformations in natural videos, and hysteresis of BOS signals. Finally, we performed ablation experiments and found that BOS units contribute more to prediction than non-BOS units for videos with moving objects. Our findings indicate that BOS units are especially useful to predict future input in natural videos, even when networks are not required to segment foreground from background. This suggests that BOS neurons in the brain might be the result of evolutionary or developmental pressure to predict future input in natural, complex dynamic visual environments.
Affiliation(s)
- Zeyuan Ye
- Department of Physics, Washington University in St. Louis, St. Louis, Missouri, USA
- Ralf Wessel
- Department of Physics, Washington University in St. Louis, St. Louis, Missouri, USA
- Tom P. Franken
- Department of Neuroscience, Washington University School of Medicine, St. Louis, Missouri, USA
11. Chen Y, Zhang H, Cameron M, Sejnowski T. Predictive sequence learning in the hippocampal formation. Neuron 2024; 112:2645-2658.e4. PMID: 38917804. DOI: 10.1016/j.neuron.2024.05.024.
Abstract
The hippocampus receives sequences of sensory inputs from the cortex during exploration and encodes the sequences with millisecond precision. We developed a predictive autoencoder model of the hippocampus including the trisynaptic and monosynaptic circuits from the entorhinal cortex (EC). CA3 was trained as a self-supervised recurrent neural network to predict its next input. We confirmed that CA3 is predicting ahead by analyzing the spike coupling between simultaneously recorded neurons in the dentate gyrus, CA3, and CA1 of the mouse hippocampus. In the model, CA1 neurons signal prediction errors by comparing CA3 predictions to the next direct EC input. The model exhibits the rapid appearance and slow fading of CA1 place cells and displays replay and phase precession from CA3. The model could be learned in a biologically plausible way with error-encoding neurons. Similarities between the hippocampal and thalamocortical circuits suggest that such a computational motif could also underlie self-supervised sequence learning in the cortex.
Affiliation(s)
- Yusi Chen
- Computational Neurobiology Laboratory, Salk Institute for Biological Sciences, La Jolla, CA 92037, USA; Department of Neurobiology, University of California, San Diego, La Jolla, CA 92093, USA; Computational Neuroscience Center, University of Washington, Seattle, WA 98195, USA.
- Huanqiu Zhang
- Computational Neurobiology Laboratory, Salk Institute for Biological Sciences, La Jolla, CA 92037, USA; Neurosciences Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- Mia Cameron
- Computational Neurobiology Laboratory, Salk Institute for Biological Sciences, La Jolla, CA 92037, USA; Department of Neurobiology, University of California, San Diego, La Jolla, CA 92093, USA
- Terrence Sejnowski
- Computational Neurobiology Laboratory, Salk Institute for Biological Sciences, La Jolla, CA 92037, USA; Department of Neurobiology, University of California, San Diego, La Jolla, CA 92093, USA.
12. Umar TP, Jain N, Papageorgakopoulou M, Shaheen RS, Alsamhori JF, Muzzamil M, Kostiks A. Artificial intelligence for screening and diagnosis of amyotrophic lateral sclerosis: a systematic review and meta-analysis. Amyotroph Lateral Scler Frontotemporal Degener 2024; 25:425-436. PMID: 38563056. DOI: 10.1080/21678421.2024.2334836.
Abstract
INTRODUCTION: Amyotrophic lateral sclerosis (ALS) is a rare and fatal neurological disease that leads to progressive motor function degeneration. Diagnosing ALS is challenging due to the absence of a specific detection test. The use of artificial intelligence (AI) can assist in the investigation and treatment of ALS. METHODS: We searched seven databases for literature on the application of AI in the early diagnosis and screening of ALS in humans. The findings were summarized using a random-effects summary receiver operating characteristic curve. The risk of bias (RoB) analysis was carried out using the QUADAS-2 or QUADAS-C tools. RESULTS: In the 34 analyzed studies, a meta-prevalence of 47% for ALS was noted. For ALS detection, the pooled sensitivity of AI models was 94.3% (95% CI 63.2% to 99.4%) with a pooled specificity of 98.9% (95% CI 92.4% to 99.9%). For ALS classification, the pooled sensitivity of AI models was 90.9% (95% CI 86.5% to 93.9%) with a pooled specificity of 92.3% (95% CI 84.8% to 96.3%). Based on the type of input for classification, the pooled sensitivity of AI models for gait, electromyography, and magnetic resonance signals was 91.2%, 92.6%, and 82.2%, respectively. The pooled specificity for gait, electromyography, and magnetic resonance signals was 94.1%, 96.5%, and 77.3%, respectively. CONCLUSIONS: Although AI can play a significant role in the screening and diagnosis of ALS due to its high sensitivities and specificities, concerns remain regarding the quality of evidence reported in the literature.
Affiliation(s)
- Tungki Pratama Umar
- Department of Medical Profession, Faculty of Medicine, Universitas Sriwijaya, Palembang, Indonesia
- Nityanand Jain
- Faculty of Medicine, Riga Stradinš University, Riga, Latvia
- Muhammad Muzzamil
- Department of Public Health, Health Services Academy, Islamabad, Pakistan
- Andrejs Kostiks
- Department of Neurology, Riga East University Clinical Hospital, Riga, Latvia
13. Straka Z, Svoboda T, Hoffmann M. PreCNet: Next-Frame Video Prediction Based on Predictive Coding. IEEE Trans Neural Netw Learn Syst 2024; 35:10353-10367. PMID: 37022810. DOI: 10.1109/tnnls.2023.3240857.
Abstract
Predictive coding, currently a highly influential theory in neuroscience, has not yet been widely adopted in machine learning. In this work, we transform the seminal model of Rao and Ballard (1999) into a modern deep learning framework while remaining maximally faithful to the original schema. The resulting network, PreCNet, is tested on a widely used next-frame video prediction benchmark consisting of images from an urban environment recorded with a car-mounted camera, and achieves state-of-the-art performance. Performance on all measures (MSE, PSNR, and SSIM) improved further when the network was trained on a larger training set (2M images from BDD100k), pointing to the limitations of the KITTI training set. This work demonstrates that an architecture carefully based on a neuroscience model, without being explicitly tailored to the task at hand, can exhibit exceptional performance.
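The Rao and Ballard (1999) scheme that PreCNet builds on alternates fast inference on the latent representation with slow learning of the generative weights, both descending the same prediction-error energy. A minimal single-level NumPy sketch with hypothetical dimensions (none of PreCNet's convolutional or recurrent machinery):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(64)              # input (e.g., an image patch)
U = rng.standard_normal((64, 16)) * 0.1  # generative weights: r -> predicted input
r = np.zeros(16)                         # latent representation

# Rao & Ballard-style energy: E = ||x - U r||^2 + alpha ||r||^2
alpha, lr_r, lr_U = 0.1, 0.05, 0.01
for _ in range(50):                      # fast inference loop: settle r for this input
    e = x - U @ r                        # prediction error
    r += lr_r * (U.T @ e - alpha * r)    # descend E with respect to r
U += lr_U * np.outer(x - U @ r, r)       # slow learning step: descend E with respect to U
```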
14. Lee K, Dora S, Mejias JF, Bohte SM, Pennartz CMA. Predictive coding with spiking neurons and feedforward gist signaling. Front Comput Neurosci 2024; 18:1338280. PMID: 38680678. PMCID: PMC11045951. DOI: 10.3389/fncom.2024.1338280.
Abstract
Predictive coding (PC) is an influential theory in neuroscience, which suggests the existence of a cortical architecture that is constantly generating and updating predictive representations of sensory inputs. Owing to its hierarchical and generative nature, PC has inspired many computational models of perception in the literature. However, the biological plausibility of existing models has not been sufficiently explored due to their use of artificial neurons that approximate neural activity with firing rates in the continuous time domain and propagate signals synchronously. Therefore, we developed a spiking neural network for predictive coding (SNN-PC), in which neurons communicate using event-driven and asynchronous spikes. Adopting the hierarchical structure and Hebbian learning algorithms from previous PC neural network models, SNN-PC introduces two novel features: (1) a fast feedforward sweep from the input to higher areas, which generates a spatially reduced and abstract representation of input (i.e., a neural code for the gist of a scene) and provides a neurobiological alternative to an arbitrary choice of priors; and (2) a separation of positive and negative error-computing neurons, which counters the biological implausibility of a bi-directional error neuron with a very high baseline firing rate. After training with the MNIST handwritten digit dataset, SNN-PC developed hierarchical internal representations and was able to reconstruct samples it had not seen during training. SNN-PC suggests biologically plausible mechanisms by which the brain may perform perceptual inference and learning in an unsupervised manner. In addition, it may be used in neuromorphic applications that can utilize its energy-efficient, event-driven, local learning, and parallel information processing nature.
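The separation into positive and negative error populations amounts to rectifying the signed prediction error twice; a minimal rate-based sketch (the spiking dynamics and the feedforward gist pathway are beyond this snippet):

```python
import numpy as np

def split_errors(actual, predicted):
    """Represent a signed prediction error with two non-negative populations."""
    e_pos = np.maximum(0.0, actual - predicted)   # input exceeds the prediction
    e_neg = np.maximum(0.0, predicted - actual)   # prediction exceeds the input
    return e_pos, e_neg                           # signed error = e_pos - e_neg
```

Keeping both populations non-negative avoids the biologically implausible bidirectional error neuron with a high baseline firing rate that the abstract refers to.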
Affiliation(s)
- Kwangjun Lee
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, Amsterdam, Netherlands
- Shirin Dora
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, Amsterdam, Netherlands
- Department of Computer Science, School of Science, Loughborough University, Loughborough, United Kingdom
- Jorge F. Mejias
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, Amsterdam, Netherlands
- Sander M. Bohte
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, Amsterdam, Netherlands
- Machine Learning Group, Centre of Mathematics and Computer Science, Amsterdam, Netherlands
- Cyriel M. A. Pennartz
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, Amsterdam, Netherlands
15. Mograbi DC, Hall S, Arantes B, Huntley J. The cognitive neuroscience of self-awareness: Current framework, clinical implications, and future research directions. Wiley Interdiscip Rev Cogn Sci 2024; 15:e1670. PMID: 38043919. DOI: 10.1002/wcs.1670.
Abstract
Self-awareness, the ability to take oneself as the object of awareness, has been an enigma for our species, with different answers to this question being provided by religion, philosophy, and, more recently, science. The current review aims to discuss the neurocognitive mechanisms underlying self-awareness. The multidimensional nature of self-awareness will be explored, suggesting how it can be thought of as an emergent property observed in different cognitive complexity levels, within a predictive coding approach. A presentation of alterations of self-awareness in neuropsychiatric conditions will ground a discussion on alternative frameworks to understand this phenomenon, in health and psychopathology, with future research directions being indicated to fill current gaps in the literature. This article is categorized under: Philosophy > Consciousness; Psychology > Brain Function and Dysfunction; Neuroscience > Cognition.
Affiliation(s)
- Daniel C Mograbi
- Department of Psychology, Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, Brazil
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
- Simon Hall
- Camden and Islington NHS Foundation Trust, London, UK
- Beatriz Arantes
- Department of Psychology, Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, Brazil
- Jonathan Huntley
- Division of Psychiatry, University College London, London, UK
- Faculty of Health and Life Sciences, University of Exeter, Exeter, UK
16. Jiang LP, Rao RPN. Dynamic predictive coding: A model of hierarchical sequence learning and prediction in the neocortex. PLoS Comput Biol 2024; 20:e1011801. PMID: 38330098. PMCID: PMC10880975. DOI: 10.1371/journal.pcbi.1011801.
Abstract
We introduce dynamic predictive coding, a hierarchical model of spatiotemporal prediction and sequence learning in the neocortex. The model assumes that higher cortical levels modulate the temporal dynamics of lower levels, correcting their predictions of dynamics using prediction errors. As a result, lower levels form representations that encode sequences at shorter timescales (e.g., a single step) while higher levels form representations that encode sequences at longer timescales (e.g., an entire sequence). We tested this model using a two-level neural network, where the top-down modulation creates low-dimensional combinations of a set of learned temporal dynamics to explain input sequences. When trained on natural videos, the lower-level model neurons developed space-time receptive fields similar to those of simple cells in the primary visual cortex while the higher-level responses spanned longer timescales, mimicking temporal response hierarchies in the cortex. Additionally, the network's hierarchical sequence representation exhibited both predictive and postdictive effects resembling those observed in visual motion processing in humans (e.g., in the flash-lag illusion). When coupled with an associative memory emulating the role of the hippocampus, the model allowed episodic memories to be stored and retrieved, supporting cue-triggered recall of an input sequence similar to activity recall in the visual cortex. When extended to three hierarchical levels, the model learned progressively more abstract temporal representations along the hierarchy. Taken together, our results suggest that cortical processing and learning of sequences can be interpreted as dynamic predictive coding based on a hierarchical spatiotemporal generative model of the visual world.
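The core modulation idea, with the higher level selecting a low-dimensional mixture of learned lower-level transition operators, can be written compactly (notation chosen here for illustration):
$$r^{(1)}_{t+1} \approx \Big(\sum_{k=1}^{K} w_k V_k\Big)\, r^{(1)}_t, \qquad w = g\big(r^{(2)}\big),$$
where the $V_k$ are learned temporal dynamics at the lower level and the mixing weights $w$ are generated top-down from the higher-level state $r^{(2)}$; prediction errors on the lower-level sequence drive updates of both $w$ and the $V_k$.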
Affiliation(s)
- Linxing Preston Jiang
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, Washington, United States of America
- Center for Neurotechnology, University of Washington, Seattle, Washington, United States of America
- Computational Neuroscience Center, University of Washington, Seattle, Washington, United States of America
- Rajesh P. N. Rao
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, Washington, United States of America
- Center for Neurotechnology, University of Washington, Seattle, Washington, United States of America
- Computational Neuroscience Center, University of Washington, Seattle, Washington, United States of America
17. Turner W, Sexton C, Hogendoorn H. Neural mechanisms of visual motion extrapolation. Neurosci Biobehav Rev 2024; 156:105484. PMID: 38036162. DOI: 10.1016/j.neubiorev.2023.105484.
Abstract
Because neural processing takes time, the brain only has delayed access to sensory information. When localising moving objects this is problematic, as an object will have moved on by the time its position has been determined. Here, we consider predictive motion extrapolation as a fundamental delay-compensation strategy. From a population-coding perspective, we outline how extrapolation can be achieved by a forwards shift in the population-level activity distribution. We identify general mechanisms underlying such shifts, involving various asymmetries which facilitate the targeted 'enhancement' and/or 'dampening' of population-level activity. We classify these on the basis of their potential implementation (intra- vs inter-regional processes) and consider specific examples in different visual regions. We consider how motion extrapolation can be achieved during inter-regional signaling, and how asymmetric connectivity patterns which support extrapolation can emerge spontaneously from local synaptic learning rules. Finally, we consider how more abstract 'model-based' predictive strategies might be implemented. Overall, we present an integrative framework for understanding how the brain determines the real-time position of moving objects, despite neural delays.
Affiliation(s)
- William Turner
- Queensland University of Technology, Brisbane 4059, Australia; The University of Melbourne, Melbourne 3010, Australia.
- Hinze Hogendoorn
- Queensland University of Technology, Brisbane 4059, Australia; The University of Melbourne, Melbourne 3010, Australia.
18. Cheng FL, Horikawa T, Majima K, Tanaka M, Abdelhack M, Aoki SC, Hirano J, Kamitani Y. Reconstructing visual illusory experiences from human brain activity. Sci Adv 2023; 9:eadj3906. PMID: 37967184. PMCID: PMC10651116. DOI: 10.1126/sciadv.adj3906.
Abstract
Visual illusions provide valuable insights into the brain's interpretation of the world given sensory inputs. However, the precise manner in which brain activity translates into illusory experiences remains largely unknown. Here, we leverage a brain decoding technique combined with deep neural network (DNN) representations to reconstruct illusory percepts as images from brain activity. The reconstruction model was trained on natural images to establish a link between brain activity and perceptual features and then tested on two types of illusions: illusory lines and neon color spreading. Reconstructions revealed lines and colors consistent with illusory experiences, which varied across the source visual cortical areas. This framework offers a way to materialize subjective experiences, shedding light on the brain's internal representations of the world.
Affiliation(s)
- Fan L. Cheng
- Graduate School of Informatics, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan
- ATR Computational Neuroscience Laboratories, Soraku, Kyoto 619-0288, Japan
- Tomoyasu Horikawa
- ATR Computational Neuroscience Laboratories, Soraku, Kyoto 619-0288, Japan
- Kei Majima
- Graduate School of Informatics, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan
- Misato Tanaka
- Graduate School of Informatics, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan
- Mohamed Abdelhack
- Graduate School of Informatics, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan
- Shuntaro C. Aoki
- Graduate School of Informatics, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan
- Jin Hirano
- Graduate School of Informatics, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan
- Yukiyasu Kamitani
- Graduate School of Informatics, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan
- ATR Computational Neuroscience Laboratories, Soraku, Kyoto 619-0288, Japan
19. Chapman GW, Hasselmo ME. Predictive learning by a burst-dependent learning rule. Neurobiol Learn Mem 2023; 205:107826. PMID: 37696414. DOI: 10.1016/j.nlm.2023.107826.
Abstract
Humans and other animals are able to quickly generalize latent dynamics of spatiotemporal sequences, often from a minimal number of previous experiences. Additionally, internal representations of external stimuli must remain stable, even in the presence of sensory noise, in order to be useful for informing behavior. In contrast, typical machine learning approaches require many thousands of samples, and generalize poorly to unexperienced examples, or fail completely to predict at long timescales. Here, we propose a novel neural network module which incorporates hierarchy and recurrent feedback terms, constituting a simplified model of neocortical microcircuits. This microcircuit predicts spatiotemporal trajectories at the input layer using a temporal error minimization algorithm. We show that this module is able to predict with higher accuracy into the future compared to traditional models. Investigating this model we find that successive predictive models learn representations which are increasingly removed from the raw sensory space, namely as successive temporal derivatives of the positional information. Next, we introduce a spiking neural network model which implements the rate-model through the use of a recently proposed biological learning rule utilizing dual-compartment neurons. We show that this network performs well on the same tasks as the mean-field models, by developing intrinsic dynamics that follow the dynamics of the external stimulus, while coordinating transmission of higher-order dynamics. Taken as a whole, these findings suggest that hierarchical temporal abstraction of sequences, rather than feed-forward reconstruction, may be responsible for the ability of neural systems to quickly adapt to novel situations.
Affiliation(s)
- G William Chapman
- Center for Systems Neuroscience, Boston University, Boston, MA, USA.
20. Singer Y, Taylor L, Willmore BDB, King AJ, Harper NS. Hierarchical temporal prediction captures motion processing along the visual pathway. eLife 2023; 12:e52599. PMID: 37844199. PMCID: PMC10629830. DOI: 10.7554/elife.52599.
Abstract
Visual neurons respond selectively to features that become increasingly complex from the eyes to the cortex. Retinal neurons prefer flashing spots of light, primary visual cortical (V1) neurons prefer moving bars, and those in higher cortical areas favor complex features like moving textures. Previously, we showed that V1 simple cell tuning can be accounted for by a basic model implementing temporal prediction - representing features that predict future sensory input from past input (Singer et al., 2018). Here, we show that hierarchical application of temporal prediction can capture how tuning properties change across at least two levels of the visual system. This suggests that the brain does not efficiently represent all incoming information; instead, it selectively represents sensory inputs that help in predicting the future. When applied hierarchically, temporal prediction extracts time-varying features that depend on increasingly high-level statistics of the sensory input.
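The temporal-prediction objective itself is compact: learn a mapping from a short window of past input to the next frame. A one-level ridge-regression sketch on a toy signal (the paper stacks such models hierarchically and uses richer, learned features):

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, k = 2000, 20, 5
x = rng.standard_normal((T, d))                    # toy sensory time series

# Build (past window -> next frame) pairs: predict x[t+1] from x[t-k+1..t].
past = np.stack([x[t - k + 1:t + 1].ravel() for t in range(k - 1, T - 1)])
future = x[k:T]

# Ridge-regularized least squares: W maps the past window to the next frame.
lam = 1.0
W = np.linalg.solve(past.T @ past + lam * np.eye(past.shape[1]), past.T @ future)
mse = np.mean((future - past @ W) ** 2)
print(f"next-frame MSE from a {k}-frame past window: {mse:.3f}")
```

Features that survive this objective are, by construction, those that carry information about the future rather than all incoming information.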
Affiliation(s)
- Yosef Singer
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Luke Taylor
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Ben DB Willmore
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Andrew J King
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Nicol S Harper
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
21. Lee S, Park Y, Liu P, Kim M, Kim HU, Kim T. Artificial-Neural-Network-Driven Innovations in Time-Varying Process Diagnosis of Low-K Oxide Deposition. Sensors (Basel) 2023; 23:8226. PMID: 37837056. PMCID: PMC10575315. DOI: 10.3390/s23198226.
Abstract
To address the challenges in real-time process diagnosis within the semiconductor manufacturing industry, this paper presents a novel machine learning approach for analyzing the time-varying 10th harmonics during the deposition of low-k oxide (SiOF) on a 600 Å undoped silicate glass thin liner using a high-density plasma chemical vapor deposition system. The 10th harmonics, which are high-frequency components 10 times the fundamental frequency, are generated in the plasma sheath because of their nonlinear nature. An artificial neural network with a three-hidden-layer architecture was applied and optimized using k-fold cross-validation to analyze the harmonics generated in the plasma sheath during the deposition process. The model exhibited a binary cross-entropy loss of 0.1277 and achieved an accuracy of 0.9461. This approach enables the accurate prediction of process performance, resulting in significant cost reduction and enhancement of semiconductor manufacturing processes. This model has the potential to improve defect control and yield, thereby benefiting the semiconductor industry. Despite the limitations imposed by the limited dataset, the model demonstrated promising results, and further performance improvements are anticipated with the inclusion of additional data in future studies.
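The described setup (three hidden layers, k-fold cross-validation, binary cross-entropy) maps onto a routine scikit-learn sketch. The harmonic features below are synthetic placeholders and the layer sizes are assumptions, not the authors' exact architecture:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder for time-varying 10th-harmonic features labeled pass/fail.
X, y = make_classification(n_samples=600, n_features=40, n_informative=10, random_state=0)

clf = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(64, 32, 16),  # three hidden layers (sizes assumed)
                  max_iter=1000, random_state=0),
)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
print("accuracy per fold:", cross_val_score(clf, X, y, cv=cv))
print("log loss per fold:", -cross_val_score(clf, X, y, cv=cv, scoring="neg_log_loss"))
```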
Affiliation(s)
- Seunghwan Lee
- School of Mechanical Engineering, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Yonggyun Park
- School of Mechanical Engineering, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Pengzhan Liu
- School of Mechanical Engineering, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Muyoung Kim
- Department of Plasma Engineering, Korea Institute of Machinery and Materials (KIMM), Daejeon 34103, Republic of Korea
- Hyeong-U Kim
- Department of Plasma Engineering, Korea Institute of Machinery and Materials (KIMM), Daejeon 34103, Republic of Korea
- Taesung Kim
- School of Mechanical Engineering, Sungkyunkwan University, Suwon 16419, Republic of Korea
- SKKU Advanced Institute of Nanotechnology (SAINT), Sungkyunkwan University, Suwon 16419, Republic of Korea
22. Brucklacher M, Bohté SM, Mejias JF, Pennartz CMA. Local minimization of prediction errors drives learning of invariant object representations in a generative network model of visual perception. Front Comput Neurosci 2023; 17:1207361. PMID: 37818157. PMCID: PMC10561268. DOI: 10.3389/fncom.2023.1207361.
Abstract
The ventral visual processing hierarchy of the cortex needs to fulfill at least two key functions: perceived objects must be mapped to high-level representations invariantly of the precise viewing conditions, and a generative model must be learned that allows, for instance, to fill in occluded information guided by visual experience. Here, we show how a multilayered predictive coding network can learn to recognize objects from the bottom up and to generate specific representations via a top-down pathway through a single learning rule: the local minimization of prediction errors. Trained on sequences of continuously transformed objects, neurons in the highest network area become tuned to object identity invariant of precise position, comparable to inferotemporal neurons in macaques. Drawing on this, the dynamic properties of invariant object representations reproduce experimentally observed hierarchies of timescales from low to high levels of the ventral processing stream. The predicted faster decorrelation of error-neuron activity compared to representation neurons is of relevance for the experimental search for neural correlates of prediction errors. Lastly, the generative capacity of the network is confirmed by reconstructing specific object images, robust to partial occlusion of the inputs. By learning invariance from temporal continuity within a generative model, the approach generalizes the predictive coding framework to dynamic inputs in a more biologically plausible way than self-supervised networks with non-local error-backpropagation. This was achieved simply by shifting the training paradigm to dynamic inputs, with little change in architecture and learning rule from static input-reconstructing Hebbian predictive coding networks.
Affiliation(s)
- Matthias Brucklacher
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, Netherlands
- Sander M. Bohté
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, Netherlands
- Machine Learning Group, Centrum Wiskunde & Informatica, Amsterdam, Netherlands
- Jorge F. Mejias
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, Netherlands
- Cyriel M. A. Pennartz
- Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, Netherlands
23. Doerig A, Sommers RP, Seeliger K, Richards B, Ismael J, Lindsay GW, Kording KP, Konkle T, van Gerven MAJ, Kriegeskorte N, Kietzmann TC. The neuroconnectionist research programme. Nat Rev Neurosci 2023. PMID: 37253949. DOI: 10.1038/s41583-023-00705-w.
Abstract
Artificial neural networks (ANNs) inspired by biology are beginning to be widely used to model behavioural and neural data, an approach we call 'neuroconnectionism'. ANNs have been not only lauded as the current best models of information processing in the brain but also criticized for failing to account for basic cognitive functions. In this Perspective article, we propose that arguing about the successes and failures of a restricted set of current ANNs is the wrong approach to assess the promise of neuroconnectionism for brain science. Instead, we take inspiration from the philosophy of science, and in particular from Lakatos, who showed that the core of a scientific research programme is often not directly falsifiable but should be assessed by its capacity to generate novel insights. Following this view, we present neuroconnectionism as a general research programme centred around ANNs as a computational language for expressing falsifiable theories about brain computation. We describe the core of the programme, the underlying computational framework and its tools for testing specific neuroscientific hypotheses and deriving novel understanding. Taking a longitudinal view, we review past and present neuroconnectionist projects and their responses to challenges and argue that the research programme is highly progressive, generating new and otherwise unreachable insights into the workings of the brain.
Collapse
Affiliation(s)
- Adrien Doerig
- Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany.
- Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands.
| | - Rowan P Sommers
- Department of Neurobiology of Language, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
| | - Katja Seeliger
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Blake Richards
- Department of Neurology and Neurosurgery, McGill University, Montréal, QC, Canada
- School of Computer Science, McGill University, Montréal, QC, Canada
- Mila, Montréal, QC, Canada
- Montréal Neurological Institute, Montréal, QC, Canada
- Learning in Machines and Brains Program, CIFAR, Toronto, ON, Canada
| | | | | | - Konrad P Kording
- Learning in Machines and Brains Program, CIFAR, Toronto, ON, Canada
- Bioengineering, Neuroscience, University of Pennsylvania, Pennsylvania, PA, USA
| | | | | | | | - Tim C Kietzmann
- Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany
| |
Collapse
|
24
|
Aceituno PV, Farinha MT, Loidl R, Grewe BF. Learning cortical hierarchies with temporal Hebbian updates. Front Comput Neurosci 2023; 17:1136010. [PMID: 37293353 PMCID: PMC10244748 DOI: 10.3389/fncom.2023.1136010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/02/2023] [Accepted: 04/25/2023] [Indexed: 06/10/2023] Open
Abstract
A key driver of mammalian intelligence is the ability to represent incoming sensory information across multiple abstraction levels. For example, in the visual ventral stream, incoming signals are first represented as low-level edge filters and then transformed into high-level object representations. Similar hierarchical structures routinely emerge in artificial neural networks (ANNs) trained for object recognition tasks, suggesting that similar structures may underlie biological neural networks. However, the classical ANN training algorithm, backpropagation, is considered biologically implausible, and thus alternative biologically plausible training methods have been developed such as Equilibrium Propagation, Deep Feedback Control, Supervised Predictive Coding, and Dendritic Error Backpropagation. Several of those models propose that local errors are calculated for each neuron by comparing apical and somatic activities. Notwithstanding, from a neuroscience perspective, it is not clear how a neuron could compare compartmental signals. Here, we propose a solution to this problem in that we let the apical feedback signal change the postsynaptic firing rate and combine this with a differential Hebbian update, a rate-based version of classical spiking time-dependent plasticity (STDP). We prove that weight updates of this form minimize two alternative loss functions that we prove to be equivalent to the error-based losses used in machine learning: the inference latency and the amount of top-down feedback necessary. Moreover, we show that the use of differential Hebbian updates works similarly well in other feedback-based deep learning frameworks such as Predictive Coding or Equilibrium Propagation. Finally, our work removes a key requirement of biologically plausible models for deep learning and proposes a learning mechanism that would explain how temporal Hebbian learning rules can implement supervised hierarchical learning.
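A rough sketch of the differential Hebbian rule described in the abstract, in which the weight change is proportional to the presynaptic rate times the temporal derivative of the postsynaptic rate, with the apical feedback entering as a nudge to the postsynaptic firing rate; the single-neuron setup, teaching signal, and constants are assumptions, not the paper's full model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Differential Hebbian update: dw ~ pre * d(post)/dt, a rate-based analogue
# of spike-timing-dependent plasticity. The single-neuron setup below is an
# illustrative assumption.
n_pre = 10
w = rng.normal(scale=0.1, size=n_pre)

def step(pre, post_prev, apical_feedback, lr=0.01, dt=1.0):
    """One update: apical feedback nudges the postsynaptic rate, and the
    resulting rate change drives the differential Hebbian weight update."""
    post = float(w @ pre) + apical_feedback       # feedback changes the firing rate
    d_post = (post - post_prev) / dt              # temporal derivative of the rate
    w_new = w + lr * pre * d_post                 # differential Hebbian rule
    return post, w_new

post = 0.0
for t in range(500):
    pre = rng.random(n_pre)
    target = pre.sum() * 0.5                      # arbitrary teaching signal
    feedback = 0.1 * (target - float(w @ pre))    # top-down nudge toward the target
    post, w = step(pre, post, feedback)

print("learned weights:", np.round(w, 2))
```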
Collapse
Affiliation(s)
- Pau Vilimelis Aceituno
- Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
- ETH AI Center, ETH Zurich, Zurich, Switzerland
| | | | - Reinhard Loidl
- Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
| | - Benjamin F. Grewe
- Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
- ETH AI Center, ETH Zurich, Zurich, Switzerland
| |
Collapse
|
25
|
Zhao R, Hu Z, Wang X, Tao P, Wang Y, Liu T, Wei Y, Xu H, He X. Process optimization of line patterns in extreme ultraviolet lithography using machine learning and a simulated annealing algorithm. APPLIED OPTICS 2023; 62:2892-2898. [PMID: 37133133 DOI: 10.1364/ao.485006] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
Resolution, line edge/width roughness, and sensitivity (RLS) are critical indicators for evaluating the imaging performance of resists. As the technology node gradually shrinks, stricter indicator control is required for high-resolution imaging. However, current research can improve only some of the RLS indicators for line patterns, and it remains difficult to improve the overall imaging performance of resists in extreme ultraviolet lithography. Here, we report a lithographic process optimization system for line patterns in which RLS models are first established using a machine learning method and then optimized with a simulated annealing algorithm. Finally, the combination of process parameters with the optimal imaging quality for line patterns is obtained. This system can control resist RLS indicators and exhibits high optimization accuracy, reducing process optimization time and cost and accelerating the development of the lithography process.
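The two-stage procedure (surrogate models of the indicators, then simulated annealing over process parameters) can be sketched as follows; the random forest regressors, synthetic data, and cost weights are assumptions standing in for the paper's actual models and measurements.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)

# Stage 1: fit surrogate models mapping process parameters (e.g. dose, focus,
# bake temperature) to RLS-style indicators. The data are synthetic and the
# choice of random forests is an assumption for illustration.
X = rng.uniform(0, 1, size=(300, 3))
y_res = 20 + 10 * (X[:, 0] - 0.6) ** 2 + rng.normal(0, 0.2, 300)   # "resolution"
y_lwr = 3 + 5 * (X[:, 1] - 0.4) ** 2 + rng.normal(0, 0.1, 300)     # "roughness"
y_sen = 30 - 15 * X[:, 2] + rng.normal(0, 0.3, 300)                # "sensitivity"
models = [RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
          for y in (y_res, y_lwr, y_sen)]

def cost(params):
    p = np.asarray(params).reshape(1, -1)
    # Weighted sum of predicted indicators; the weights are arbitrary assumptions.
    return sum(w * m.predict(p)[0] for w, m in zip((1.0, 2.0, 0.5), models))

# Stage 2: simulated annealing over the process parameters.
p = rng.uniform(0, 1, 3)
c = cost(p)
T = 1.0
for step in range(500):
    cand = np.clip(p + rng.normal(0, 0.05, 3), 0, 1)
    c_cand = cost(cand)
    if c_cand < c or rng.random() < np.exp((c - c_cand) / T):
        p, c = cand, c_cand
    T *= 0.995                                    # geometric cooling schedule

print("best parameters:", np.round(p, 3), "predicted cost:", round(c, 2))
```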
Collapse
|
26
|
Lindeberg T. A time-causal and time-recursive scale-covariant scale-space representation of temporal signals and past time. BIOLOGICAL CYBERNETICS 2023; 117:21-59. [PMID: 36689001 PMCID: PMC10160219 DOI: 10.1007/s00422-022-00953-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/18/2022] [Accepted: 11/21/2022] [Indexed: 05/05/2023]
Abstract
This article presents an overview of a theory for performing temporal smoothing on temporal signals in such a way that: (i) temporally smoothed signals at coarser temporal scales are guaranteed to constitute simplifications of corresponding temporally smoothed signals at any finer temporal scale (including the original signal) and (ii) the temporal smoothing process is both time-causal and time-recursive, in the sense that it does not require access to future information and can be performed with no other temporal memory buffer of the past than the resulting smoothed temporal scale-space representations themselves. For specific subsets of parameter settings for the classes of linear and shift-invariant temporal smoothing operators that obey this property, it is shown how temporal scale covariance can be additionally obtained, guaranteeing that if the temporal input signal is rescaled by a uniform temporal scaling factor, then also the resulting temporal scale-space representations of the rescaled temporal signal will constitute mere rescalings of the temporal scale-space representations of the original input signal, complemented by a shift along the temporal scale dimension. The resulting time-causal limit kernel that obeys this property constitutes a canonical temporal kernel for processing temporal signals in real-time scenarios when the regular Gaussian kernel cannot be used, because of its non-causal access to information from the future, and we cannot additionally require the temporal smoothing process to comprise a complementary memory of the past beyond the information contained in the temporal smoothing process itself, which in this way also serves as a multi-scale temporal memory of the past. We describe how the time-causal limit kernel relates to previously used temporal models, such as Koenderink's scale-time kernels and the ex-Gaussian kernel. We do also give an overview of how the time-causal limit kernel can be used for modelling the temporal processing in models for spatio-temporal and spectro-temporal receptive fields, and how it more generally has a high potential for modelling neural temporal response functions in a purely time-causal and time-recursive way, that can also handle phenomena at multiple temporal scales in a theoretically well-founded manner. We detail how this theory can be efficiently implemented for discrete data, in terms of a set of recursive filters coupled in cascade. Hence, the theory is generally applicable for both: (i) modelling continuous temporal phenomena over multiple temporal scales and (ii) digital processing of measured temporal signals in real time. We conclude by stating implications of the theory for modelling temporal phenomena in biological, perceptual, neural and memory processes by mathematical models, as well as implications regarding the philosophy of time and perceptual agents. Specifically, we propose that for A-type theories of time, as well as for perceptual agents, the notion of a non-infinitesimal inner temporal scale of the temporal receptive fields has to be included in representations of the present, where the inherent nonzero temporal delay of such time-causal receptive fields implies a need for incorporating predictions from the actual time-delayed present in the layers of a perceptual hierarchy, to make it possible for a representation of the perceptual present to constitute a representation of the environment with timing properties closer to the actual present.
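The discrete implementation mentioned at the end, a set of time-causal and time-recursive filters coupled in cascade, can be illustrated with first-order recursive smoothing stages; the time constants below are arbitrary, and this generic sketch does not reproduce Lindeberg's exact parameterization of the time-causal limit kernel.

```python
import numpy as np

def recursive_smoothing_cascade(signal, mus):
    """Cascade of first-order recursive filters, each time-causal and
    time-recursive: y[t] = y[t-1] + (x[t] - y[t-1]) / (1 + mu).
    Returns the smoothed signal after every stage (a multi-scale stack)."""
    stages = []
    x = np.asarray(signal, dtype=float)
    for mu in mus:
        y = np.empty_like(x)
        y[0] = x[0]
        for t in range(1, len(x)):
            y[t] = y[t - 1] + (x[t] - y[t - 1]) / (1.0 + mu)
        stages.append(y)
        x = y                      # the next stage smooths the previous output
    return np.stack(stages)

# Example: a noisy step signal smoothed at increasingly coarse temporal scales.
rng = np.random.default_rng(3)
sig = np.concatenate([np.zeros(50), np.ones(50)]) + 0.2 * rng.normal(size=100)
scales = recursive_smoothing_cascade(sig, mus=[1.0, 2.0, 4.0, 8.0])  # illustrative time constants
print(scales.shape)  # (4, 100): one row per temporal scale level
```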
Collapse
Affiliation(s)
- Tony Lindeberg
- Computational Brain Science Lab, Division of Computational Science and Technology, KTH Royal Institute of Technology, 100 44, Stockholm, Sweden.
| |
Collapse
|
27
|
Fan J, Zeng Y. Challenging deep learning models with image distortion based on the abutting grating illusion. PATTERNS (NEW YORK, N.Y.) 2023; 4:100695. [PMID: 36960449 PMCID: PMC10028432 DOI: 10.1016/j.patter.2023.100695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/05/2022] [Revised: 11/07/2022] [Accepted: 02/01/2023] [Indexed: 03/06/2023]
Abstract
Even state-of-the-art deep learning models lack fundamental abilities compared with humans. While many image distortions have been proposed to compare deep learning with humans, they depend on mathematical transformations instead of human cognitive functions. Here, we propose an image distortion based on the abutting grating illusion, which is a phenomenon discovered in humans and animals. The distortion generates illusory contour perception using line gratings abutting each other. We applied the method to MNIST, high-resolution MNIST, and "16-class-ImageNet" silhouettes. Many models, including models trained from scratch and 109 models pretrained with ImageNet or various data augmentation techniques, were tested. Our results show that abutting grating distortion is challenging even for state-of-the-art deep learning models. We discovered that DeepAugment models outperformed other pretrained models. Visualization of early layers indicates that better-performing models exhibit the endstopping property, which is consistent with neuroscience discoveries. Twenty-four human subjects classified distorted samples to validate the distortion.
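One plausible way to generate an abutting grating rendering of a binary silhouette, under the assumption that the contour is conveyed purely by a half-period phase offset between line gratings inside and outside the shape; the grating period and line thickness are arbitrary choices, and this is not the authors' published generation code.

```python
import numpy as np

def abutting_grating(silhouette, period=8, thickness=2):
    """Render a binary silhouette as an abutting grating: horizontal line
    gratings whose phase flips by half a period inside the shape, so the
    contour is defined only by the offset between abutting line endings."""
    h, w = silhouette.shape
    rows = np.arange(h)[:, None]
    outside = (rows % period) < thickness                 # grating at phase 0
    inside = ((rows + period // 2) % period) < thickness  # phase shifted by half a period
    return np.where(silhouette > 0, inside, outside).astype(np.uint8)

# Example: a filled square silhouette on a blank canvas.
sil = np.zeros((64, 64), dtype=np.uint8)
sil[16:48, 16:48] = 1
img = abutting_grating(sil)
print(img.shape, img.dtype, img.max())
```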
Collapse
Affiliation(s)
- Jinyu Fan
- Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
| | - Yi Zeng
- Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
- National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
- School of Future Technology, University of Chinese Academy of Sciences, Beijing 100049, China
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
- Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, China
- Corresponding author
| |
Collapse
|
28
|
Kirubeswaran OR, Storrs KR. Inconsistent illusory motion in predictive coding deep neural networks. Vision Res 2023; 206:108195. [PMID: 36801664 DOI: 10.1016/j.visres.2023.108195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2022] [Revised: 01/31/2023] [Accepted: 01/31/2023] [Indexed: 02/19/2023]
Abstract
Why do we perceive illusory motion in some static images? Several accounts point to eye movements, response latencies to different image elements, or interactions between image patterns and motion energy detectors. Recently PredNet, a recurrent deep neural network (DNN) based on predictive coding principles, was reported to reproduce the "Rotating Snakes" illusion, suggesting a role for predictive coding. We begin by replicating this finding, then use a series of "in silico" psychophysics and electrophysiology experiments to examine whether PredNet behaves consistently with human observers and non-human primate neural data. A pretrained PredNet predicted illusory motion for all subcomponents of the Rotating Snakes pattern, consistent with human observers. However, we found no simple response delays in internal units, unlike evidence from electrophysiological data. PredNet's detection of motion in gradients seemed to depend on contrast, whereas in humans it depends predominantly on luminance. Finally, we examined the robustness of the illusion across ten PredNets of identical architecture, retrained on the same video data. Network instances varied widely in whether they reproduced the Rotating Snakes illusion and in what motion, if any, they predicted for simplified variants. Unlike human observers, no network predicted motion for greyscale variants of the Rotating Snakes pattern. Our results sound a cautionary note: even when a DNN successfully reproduces some idiosyncrasy of human vision, more detailed investigation can reveal inconsistencies between humans and the network, and between different instances of the same network. These inconsistencies suggest that predictive coding does not reliably give rise to human-like illusory motion.
Collapse
Affiliation(s)
| | - Katherine R Storrs
- Department of Experimental Psychology, Justus Liebig University Giessen, Germany; Centre for Mind, Brain and Behaviour (CMBB), University of Marburg and Justus Liebig University Giessen, Germany; School of Psychology, University of Auckland, New Zealand
| |
Collapse
|
29
|
Zhao R, Wei Y, Xu H, He X. Process optimization of contact hole patterns via a simulated annealing algorithm in extreme ultraviolet lithography. APPLIED OPTICS 2023; 62:927-932. [PMID: 36821146 DOI: 10.1364/ao.479619] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/28/2022] [Accepted: 12/30/2022] [Indexed: 06/18/2023]
Abstract
The critical dimension (CD), roughness, and sensitivity are extremely significant indicators for evaluating the imaging performance of photoresists in extreme ultraviolet lithography. As the CD gradually shrinks, tighter indicator control is required for high fidelity imaging. However, current research primarily focuses on the optimization of one indicator of one-dimensional line patterns, and little attention has been paid to two-dimensional patterns. Here, we report an image quality optimization method of two-dimensional contact holes. This method takes horizontal and vertical contact widths, contact edge roughness, and sensitivity as evaluation indicators, and uses machine learning to establish the corresponding relationship between process parameters and each indicator. Then, the simulated annealing algorithm is applied to search for the optimal process parameters, and finally, a set of process parameters with optimum image quality is obtained. Rigorous imaging results of lithography demonstrate that this method has very high optimization accuracy and can improve the overall performance of the device, dramatically accelerating the development of the lithography process.
Collapse
|
30
|
Bracci S, Op de Beeck HP. Understanding Human Object Vision: A Picture Is Worth a Thousand Representations. Annu Rev Psychol 2023; 74:113-135. [PMID: 36378917 DOI: 10.1146/annurev-psych-032720-041031] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Objects are the core meaningful elements in our visual environment. Classic theories of object vision focus upon object recognition and are elegant and simple. Some of their proposals still stand, yet the simplicity is gone. Recent evolutions in behavioral paradigms, neuroscientific methods, and computational modeling have allowed vision scientists to uncover the complexity of the multidimensional representational space that underlies object vision. We review these findings and propose that the key to understanding this complexity is to relate object vision to the full repertoire of behavioral goals that underlie human behavior, running far beyond object recognition. There might be no such thing as core object recognition, and if it exists, then its importance is more limited than traditionally thought.
Collapse
Affiliation(s)
- Stefania Bracci
- Center for Mind/Brain Sciences, University of Trento, Rovereto, Italy;
| | - Hans P Op de Beeck
- Leuven Brain Institute, Research Unit Brain & Cognition, KU Leuven, Leuven, Belgium;
| |
Collapse
|
31
|
de Lange FP, Schmitt LM, Heilbron M. Reconstructing the predictive architecture of the mind and brain. Trends Cogn Sci 2022; 26:1018-1019. [PMID: 36150970 DOI: 10.1016/j.tics.2022.08.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2022] [Revised: 06/18/2022] [Accepted: 08/15/2022] [Indexed: 01/12/2023]
Abstract
Predictive processing has become an influential framework in cognitive neuroscience. However, it often lacks specificity and direct empirical support. How can we probe the nature and limits of the predictive brain? We highlight the potential of recent advances in artificial intelligence (AI) for providing a richer and more computationally explicit test of this theory of cortical function.
Collapse
Affiliation(s)
- Floris P de Lange
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Kapittelweg 29, 6525 EN Nijmegen, The Netherlands.
| | - Lea-Maria Schmitt
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Kapittelweg 29, 6525 EN Nijmegen, The Netherlands
| | - Micha Heilbron
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Kapittelweg 29, 6525 EN Nijmegen, The Netherlands
| |
Collapse
|
32
|
Bowers JS, Malhotra G, Dujmović M, Llera Montero M, Tsvetkov C, Biscione V, Puebla G, Adolfi F, Hummel JE, Heaton RF, Evans BD, Mitchell J, Blything R. Deep problems with neural network models of human vision. Behav Brain Sci 2022; 46:e385. [PMID: 36453586 DOI: 10.1017/s0140525x22002813] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/03/2022]
Abstract
Deep neural networks (DNNs) have had extraordinary successes in classifying photographic images of objects and are often described as the best models of biological vision. This conclusion is largely based on three sets of findings: (1) DNNs are more accurate than any other model in classifying images taken from various datasets, (2) DNNs do the best job in predicting the pattern of human errors in classifying objects taken from various behavioral datasets, and (3) DNNs do the best job in predicting brain signals in response to images taken from various brain datasets (e.g., single cell responses or fMRI data). However, these behavioral and brain datasets do not test hypotheses regarding what features are contributing to good predictions and we show that the predictions may be mediated by DNNs that share little overlap with biological vision. More problematically, we show that DNNs account for almost no results from psychological research. This contradicts the common claim that DNNs are good, let alone the best, models of human object recognition. We argue that theorists interested in developing biologically plausible models of human vision need to direct their attention to explaining psychological findings. More generally, theorists need to build models that explain the results of experiments that manipulate independent variables designed to test hypotheses rather than compete on making the best predictions. We conclude by briefly summarizing various promising modeling approaches that focus on psychological data.
Collapse
Affiliation(s)
- Jeffrey S Bowers
- School of Psychological Science, University of Bristol, Bristol, UK ; https://jeffbowers.blogs.bristol.ac.uk/
| | - Gaurav Malhotra
- School of Psychological Science, University of Bristol, Bristol, UK ; https://jeffbowers.blogs.bristol.ac.uk/
| | - Marin Dujmović
- School of Psychological Science, University of Bristol, Bristol, UK ; https://jeffbowers.blogs.bristol.ac.uk/
| | - Milton Llera Montero
- School of Psychological Science, University of Bristol, Bristol, UK ; https://jeffbowers.blogs.bristol.ac.uk/
| | - Christian Tsvetkov
- School of Psychological Science, University of Bristol, Bristol, UK ; https://jeffbowers.blogs.bristol.ac.uk/
| | - Valerio Biscione
- School of Psychological Science, University of Bristol, Bristol, UK ; https://jeffbowers.blogs.bristol.ac.uk/
| | - Guillermo Puebla
- School of Psychological Science, University of Bristol, Bristol, UK ; https://jeffbowers.blogs.bristol.ac.uk/
| | - Federico Adolfi
- School of Psychological Science, University of Bristol, Bristol, UK ; https://jeffbowers.blogs.bristol.ac.uk/
- Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Frankfurt am Main, Germany
| | - John E Hummel
- Department of Psychology, University of Illinois Urbana-Champaign, Champaign, IL, USA
| | - Rachel F Heaton
- Department of Psychology, University of Illinois Urbana-Champaign, Champaign, IL, USA
| | - Benjamin D Evans
- Department of Informatics, School of Engineering and Informatics, University of Sussex, Brighton, UK
| | - Jeffrey Mitchell
- Department of Informatics, School of Engineering and Informatics, University of Sussex, Brighton, UK
| | - Ryan Blything
- School of Psychology, Aston University, Birmingham, UK
| |
Collapse
|
33
|
Ali A, Ahmad N, de Groot E, Johannes van Gerven MA, Kietzmann TC. Predictive coding is a consequence of energy efficiency in recurrent neural networks. PATTERNS (NEW YORK, N.Y.) 2022; 3:100639. [PMID: 36569556 PMCID: PMC9768680 DOI: 10.1016/j.patter.2022.100639] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/25/2021] [Revised: 12/24/2021] [Accepted: 10/27/2022] [Indexed: 11/24/2022]
Abstract
Predictive coding is a promising framework for understanding brain function. It postulates that the brain continuously inhibits predictable sensory input, ensuring preferential processing of surprising elements. A central aspect of this view is its hierarchical connectivity, involving recurrent message passing between excitatory bottom-up signals and inhibitory top-down feedback. Here we use computational modeling to demonstrate that such architectural hardwiring is not necessary. Rather, predictive coding is shown to emerge as a consequence of energy efficiency. When training recurrent neural networks to minimize their energy consumption while operating in predictive environments, the networks self-organize into prediction and error units with appropriate inhibitory and excitatory interconnections and learn to inhibit predictable sensory input. Moving beyond the view of purely top-down-driven predictions, we demonstrate, via virtual lesioning experiments, that networks perform predictions on two timescales: fast lateral predictions among sensory units and slower prediction cycles that integrate evidence over time.
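The core training setup, a recurrent network solving a prediction task while its unit activity (a proxy for energy consumption) is penalized, can be sketched as below; the architecture, penalty weight, and sinusoidal toy data are assumptions rather than the paper's exact configuration.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)

# Minimal sketch: a small RNN trained to predict its next input, with an L1
# penalty on hidden-unit activity standing in for energy consumption.
rnn = nn.RNN(input_size=1, hidden_size=32, batch_first=True)
readout = nn.Linear(32, 1)
opt = torch.optim.Adam(list(rnn.parameters()) + list(readout.parameters()), lr=1e-3)

def batch(n=64, T=50):
    """Predictable sinusoidal sequences with random phases (toy data)."""
    phase = 2 * math.pi * torch.rand(n, 1, 1)
    t = torch.arange(T, dtype=torch.float32).view(1, T, 1)
    return torch.sin(0.3 * t + phase)

for step in range(500):
    x = batch()
    h, _ = rnn(x[:, :-1])                      # hidden states over time
    pred = readout(h)                          # next-input prediction
    task_loss = ((pred - x[:, 1:]) ** 2).mean()
    energy = h.abs().mean()                    # activity penalty ("energy" proxy)
    loss = task_loss + 0.1 * energy
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"prediction loss {task_loss.item():.4f}, mean activity {energy.item():.4f}")
```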
Collapse
Affiliation(s)
- Abdullahi Ali
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
- Corresponding author
| | - Nasir Ahmad
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
| | - Elgar de Groot
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
- Department of Experimental Psychology, Utrecht University, Utrecht, the Netherlands
| | | | - Tim Christian Kietzmann
- Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany
- Corresponding author
| |
Collapse
|
34
|
Neural representational geometry underlies few-shot concept learning. Proc Natl Acad Sci U S A 2022; 119:e2200800119. [PMID: 36251997 DOI: 10.1073/pnas.2200800119] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Understanding the neural basis of the remarkable human cognitive capacity to learn novel concepts from just one or a few sensory experiences constitutes a fundamental problem. We propose a simple, biologically plausible, mathematically tractable, and computationally powerful neural mechanism for few-shot learning of naturalistic concepts. We posit that the concepts that can be learned from few examples are defined by tightly circumscribed manifolds in the neural firing-rate space of higher-order sensory areas. We further posit that a single plastic downstream readout neuron learns to discriminate new concepts based on few examples using a simple plasticity rule. We demonstrate the computational power of our proposal by showing that it can achieve high few-shot learning accuracy on natural visual concepts using both macaque inferotemporal cortex representations and deep neural network (DNN) models of these representations and can even learn novel visual concepts specified only through linguistic descriptors. Moreover, we develop a mathematical theory of few-shot learning that links neurophysiology to predictions about behavioral outcomes by delineating several fundamental and measurable geometric properties of neural representations that can accurately predict the few-shot learning performance of naturalistic concepts across all our numerical simulations. This theory reveals, for instance, that high-dimensional manifolds enhance the ability to learn new concepts from few examples. Intriguingly, we observe striking mismatches between the geometry of manifolds in the primate visual pathway and in trained DNNs. We discuss testable predictions of our theory for psychophysics and neurophysiological experiments.
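The proposed mechanism, a single plastic readout on top of a fixed high-dimensional representation, reduces in its simplest form to a prototype classifier learned from a handful of examples; in this sketch the "neural" features are synthetic random expansions standing in for IT-cortex or DNN representations, which is an assumption for illustration only.

```python
import numpy as np

rng = np.random.default_rng(4)

# Few-shot concept learning with a single plastic readout on top of a fixed
# high-dimensional representation. The features are synthetic class manifolds
# (Gaussian blobs after a random expansion).
d_low, d_high, k_shot, n_test = 20, 500, 5, 200
expand = rng.normal(size=(d_low, d_high)) / np.sqrt(d_low)

def sample(concept_center, n):
    return (concept_center + 0.3 * rng.normal(size=(n, d_low))) @ expand

center_a, center_b = rng.normal(size=d_low), rng.normal(size=d_low)
train_a, train_b = sample(center_a, k_shot), sample(center_b, k_shot)

# "Plasticity rule": the readout weight is simply the difference of the
# few-shot prototypes (a prototype / nearest-centroid classifier).
w = train_a.mean(axis=0) - train_b.mean(axis=0)
b = -0.5 * (train_a.mean(axis=0) + train_b.mean(axis=0)) @ w

test = np.vstack([sample(center_a, n_test), sample(center_b, n_test)])
labels = np.array([1] * n_test + [0] * n_test)
pred = (test @ w + b > 0).astype(int)
print("few-shot accuracy:", (pred == labels).mean())
```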
Collapse
|
35
|
Analysis and Dynamic Monitoring of Indoor Air Quality Based on Laser-Induced Breakdown Spectroscopy and Machine Learning. CHEMOSENSORS 2022. [DOI: 10.3390/chemosensors10070259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/27/2023]
Abstract
Indoor air quality influences human health, so detecting the quality of indoor air is particularly important. However, traditional detection methods mainly depend on chemical analysis, which has long been criticized for its high time cost. In this research, a rapid detection method for the indoor air environment using laser-induced breakdown spectroscopy (LIBS) and machine learning was proposed. Four common scenes that often lead to changes in indoor air quality were simulated: burning carbon, burning incense, spraying perfume, and taking a hot shower. The experiment consisted of two steps, spectral measurement and algorithmic analysis. The proposed method proved effective in distinguishing different kinds of aerosols and was sensitive to air composition. Outliers (singular values) in the signal were filtered out with an isolation forest. The spectra of the different scenarios were then analyzed via principal component analysis (PCA), and the air environment was classified with a K-nearest neighbor (KNN) algorithm, reaching an accuracy of 99.2%. In addition, to establish a high-precision quantitative detection model, a back-propagation (BP) neural network was introduced to improve the robustness and accuracy of indoor environment monitoring. The results show that this method enables dynamic prediction of element concentrations, with a recognition accuracy of 96.5%.
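The classification stage described in the abstract (outlier removal, PCA, then KNN) can be sketched with scikit-learn; the synthetic "spectra" below merely stand in for real LIBS measurements, and the isolation-forest settings are an assumption.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(5)

# Synthetic stand-in for LIBS spectra from four indoor-air scenarios:
# each class has a characteristic set of emission-line peaks plus noise.
n_per_class, n_channels = 100, 300
X, y = [], []
for c in range(4):
    peaks = rng.choice(n_channels, size=5, replace=False)
    base = np.zeros(n_channels)
    base[peaks] = rng.uniform(1, 3, size=5)
    X.append(base + 0.1 * rng.normal(size=(n_per_class, n_channels)))
    y += [c] * n_per_class
X, y = np.vstack(X), np.array(y)

# Outlier removal (schematic), then PCA + KNN as in the abstract.
mask = IsolationForest(random_state=0).fit_predict(X) == 1
X, y = X[mask], y[mask]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
pca = PCA(n_components=10).fit(X_tr)
knn = KNeighborsClassifier(n_neighbors=5).fit(pca.transform(X_tr), y_tr)
print("classification accuracy:", knn.score(pca.transform(X_te), y_te))
```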
Collapse
|
36
|
Zhang G, Zhang X, Rong H, Paul P, Zhu M, Neri F, Ong YS. A Layered Spiking Neural System for Classification Problems. Int J Neural Syst 2022; 32:2250023. [PMID: 35416762 DOI: 10.1142/s012906572250023x] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
Biological brains have a natural capacity for resolving certain classification tasks. Studies on biologically plausible spiking neurons, architectures and mechanisms of artificial neural systems that closely match biological observations while giving high classification performance are gaining momentum. Spiking neural P systems (SN P systems) are a class of membrane computing models and third-generation neural networks that are based on the behavior of biological neural cells and have been used in various engineering applications. Furthermore, SN P systems are characterized by a highly flexible structure that enables the design of a machine learning algorithm by mimicking the structure and behavior of biological cells without the over-simplification present in neural networks. Based on this aspect, this paper proposes a novel type of SN P system, namely, layered SN P system (LSN P system), to solve classification problems by supervised learning. The proposed LSN P system consists of a multi-layer network containing multiple weighted fuzzy SN P systems with adaptive weight adjustment rules. The proposed system employs specific ascending dimension techniques and a selection method of output neurons for classification problems. The experimental results obtained using benchmark datasets from the UCI machine learning repository and MNIST dataset demonstrated the feasibility and effectiveness of the proposed LSN P system. More importantly, the proposed LSN P system presents the first SN P system that demonstrates sufficient performance for use in addressing real-world classification problems.
Collapse
Affiliation(s)
- Gexiang Zhang
- School of Control Engineering, Chengdu University of Information Technology, Chengdu 610225, P. R. China
| | - Xihai Zhang
- School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072, P. R. China
| | - Haina Rong
- School of Electrical Engineering, Southwest Jiaotong University, Chengdu 610031, P. R. China
| | - Prithwineel Paul
- School of Control Engineering, Chengdu University of Information Technology, Chengdu 610225, P. R. China
| | - Ming Zhu
- School of Control Engineering, Chengdu University of Information Technology, Chengdu 610225, P. R. China
| | - Ferrante Neri
- NICE Group, Department of Computer Science, University of Surrey, UK
| | - Yew-Soon Ong
- School of Computer Science and Engineering, Nanyang Technological University, Singapore
| |
Collapse
|
37
|
Kobayashi T, Kitaoka A, Kosaka M, Tanaka K, Watanabe E. Motion illusion-like patterns extracted from photo and art images using predictive deep neural networks. Sci Rep 2022; 12:3893. [PMID: 35273206 PMCID: PMC8913633 DOI: 10.1038/s41598-022-07438-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2021] [Accepted: 02/18/2022] [Indexed: 11/09/2022] Open
Abstract
In our previous study, we successfully reproduced the illusory motion perceived in the rotating snakes illusion using deep neural networks incorporating predictive coding theory. In the present study, we further examined the properties of the network using a set of 1500 images, including ordinary static images of paintings and photographs and images of various types of motion illusions. Results showed that the networks clearly separated the group of illusory images from the others and reproduced illusory motion for various types of illusions, similar to human perception. Notably, the networks occasionally detected anomalous motion vectors even in ordinary static images in which humans were unable to perceive any illusory motion. Additionally, illusion-like designs with repeating patterns were generated using areas where anomalous vectors were detected, and psychophysical experiments confirmed illusory motion perception in the generated designs. The observed inaccuracy of the networks will provide useful information for further understanding information processing associated with human vision.
Collapse
Affiliation(s)
- Taisuke Kobayashi
- Laboratory of Neurophysiology, National Institute for Basic Biology, Higashiyama 5-1, Myodaiji-cho, Okazaki, Aichi, 444-8787, Japan.
| | - Akiyoshi Kitaoka
- College of Comprehensive Psychology, Ritsumeikan University, Iwakura-cho 2-150, Ibaraki, Osaka, 567-8570, Japan
| | - Manabu Kosaka
- Code_monsters group, Laboratory of Neurophysiology, National Institute for Basic Biology, Higashiyama 5-1, Myodaiji-cho, Okazaki, Aichi, 444-8787, Japan
| | - Kenta Tanaka
- Code_monsters group, Laboratory of Neurophysiology, National Institute for Basic Biology, Higashiyama 5-1, Myodaiji-cho, Okazaki, Aichi, 444-8787, Japan
| | - Eiji Watanabe
- Laboratory of Neurophysiology, National Institute for Basic Biology, Higashiyama 5-1, Myodaiji-cho, Okazaki, Aichi, 444-8787, Japan
- Department of Basic Biology, The Graduate University for Advanced Studies (SOKENDAI), Miura, Kanagawa, 240-0193, Japan
| |
Collapse
|
38
|
Alipour A, Beggs JM, Brown JW, James TW. A computational examination of the two-streams hypothesis: which pathway needs a longer memory? Cogn Neurodyn 2022; 16:149-165. [PMID: 35126775 PMCID: PMC8807798 DOI: 10.1007/s11571-021-09703-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2020] [Revised: 06/26/2021] [Accepted: 07/14/2021] [Indexed: 02/03/2023] Open
Abstract
The two visual streams hypothesis is a robust example of neural functional specialization that has inspired countless studies over the past four decades. According to one prominent version of the theory, the fundamental goal of the dorsal visual pathway is the transformation of retinal information for visually-guided motor behavior. To that end, the dorsal stream processes input using absolute (or veridical) metrics only when the movement is initiated, necessitating very little, or no, memory. Conversely, because the ventral visual pathway does not involve motor behavior (its output does not influence the real world), the ventral stream processes input using relative (or illusory) metrics and can accumulate or integrate sensory evidence over long time constants, which provides a substantial capacity for memory. In this study, we tested these relations between functional specialization, processing metrics, and memory by training identical recurrent neural networks to perform either a viewpoint-invariant object classification task or an orientation/size determination task. The former task relies on relative metrics, benefits from accumulating sensory evidence, and is usually attributed to the ventral stream. The latter task relies on absolute metrics, can be computed accurately in the moment, and is usually attributed to the dorsal stream. To quantify the amount of memory required for each task, we chose two types of neural network models. Using a long short-term memory (LSTM) recurrent network, we found that viewpoint-invariant object categorization (object task) required a longer memory than orientation/size determination (orientation task). Additionally, to dissect this memory effect, we considered factors that contributed to longer memory in object tasks. First, we used two different sets of objects, one with self-occlusion of features and one without. Second, we defined object classes either strictly by visual feature similarity or (more liberally) by semantic label. The models required greater memory when features were self-occluded and when object classes were defined by visual feature similarity, showing that self-occlusion and visual similarity among object-task samples contribute to the need for a longer memory. The same set of tasks modeled using modified leaky-integrator echo state recurrent networks (LiESN), however, did not replicate the results, except under some conditions. This may be because LiESNs cannot perform fine-grained memory adjustments due to their network-wide memory coefficient and fixed recurrent weights. In sum, the LSTM simulations suggest that longer memory is advantageous for performing viewpoint-invariant object classification (a putative ventral stream function) because it allows for interpolation of features across viewpoints. The results further suggest that orientation/size determination (a putative dorsal stream function) does not benefit from longer memory. These findings are consistent with the two visual streams theory of functional specialization. SUPPLEMENTARY INFORMATION The online version contains supplementary material available at 10.1007/s11571-021-09703-z.
Collapse
Affiliation(s)
- Abolfazl Alipour
- Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN USA
- Program in Neuroscience, Indiana University, Bloomington, IN USA
| | - John M Beggs
- Program in Neuroscience, Indiana University, Bloomington, IN USA
- Department of Physics, Indiana University, Bloomington, IN USA
| | - Joshua W Brown
- Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN USA
- Program in Neuroscience, Indiana University, Bloomington, IN USA
| | - Thomas W James
- Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN USA
- Program in Neuroscience, Indiana University, Bloomington, IN USA
| |
Collapse
|
39
|
Konkle T, Alvarez GA. A self-supervised domain-general learning framework for human ventral stream representation. Nat Commun 2022; 13:491. [PMID: 35078981 PMCID: PMC8789817 DOI: 10.1038/s41467-022-28091-4] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2021] [Accepted: 12/13/2021] [Indexed: 12/25/2022] Open
Abstract
Anterior regions of the ventral visual stream encode substantial information about object categories. Are top-down category-level forces critical for arriving at this representation, or can this representation be formed purely through domain-general learning of natural image structure? Here we present a fully self-supervised model which learns to represent individual images, rather than categories, such that views of the same image are embedded nearby in a low-dimensional feature space, distinctly from other recently encountered views. We find that category information implicitly emerges in the local similarity structure of this feature space. Further, these models learn hierarchical features which capture the structure of brain responses across the human ventral visual stream, on par with category-supervised models. These results provide computational support for a domain-general framework guiding the formation of visual representation, where the proximate goal is not explicitly about category information, but is instead to learn unique, compressed descriptions of the visual world.
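The proximate objective described here, embedding views of the same image nearby and away from other recently encountered images, is closely related to instance-level contrastive losses; a minimal NT-Xent-style sketch follows, with a tiny encoder and noise-based "augmentations" as stand-in assumptions rather than the authors' actual model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Instance-level contrastive objective: two "views" of the same image should be
# embedded nearby and away from every other image in the batch (NT-Xent loss).
encoder = nn.Sequential(nn.Flatten(),
                        nn.Linear(32 * 32, 128), nn.ReLU(),
                        nn.Linear(128, 64))
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

def augment(x):
    return x + 0.1 * torch.randn_like(x)       # stand-in for real augmentations

def nt_xent(z1, z2, tau=0.1):
    n = z1.shape[0]
    z = F.normalize(torch.cat([z1, z2]), dim=1)        # 2N unit-norm embeddings
    sim = z @ z.t() / tau                              # pairwise similarities
    sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool), float("-inf"))
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)               # each view's positive is its partner

for step in range(200):
    images = torch.rand(32, 1, 32, 32)                 # placeholder image batch
    loss = nt_xent(encoder(augment(images)), encoder(augment(images)))
    opt.zero_grad()
    loss.backward()
    opt.step()

print("final contrastive loss:", round(loss.item(), 3))
```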
Collapse
Affiliation(s)
- Talia Konkle
- Department of Psychology & Center for Brain Science, Harvard University, Cambridge, MA, USA.
| | - George A Alvarez
- Department of Psychology & Center for Brain Science, Harvard University, Cambridge, MA, USA.
| |
Collapse
|
40
|
Sun ED, Dekel R. ImageNet-trained deep neural networks exhibit illusion-like response to the Scintillating grid. J Vis 2021; 21:15. [PMID: 34677575 PMCID: PMC8543405 DOI: 10.1167/jov.21.11.15] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Abstract
Deep neural network (DNN) models for computer vision are capable of human-level object recognition. Consequently, similarities between DNN and human vision are of interest. Here, we characterize DNN representations of Scintillating grid visual illusion images in which white disks are perceived to be partially black. Specifically, we use VGG-19 and ResNet-101 DNN models that were trained for image classification and consider the representational dissimilarity (L1 distance in the penultimate layer) between pairs of images: one with white Scintillating grid disks and the other with disks of decreasing luminance levels. Results showed a nonmonotonic relation, such that decreasing disk luminance led to an increase and subsequently a decrease in representational dissimilarity. That is, the Scintillating grid image with white disks was closer, in terms of the representation, to images with black disks than images with gray disks. In control nonillusion images, such nonmonotonicity was rare. These results suggest that nonmonotonicity in a deep computational representation is a potential test for illusion-like response geometry in DNN models.
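The dissimilarity measure used in the study can be sketched with a torchvision VGG-19: take the activations of the layer just before the final classification layer and compute the L1 distance between two stimuli. The random tensors below are placeholders for actual Scintillating grid images, and the code assumes a recent torchvision with downloadable pretrained weights.

```python
import torch
import torchvision.models as models

# Representational dissimilarity as described in the abstract: L1 distance
# between penultimate-layer activations of an ImageNet-trained VGG-19.
vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).eval()

def penultimate(x):
    """Activations of the layer just before the final classification layer."""
    with torch.no_grad():
        feats = vgg.avgpool(vgg.features(x)).flatten(1)
        return vgg.classifier[:-1](feats)         # 4096-d penultimate representation

img_white_disks = torch.rand(1, 3, 224, 224)      # placeholder stimulus
img_gray_disks = torch.rand(1, 3, 224, 224)       # placeholder comparison stimulus
d = (penultimate(img_white_disks) - penultimate(img_gray_disks)).abs().sum()
print("L1 representational dissimilarity:", float(d))
```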
Collapse
Affiliation(s)
- Eric D Sun
- Mather House, Harvard University, Cambridge, MA, USA
| | - Ron Dekel
- Department of Neurobiology, Weizmann Institute of Science, Rehovot, Israel
| |
Collapse
|
41
|
Storrs KR, Fleming RW. Learning About the World by Learning About Images. CURRENT DIRECTIONS IN PSYCHOLOGICAL SCIENCE 2021. [DOI: 10.1177/0963721421990334] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
One of the deepest insights in neuroscience is that sensory encoding should take advantage of statistical regularities. Humans’ visual experience contains many redundancies: Scenes mostly stay the same from moment to moment, and nearby image locations usually have similar colors. A visual system that knows which regularities shape natural images can exploit them to encode scenes compactly or guess what will happen next. Although these principles have been appreciated for more than 60 years, until recently it has been possible to convert them into explicit models only for the earliest stages of visual processing. But recent advances in unsupervised deep learning have changed that. Neural networks can be taught to compress images or make predictions in space or time. In the process, they learn the statistical regularities that structure images, which in turn often reflect physical objects and processes in the outside world. The astonishing accomplishments of unsupervised deep learning reaffirm the importance of learning statistical regularities for sensory coding and provide a coherent framework for how knowledge of the outside world gets into visual cortex.
Collapse
Affiliation(s)
| | - Roland W. Fleming
- Department of Experimental Psychology, Justus Liebig University Giessen
- Centre for Mind, Brain and Behaviour (CMBB), University of Marburg and Justus Liebig University Giessen
| |
Collapse
|
42
|
Zhuang C, Yan S, Nayebi A, Schrimpf M, Frank MC, DiCarlo JJ, Yamins DLK. Unsupervised neural network models of the ventral visual stream. Proc Natl Acad Sci U S A 2021; 118:e2014196118. [PMID: 33431673 PMCID: PMC7826371 DOI: 10.1073/pnas.2014196118] [Citation(s) in RCA: 131] [Impact Index Per Article: 32.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023] Open
Abstract
Deep neural networks currently provide the best quantitative models of the response patterns of neurons throughout the primate ventral visual stream. However, such networks have remained implausible as a model of the development of the ventral stream, in part because they are trained with supervised methods requiring many more labels than are accessible to infants during development. Here, we report that recent rapid progress in unsupervised learning has largely closed this gap. We find that neural network models learned with deep unsupervised contrastive embedding methods achieve neural prediction accuracy in multiple ventral visual cortical areas that equals or exceeds that of models derived using today's best supervised methods and that the mapping of these neural network models' hidden layers is neuroanatomically consistent across the ventral stream. Strikingly, we find that these methods produce brain-like representations even when trained solely with real human child developmental data collected from head-mounted cameras, despite the fact that these datasets are noisy and limited. We also find that semisupervised deep contrastive embeddings can leverage small numbers of labeled examples to produce representations with substantially improved error-pattern consistency to human behavior. Taken together, these results illustrate a use of unsupervised learning to provide a quantitative model of a multiarea cortical brain system and present a strong candidate for a biologically plausible computational theory of primate sensory learning.
Collapse
Affiliation(s)
- Chengxu Zhuang
- Department of Psychology, Stanford University, Stanford, CA 94305;
| | - Siming Yan
- Department of Computer Science, The University of Texas at Austin, Austin, TX 78712
| | - Aran Nayebi
- Neurosciences PhD Program, Stanford University, Stanford, CA 94305
| | - Martin Schrimpf
- Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Michael C Frank
- Department of Psychology, Stanford University, Stanford, CA 94305
| | - James J DiCarlo
- Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Daniel L K Yamins
- Department of Psychology, Stanford University, Stanford, CA 94305
- Department of Computer Science, Stanford University, Stanford, CA 94305
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA 94305
| |
Collapse
|
43
|
Khan SR, Al Rijjal D, Piro A, Wheeler MB. Integration of AI and traditional medicine in drug discovery. Drug Discov Today 2021; 26:982-992. [PMID: 33476566 DOI: 10.1016/j.drudis.2021.01.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2020] [Revised: 12/01/2020] [Accepted: 01/11/2021] [Indexed: 11/24/2022]
Abstract
AI integration in plant-based traditional medicine could be used to overcome drug discovery challenges.
Collapse
Affiliation(s)
- Saifur R Khan
- Endocrine and Diabetes Platform, Department of Physiology, University of Toronto, Medical Sciences Building, Room 3352, 1 King's College Circle, Toronto, ON M5S 1A8, Canada; Advanced Diagnostics, Metabolism, Toronto General Hospital Research Institute, Toronto, ON, Canada.
| | - Dana Al Rijjal
- Endocrine and Diabetes Platform, Department of Physiology, University of Toronto, Medical Sciences Building, Room 3352, 1 King's College Circle, Toronto, ON M5S 1A8, Canada; Advanced Diagnostics, Metabolism, Toronto General Hospital Research Institute, Toronto, ON, Canada
| | - Anthony Piro
- Endocrine and Diabetes Platform, Department of Physiology, University of Toronto, Medical Sciences Building, Room 3352, 1 King's College Circle, Toronto, ON M5S 1A8, Canada; Advanced Diagnostics, Metabolism, Toronto General Hospital Research Institute, Toronto, ON, Canada
| | - Michael B Wheeler
- Endocrine and Diabetes Platform, Department of Physiology, University of Toronto, Medical Sciences Building, Room 3352, 1 King's College Circle, Toronto, ON M5S 1A8, Canada; Advanced Diagnostics, Metabolism, Toronto General Hospital Research Institute, Toronto, ON, Canada
| |
Collapse
|
44
|
Chen X, Zhou M, Gong Z, Xu W, Liu X, Huang T, Zhen Z, Liu J. DNNBrain: A Unifying Toolbox for Mapping Deep Neural Networks and Brains. Front Comput Neurosci 2020; 14:580632. [PMID: 33328946 PMCID: PMC7734148 DOI: 10.3389/fncom.2020.580632] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2020] [Accepted: 10/27/2020] [Indexed: 01/24/2023] Open
Abstract
Deep neural networks (DNNs) have attained human-level performance on dozens of challenging tasks via an end-to-end deep learning strategy. Deep learning allows data representations that have multiple levels of abstraction; however, it does not explicitly provide any insights into the internal operations of DNNs. Deep learning's success is appealing to neuroscientists not only as a method for applying DNNs to model biological neural systems but also as a means of adopting concepts and methods from cognitive neuroscience to understand the internal representations of DNNs. Although general deep learning frameworks, such as PyTorch and TensorFlow, could be used to allow such cross-disciplinary investigations, the use of these frameworks typically requires high-level programming expertise and comprehensive mathematical knowledge. A toolbox specifically designed as a mechanism for cognitive neuroscientists to map both DNNs and brains is urgently needed. Here, we present DNNBrain, a Python-based toolbox designed for exploring the internal representations of DNNs as well as brains. Through the integration of DNN software packages and well-established brain imaging tools, DNNBrain provides application programming and command line interfaces for a variety of research scenarios. These include extracting DNN activation, probing and visualizing DNN representations, and mapping DNN representations onto the brain. We expect that our toolbox will accelerate scientific research by both applying DNNs to model biological neural systems and utilizing paradigms of cognitive neuroscience to unveil the black box of DNNs.
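DNNBrain's own API is not reproduced here; the following generic PyTorch sketch merely illustrates the kind of operation the abstract mentions (extracting DNN activations from chosen layers) using forward hooks, with the network and layer choices as arbitrary assumptions.

```python
import torch
import torchvision.models as models

# Generic activation extraction with forward hooks -- not DNNBrain's own API,
# just an illustration of the kind of operation the toolbox wraps.
net = models.alexnet(weights=None).eval()
activations = {}

def save_to(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

# Register hooks on a couple of layers of interest (chosen arbitrarily here).
net.features[4].register_forward_hook(save_to("conv2_relu"))
net.classifier[4].register_forward_hook(save_to("fc2"))

with torch.no_grad():
    net(torch.rand(8, 3, 224, 224))               # a batch of placeholder stimuli

for name, act in activations.items():
    print(name, tuple(act.shape))
```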
Collapse
Affiliation(s)
- Xiayu Chen
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
| | - Ming Zhou
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
| | - Zhengxin Gong
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
| | - Wei Xu
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
| | - Xingyu Liu
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
| | - Taicheng Huang
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
| | - Zonglei Zhen
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
| | - Jia Liu
- Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, China
| |
Collapse
|
45
|
Vinken K, Boix X, Kreiman G. Incorporating intrinsic suppression in deep neural networks captures dynamics of adaptation in neurophysiology and perception. SCIENCE ADVANCES 2020; 6:eabd4205. [PMID: 33055170 PMCID: PMC7556832 DOI: 10.1126/sciadv.abd4205] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/21/2020] [Accepted: 08/26/2020] [Indexed: 06/11/2023]
Abstract
Adaptation is a fundamental property of sensory systems that can change subjective experiences in the context of recent information. Adaptation has been postulated to arise from recurrent circuit mechanisms or as a consequence of neuronally intrinsic suppression. However, it is unclear whether intrinsic suppression by itself can account for effects beyond reduced responses. Here, we test the hypothesis that complex adaptation phenomena can emerge from intrinsic suppression cascading through a feedforward model of visual processing. A deep convolutional neural network with intrinsic suppression captured neural signatures of adaptation including novelty detection, enhancement, and tuning curve shifts, while producing aftereffects consistent with human perception. When adaptation was trained in a task where repeated input affects recognition performance, an intrinsic mechanism generalized better than a recurrent neural network. Our results demonstrate that feedforward propagation of intrinsic suppression changes the functional state of the network, reproducing key neurophysiological and perceptual properties of adaptation.
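The intrinsic suppression mechanism can be sketched as each unit subtracting an exponentially decaying trace of its own recent activity from its current feedforward drive; the single linear layer, ReLU nonlinearity, and time constants below are illustrative assumptions rather than the paper's trained network.

```python
import numpy as np

rng = np.random.default_rng(6)

# Intrinsic (activation-based) suppression: every unit subtracts an
# exponentially decaying trace of its own recent activity from its drive.
n_in, n_units = 50, 20
W = rng.normal(scale=0.2, size=(n_units, n_in))

def run_sequence(inputs, alpha=0.3, beta=0.7):
    """alpha: how strongly the suppression trace tracks recent activity;
    beta: how strongly the trace suppresses the current response."""
    s = np.zeros(n_units)                          # suppression state per unit
    responses = []
    for x in inputs:
        drive = np.maximum(W @ x, 0)               # feedforward ReLU drive
        r = np.maximum(drive - beta * s, 0)        # response after intrinsic suppression
        s = (1 - alpha) * s + alpha * r            # exponentially decaying trace
        responses.append(r)
    return np.array(responses)

# Repeated stimulus, then a novel one: responses adapt, then partially recover.
stim_a, stim_b = rng.random(n_in), rng.random(n_in)
resp = run_sequence([stim_a] * 5 + [stim_b])
print("mean response per step:", np.round(resp.mean(axis=1), 3))
```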
Collapse
Affiliation(s)
- K Vinken
- Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA.
- Center for Brains, Minds and Machines, Cambridge, MA 02139, USA
- Laboratory for Neuro- and Psychophysiology, Department of Neurosciences, KU Leuven, 3000, Leuven, Belgium
| | - X Boix
- Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Center for Brains, Minds and Machines, Cambridge, MA 02139, USA
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, USA
| | - G Kreiman
- Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Center for Brains, Minds and Machines, Cambridge, MA 02139, USA
| |
Collapse
|