1
Chengaiyan S, Retnapandian AS, Anandan K. Identification of vowels in consonant-vowel-consonant words from speech imagery based EEG signals. Cogn Neurodyn 2019; 14:1-19. [PMID: 32015764] [DOI: 10.1007/s11571-019-09558-5]
Abstract
Retrieval of unintelligible speech is a basic need for the speech impaired and has been under research for several decades, but retrieving arbitrary words from thought requires a substantial and consistent approach. This work focuses on the preliminary steps of retrieving vowels from electroencephalography (EEG) signals acquired while speaking, and while imagining speaking, a consonant-vowel-consonant (CVC) word. Speech imagery, a form of mental imagery, is the process of imagining speaking to oneself silently in the mind. Brain connectivity estimators such as EEG coherence, Partial Directed Coherence, Directed Transfer Function, and Transfer Entropy were used to estimate the concurrency and causal dependence (direction and strength) between different brain regions. The connectivity results showed that the left frontal and left temporal electrodes were activated during both speech and speech imagery. These connectivity estimators were then used to train Recurrent Neural Networks (RNN) and Deep Belief Networks (DBN) to identify the vowel from the subject's thought. Although accuracy varied across vowels for both spoken and imagined production of the CVC word, the overall classification accuracy was 72% with the RNN and 80% with the DBN; the DBN outperformed the RNN in both the speech and speech imagery conditions. The combination of brain connectivity estimators and deep learning techniques thus appears effective for identifying vowels from the EEG signals of subjects' thought.
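The pipeline described above can be pictured in miniature: connectivity features extracted from electrode pairs feed a classifier. The sketch below is a hypothetical illustration on synthetic data, using band-averaged magnitude-squared coherence as the sole connectivity estimator and plain logistic regression as a stand-in for the paper's RNN/DBN; the sampling rate, epoch sizes, and band limits are assumptions.

```python
# Hypothetical sketch: EEG coherence features feeding a vowel classifier.
import numpy as np
from scipy.signal import coherence
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
fs = 256                                              # sampling rate (Hz), assumed
n_trials, n_ch, n_samp = 100, 8, 512
eeg = rng.standard_normal((n_trials, n_ch, n_samp))   # stand-in EEG epochs
vowel = rng.integers(0, 5, n_trials)                  # labels for /a e i o u/

def coherence_features(epoch):
    """Mean magnitude-squared coherence in 4-40 Hz for every electrode pair."""
    feats = []
    for i in range(n_ch):
        for j in range(i + 1, n_ch):
            f, cxy = coherence(epoch[i], epoch[j], fs=fs, nperseg=128)
            feats.append(cxy[(f >= 4) & (f <= 40)].mean())
    return feats

X = np.array([coherence_features(ep) for ep in eeg])
clf = LogisticRegression(max_iter=1000)               # stand-in for the RNN/DBN
print(cross_val_score(clf, X, vowel, cv=5).mean())    # ~chance on pure noise
```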
Affiliation(s)
- Sandhya Chengaiyan
- Department of Biomedical Engineering, Centre for Healthcare Technologies, SSN College of Engineering, Chennai, Tamil Nadu, India
- Anandha Sree Retnapandian
- Department of Biomedical Engineering, Centre for Healthcare Technologies, SSN College of Engineering, Chennai, Tamil Nadu, India
- Kavitha Anandan
- Department of Biomedical Engineering, Centre for Healthcare Technologies, SSN College of Engineering, Chennai, Tamil Nadu, India
2
Ordin M. Speech rhythm as naturally occurring and culturally transmitted behavioral patterns. Ann N Y Acad Sci 2019; 1453:5-11. [PMID: 31502260] [DOI: 10.1111/nyas.14234]
Abstract
Rhythm is fundamental to every motor activity. Neural and physiological mechanisms that underlie rhythmic cognition, in general, and rhythmic pattern generation, in particular, are evolutionarily ancient. As speech production is a kind of motor activity, investigating speech rhythm can provide insight into how general motor patterns have been adapted for more specific use in articulation and speech production. Studies on speech rhythm may further provide insight into the development of speech capacity in humans. As speech capacity is putatively a prerequisite for developing a language faculty, studies on speech rhythm may cast some light on the mystery of language evolution in the human genus. Here, we propose an approach to exploring speech rhythm as a window on speech emergence in ontogenesis and phylogenesis, as well as on diachronic linguistic changes.
Affiliation(s)
- Mikhail Ordin
- Basque Centre on Cognition, Brain and Language (BCBL) and Ikerbasque - Basque Foundation for Science
3
Vander Ghinst M, Bourguignon M, Op de Beeck M, Wens V, Marty B, Hassid S, Choufani G, Jousmäki V, Hari R, Van Bogaert P, Goldman S, De Tiège X. Left Superior Temporal Gyrus Is Coupled to Attended Speech in a Cocktail-Party Auditory Scene. J Neurosci 2016; 36:1596-606. [PMID: 26843641] [DOI: 10.1523/JNEUROSCI.1730-15.2016]
Abstract
Using a continuous listening task, we evaluated the coupling between the listener's cortical activity and the temporal envelopes of different sounds in a multitalker auditory scene using magnetoencephalography and corticovocal coherence analysis. Neuromagnetic signals were recorded from 20 right-handed healthy adult humans who listened to five different recorded stories (attended speech streams), one without any multitalker background (No noise) and four mixed with a "cocktail party" multitalker background noise at four signal-to-noise ratios (5, 0, -5, and -10 dB) to produce speech-in-noise mixtures, here referred to as Global scene. Coherence analysis revealed that the modulations of the attended speech stream, presented without multitalker background, were coupled at ∼0.5 Hz to the activity of both superior temporal gyri, whereas the modulations at 4-8 Hz were coupled to the activity of the right supratemporal auditory cortex. In cocktail party conditions, with the multitalker background noise, the coupling at both frequencies was stronger for the attended speech stream than for the unattended Multitalker background. The coupling strengths decreased as the Multitalker background increased. During the cocktail party conditions, the ∼0.5 Hz coupling became left-hemisphere dominant, compared with bilateral coupling without the multitalker background, whereas the 4-8 Hz coupling remained right-hemisphere lateralized in both conditions. The brain activity was not coupled to the multitalker background or to its individual talkers. The results highlight the key role of the listener's left superior temporal gyri in extracting the slow ∼0.5 Hz modulations, likely reflecting the attended speech stream within a multitalker auditory scene.

SIGNIFICANCE STATEMENT: When people listen to one person in a "cocktail party," their auditory cortex mainly follows the attended speech stream rather than the entire auditory scene. However, how the brain extracts the attended speech stream from the whole auditory scene and how increasing background noise corrupts this process is still debated. In this magnetoencephalography study, subjects had to attend a speech stream with or without multitalker background noise. Results argue for frequency-dependent cortical tracking mechanisms for the attended speech stream. The left superior temporal gyrus tracked the ∼0.5 Hz modulations of the attended speech stream only when the speech was embedded in multitalker background, whereas the right supratemporal auditory cortex tracked 4-8 Hz modulations during both noiseless and cocktail-party conditions.
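For readers unfamiliar with coupling analyses of this kind, the following sketch computes magnitude-squared coherence between a simulated sensor signal and a speech envelope, evaluated in the ~0.5 Hz and 4-8 Hz bands the study highlights. It is a toy on synthetic data, not the authors' MEG pipeline; the sampling rate, recording length, and window length are assumptions.

```python
# Illustrative sketch: coherence between a "cortical" signal and a speech
# envelope, assessed in the two frequency bands discussed above.
import numpy as np
from scipy.signal import coherence, hilbert

fs = 200                                 # Hz, assumed
t = np.arange(0, 300, 1 / fs)            # 5 min of synthetic "recording"
rng = np.random.default_rng(1)
envelope = np.abs(hilbert(rng.standard_normal(t.size)))   # stand-in envelope
meg = 0.3 * envelope + rng.standard_normal(t.size)        # sensor tracking it

# Long windows (40 s) give 0.025 Hz resolution, enough to resolve ~0.5 Hz.
f, cxy = coherence(meg, envelope, fs=fs, nperseg=int(40 * fs))
for band, (lo, hi) in {"~0.5 Hz": (0.2, 1.0), "4-8 Hz": (4, 8)}.items():
    sel = (f >= lo) & (f <= hi)
    print(band, cxy[sel].mean())
```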
4
Abstract
Work-related exposure to noise and other ototoxins can cause damage to the cochlea, the synapses between the inner hair cells and the auditory nerve fibers, and higher auditory pathways, leading to difficulties in recognizing speech. Procedures designed to determine speech recognition scores (SRS) in an objective manner can be helpful in disability compensation cases where the worker claims to have poor speech perception due to exposure to noise or ototoxins. Such measures can also be helpful in determining SRS in individuals who cannot provide reliable responses to speech stimuli, including patients with Alzheimer's disease, traumatic brain injuries, and infants with and without hearing loss. Cost-effective neural monitoring hardware and software are being rapidly refined due to the high demand for neurogaming (games involving the use of brain-computer interfaces), health, and other applications. More specifically, two related advances in neurotechnology are the relative ease of recording neural activity and the availability of sophisticated analysis techniques. These techniques are reviewed in the current article and their applications for developing objective SRS procedures are proposed. Issues related to neuroaudioethics (ethics related to collection of neural data evoked by auditory stimuli, including speech) and neurosecurity (preservation of a person's neural mechanisms and free will) are also discussed.
Affiliation(s)
- Vishakha Waman Rawool
- Department of Communication Sciences & Disorders, West Virginia University, Morgantown, USA
5
Crangle CE, Perreau-Guimaraes M, Suppes P. Structural similarities between brain and linguistic data provide evidence of semantic relations in the brain. PLoS One 2013; 8:e65366. [PMID: 23799009] [PMCID: PMC3682999] [DOI: 10.1371/journal.pone.0065366]
Abstract
This paper presents a new method of analysis by which structural similarities between brain data and linguistic data can be assessed at the semantic level. It shows how to measure the strength of these structural similarities and so determine the relatively better fit of the brain data with one semantic model over another. The first model is derived from WordNet, a lexical database of English compiled by language experts. The second is given by the corpus-based statistical technique of latent semantic analysis (LSA), which detects relations between words that are latent or hidden in text. The brain data are drawn from experiments in which statements about the geography of Europe were presented auditorily to participants who were asked to determine their truth or falsity while electroencephalographic (EEG) recordings were made. The theoretical framework for the analysis of the brain and semantic data derives from axiomatizations of theories such as the theory of differences in utility preference. Using brain-data samples from individual trials time-locked to the presentation of each word, ordinal relations of similarity differences are computed for the brain data and for the linguistic data. In each case those relations that are invariant with respect to the brain and linguistic data, and are correlated with sufficient statistical strength, amount to structural similarities between the brain and linguistic data. Results show that many more statistically significant structural similarities can be found between the brain data and the WordNet-derived data than the LSA-derived data. The work reported here is placed within the context of other recent studies of semantics and the brain. The main contribution of this paper is the new method it presents for the study of semantics and the brain and the focus it permits on networks of relations detected in brain data and represented by a semantic model.
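The core comparison, whether brain-derived and model-derived similarities agree in their ordinal relations of similarity differences, can be illustrated with a toy computation. Everything below (word count, similarity matrices, noise level) is synthetic and assumed; this is the shape of the idea, not the paper's statistical procedure.

```python
# Toy sketch: count ordinal relations of pairwise similarity shared between
# brain-derived and model-derived (e.g., WordNet- or LSA-based) similarities.
import itertools
import numpy as np

rng = np.random.default_rng(2)
n_words = 6
brain_sim = rng.random((n_words, n_words))
brain_sim = (brain_sim + brain_sim.T) / 2                      # symmetrize
model_sim = brain_sim + 0.1 * rng.standard_normal((n_words, n_words))

pairs = list(itertools.combinations(range(n_words), 2))
agree = total = 0
for (a, b), (c, d) in itertools.combinations(pairs, 2):
    total += 1
    # ordinal relation: "pair (a,b) is more similar than pair (c,d)"
    if (brain_sim[a, b] > brain_sim[c, d]) == (model_sim[a, b] > model_sim[c, d]):
        agree += 1
print(f"invariant ordinal relations: {agree}/{total}")
```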
Affiliation(s)
- Colleen E Crangle
- School of Computing and Communications, Lancaster University, Lancaster, United Kingdom.
6
Abstract
A key feature of speech is the quasi-regular rhythmic information contained in its slow amplitude modulations. In this article we review the information conveyed by speech rhythm, and the role of ongoing brain oscillations in listeners' processing of this content. Our starting point is the fact that speech is inherently temporal, and that rhythmic information conveyed by the amplitude envelope contains important markers for place and manner of articulation, segmental information, and speech rate. Behavioral studies demonstrate that amplitude envelope information is relied upon by listeners and plays a key role in speech intelligibility. Extending behavioral findings, data from neuroimaging - particularly electroencephalography (EEG) and magnetoencephalography (MEG) - point to phase locking by ongoing cortical oscillations to low-frequency information (~4-8 Hz) in the speech envelope. This phase modulation effectively encodes a prediction of when important events (such as stressed syllables) are likely to occur, and acts to increase sensitivity to these relevant acoustic cues. We suggest a framework through which such neural entrainment to speech rhythm can explain effects of speech rate on word and segment perception (i.e., that the perception of phonemes and words in connected speech is influenced by preceding speech rate). Neuroanatomically, acoustic amplitude modulations are processed largely bilaterally in auditory cortex, with intelligible speech resulting in differential recruitment of left-hemisphere regions. Notable among these is lateral anterior temporal cortex, which we propose functions in a domain-general fashion to support ongoing memory and integration of meaningful input. Together, the reviewed evidence suggests that low-frequency oscillations in the acoustic speech signal form the foundation of a rhythmic hierarchy supporting spoken language, mirrored by phase-locked oscillations in the human brain.
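As a rough illustration of the envelope and phase-locking quantities this review discusses, the sketch below extracts an amplitude envelope with the Hilbert transform, band-passes both signals at 4-8 Hz, and computes a phase-locking value. The data, filter order, and band edges are assumptions, not drawn from any study reviewed.

```python
# Minimal sketch: amplitude-envelope extraction and 4-8 Hz phase locking
# between a speech envelope and a neural signal (synthetic data).
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 200
t = np.arange(0, 60, 1 / fs)
rng = np.random.default_rng(3)
speech = rng.standard_normal(t.size)
envelope = np.abs(hilbert(speech))                  # slow amplitude modulations
neural = envelope + 0.5 * rng.standard_normal(t.size)

b, a = butter(4, [4, 8], btype="bandpass", fs=fs)   # theta band
phase_env = np.angle(hilbert(filtfilt(b, a, envelope)))
phase_neu = np.angle(hilbert(filtfilt(b, a, neural)))
plv = np.abs(np.mean(np.exp(1j * (phase_env - phase_neu))))  # phase-locking value
print(f"theta PLV: {plv:.2f}")
```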
Affiliation(s)
- Jonathan E. Peelle
- Center for Cognitive Neuroscience and Department of Neurology, University of Pennsylvania, Philadelphia, PA, USA
- Matthew H. Davis
- Medical Research Council Cognition and Brain Sciences Unit, Cambridge, UK
7
Koskinen M, Viinikanoja J, Kurimo M, Klami A, Kaski S, Hari R. Identifying fragments of natural speech from the listener's MEG signals. Hum Brain Mapp 2012; 34:1477-89. [PMID: 22344824] [DOI: 10.1002/hbm.22004]
Abstract
It is a challenge for current signal analysis approaches to identify the electrophysiological brain signatures of continuous natural speech that the subject is listening to. To relate magnetoencephalographic (MEG) brain responses to the physical properties of such speech stimuli, we applied canonical correlation analysis (CCA) and a Bayesian mixture of CCA analyzers to extract MEG features related to the speech envelope. Seven healthy adults listened to news for an hour while their brain signals were recorded with whole-scalp MEG. We found shared signal time series (canonical variates) between the MEG signals and speech envelopes at 0.5-12 Hz. By splitting the test signals into equal-length fragments from 2 to 65 s (corresponding to 703 down to 21 pieces of the total speech stimulus), we obtained better-than-chance identification of speech fragments longer than 2-3 s that had not been used in model training. The applied analysis approach thus allowed identification of segments of natural speech by means of partial reconstruction of the continuous speech envelope (i.e., the intensity variations of the speech sounds) from MEG responses, provided a means to empirically assess the time scales obtainable in speech decoding with the canonical variates, and demonstrated accurate identification of the heard speech fragments from the MEG data.
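A minimal sketch of the CCA step follows, assuming synthetic MEG-like and envelope-like feature matrices that share one latent time series; the Bayesian mixture of CCA analyzers and the fragment-identification procedure are not reproduced here.

```python
# Hypothetical sketch: canonical variates shared between MEG channels and
# (lagged) speech-envelope features, via plain CCA on synthetic data.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(4)
n_samples, n_meg, n_lags = 5000, 30, 10
latent = rng.standard_normal(n_samples)               # shared slow signal
meg = np.outer(latent, rng.standard_normal(n_meg)) \
      + rng.standard_normal((n_samples, n_meg))
env = np.outer(latent, rng.standard_normal(n_lags)) \
      + rng.standard_normal((n_samples, n_lags))

cca = CCA(n_components=2).fit(meg, env)
u, v = cca.transform(meg, env)                        # canonical variates
r = np.corrcoef(u[:, 0], v[:, 0])[0, 1]               # first canonical correlation
print(f"first canonical correlation: {r:.2f}")
```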
Affiliation(s)
- Miika Koskinen
- Brain Research Unit, MEG Core, and Advanced Magnetic Imaging Centre, Low Temperature Laboratory, Aalto University, Finland.
8
Vassilieva E, Pinto G, Acacio de Barros J, Suppes P. Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Trans Neural Netw 2011; 22:84-95. [DOI: 10.1109/tnn.2010.2086476]
9
Chan AM, Halgren E, Marinkovic K, Cash SS. Decoding word and category-specific spatiotemporal representations from MEG and EEG. Neuroimage 2010; 54:3028-39. [PMID: 21040796] [DOI: 10.1016/j.neuroimage.2010.10.073]
Abstract
The organization and localization of lexico-semantic information in the brain has been debated for many years. Specifically, lesion and imaging studies have attempted to map the brain areas representing living versus nonliving objects; however, results remain variable. This may be due, in part, to the fact that the univariate statistical mapping analyses used to detect these brain areas are typically insensitive to subtle, but widespread, effects. Decoding techniques, on the other hand, allow for a powerful multivariate analysis of multichannel neural data. In this study, we utilize machine-learning algorithms to first demonstrate that semantic category, as well as individual words, can be decoded from EEG and MEG recordings of subjects performing a language task. Mean accuracies of 76% (chance = 50%) and 83% (chance = 20%) were obtained for the decoding of living vs. nonliving category and of individual words, respectively. Furthermore, we utilize this decoding analysis to demonstrate that the representations of words and semantic category are highly distributed both spatially and temporally. In particular, bilateral anterior temporal, bilateral inferior frontal, and left inferior temporal-occipital sensors are most important for discrimination. Successful intersubject and intermodality decoding shows that semantic representations between stimulus modalities and individuals are reasonably consistent. These results suggest that both word- and category-specific information is present in extracranially recorded neural activity and that these representations may be more distributed, both spatially and temporally, than previous studies suggest.
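A hypothetical stand-in for this kind of decoding analysis: a linear classifier applied to flattened sensor-by-time features with cross-validation. The data are synthetic, the planted "effect" is arbitrary, and the classifier choice is an assumption rather than the authors' algorithm.

```python
# Illustrative sketch: decoding a binary semantic category from
# sensor-by-time epochs with a cross-validated linear classifier.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(5)
n_trials, n_sensors, n_times = 200, 60, 50
X = rng.standard_normal((n_trials, n_sensors, n_times))
y = rng.integers(0, 2, n_trials)              # living vs. nonliving, stand-in
X[y == 1, :10, 20:30] += 0.3                  # weak, spatially distributed effect

clf = make_pipeline(StandardScaler(), LinearSVC())
scores = cross_val_score(clf, X.reshape(n_trials, -1), y, cv=5)
print(f"decoding accuracy: {scores.mean():.2f} (chance = 0.50)")
```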
Affiliation(s)
- Alexander M Chan
- Medical Engineering and Medical Physics, Harvard-MIT Division of Health Sciences and Technology, Cambridge, MA, USA.
10
Abstract
Normal listeners possess the remarkable perceptual ability to select a single speech stream among many competing talkers. However, few studies of selective attention have addressed the unique nature of speech as a temporally extended and complex auditory object. We hypothesized that sustained selective attention to speech in a multitalker environment would act as gain control on the early auditory cortical representations of speech. Using high-density electroencephalography and a template-matching analysis method, we found selective gain to the continuous speech content of an attended talker, greatest at a frequency of 4-8 Hz, in auditory cortex. In addition, the difference in alpha power (8-12 Hz) at parietal sites across hemispheres indicated the direction of auditory attention to speech, as has been previously found in visual tasks. The strength of this hemispheric alpha lateralization, in turn, predicted an individual's attentional gain of the cortical speech signal. These results support a model of spatial speech stream segregation, mediated by a supramodal attention mechanism, enabling selection of the attended representation in auditory cortex.
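The alpha lateralization measure can be sketched as a simple normalized difference of 8-12 Hz band power between right- and left-parietal sensors. The computation below is an assumed, generic version on synthetic data; the sign convention and Welch parameters are illustrative only.

```python
# Sketch (assumed computation): hemispheric alpha-power lateralization index
# from parietal sensors, of the kind related to the direction of attention.
import numpy as np
from scipy.signal import welch

fs = 250
rng = np.random.default_rng(6)
left = rng.standard_normal(10 * fs)            # stand-in left-parietal EEG
right = rng.standard_normal(10 * fs)           # stand-in right-parietal EEG

def alpha_power(x):
    f, pxx = welch(x, fs=fs, nperseg=2 * fs)
    return pxx[(f >= 8) & (f <= 12)].mean()    # 8-12 Hz band

ali = (alpha_power(right) - alpha_power(left)) / (
    alpha_power(right) + alpha_power(left))    # sign convention is arbitrary here
print(f"alpha lateralization index: {ali:+.2f}")
```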
11
Suppes P, Perreau-Guimaraes M, Wong DK. Partial orders of similarity differences invariant between EEG-recorded brain and perceptual representations of language. Neural Comput 2009; 21:3228-69. [PMID: 19686069] [DOI: 10.1162/neco.2009.04-08-764]
Abstract
The idea of a hierarchical structure of language constituents of phonemes, syllables, words, and sentences is robust and widely accepted. Empirical similarity differences at every level of this hierarchy have been analyzed in the form of confusion matrices for many years. By normalizing such data so that differences are represented by conditional probabilities, semiorders of similarity differences can be constructed. The intersection of two such orderings is an invariant partial ordering with respect to the two given orders. These invariant partial orderings, especially between perceptual and brain representations, but also for comparison of brain images of words generated by auditory or visual presentations, are the focus of this letter. Data from four experiments are analyzed, with some success in finding conceptually significant invariants.
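A toy version of the construction: row-normalize confusion matrices into conditional probabilities, turn similarity differences that exceed a threshold into an ordering, and intersect two such orderings to obtain the invariant relations. Matrix sizes, counts, and the threshold eps are assumptions.

```python
# Toy sketch: conditional-probability normalization of two confusion
# matrices and the intersection of the resulting similarity orderings.
import itertools
import numpy as np

rng = np.random.default_rng(7)

def cond_prob(conf):
    return conf / conf.sum(axis=1, keepdims=True)    # row-normalize counts

def ordering(p, eps=0.05):
    """Ordered cell pairs (a, b) where p[a] exceeds p[b] by at least eps."""
    cells = list(itertools.product(range(p.shape[0]), repeat=2))
    return {(a, b) for a, b in itertools.permutations(cells, 2)
            if p[a] - p[b] >= eps}

brain = cond_prob(rng.integers(1, 50, (4, 4)).astype(float))
percept = cond_prob(rng.integers(1, 50, (4, 4)).astype(float))
invariant = ordering(brain) & ordering(percept)      # shared (invariant) relations
print(f"{len(invariant)} ordinal relations invariant across the two orders")
```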
Affiliation(s)
- Patrick Suppes
- Center for the Study of Language and Information, Stanford University, Stanford, CA 94305-4101, USA.
12
Luo H, Poeppel D. Phase patterns of neuronal responses reliably discriminate speech in human auditory cortex. Neuron 2007; 54:1001-10. [PMID: 17582338] [PMCID: PMC2703451] [DOI: 10.1016/j.neuron.2007.06.004]
Abstract
How natural speech is represented in the auditory cortex constitutes a major challenge for cognitive neuroscience. Although many single-unit and neuroimaging studies have yielded valuable insights about the processing of speech and matched complex sounds, the mechanisms underlying the analysis of speech dynamics in human auditory cortex remain largely unknown. Here, we show that the phase pattern of theta band (4-8 Hz) responses recorded from human auditory cortex with magnetoencephalography (MEG) reliably tracks and discriminates spoken sentences and that this discrimination ability is correlated with speech intelligibility. The findings suggest that an approximately 200 ms temporal window (period of theta oscillation) segments the incoming speech signal, resetting and sliding to track speech dynamics. This hypothesized mechanism for cortical speech analysis is based on the stimulus-induced modulation of inherent cortical rhythms and provides further evidence implicating the syllable as a computational primitive for the representation of spoken language.
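The discrimination idea, that theta phase patterns are more similar across repetitions of the same sentence than across different sentences, can be caricatured as follows. This is a synthetic toy with no proper train/test separation, not the paper's MEG analysis; the filter settings and circular distance are assumptions.

```python
# Rough sketch: classify sentences by the similarity of 4-8 Hz phase patterns.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs, dur, n_sent, n_trials = 200, 3, 3, 10
b, a = butter(4, [4, 8], btype="bandpass", fs=fs)
rng = np.random.default_rng(8)

templates = rng.standard_normal((n_sent, dur * fs))        # per-sentence drive
trials = np.array([[tpl + 1.5 * rng.standard_normal(tpl.size)
                    for _ in range(n_trials)] for tpl in templates])

def theta_phase(x):
    """Instantaneous phase of the 4-8 Hz component of x."""
    return np.angle(hilbert(filtfilt(b, a, x)))

# Prototype phase pattern per sentence (toy: built from all trials).
proto = np.array([theta_phase(trials[s].mean(axis=0)) for s in range(n_sent)])

correct = 0
for s in range(n_sent):
    for tr in trials[s]:
        # mean circular distance between phase patterns; smaller = more similar
        d = [np.mean(1 - np.cos(theta_phase(tr) - p)) for p in proto]
        correct += int(np.argmin(d) == s)
print(f"phase-pattern classification: {correct}/{n_sent * n_trials}")
```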
Affiliation(s)
- Huan Luo
- Neuroscience and Cognitive Science Program, University of Maryland College Park, College Park, MD 20742
- Department of Biology, University of Maryland College Park, College Park, MD 20742
- David Poeppel
- Neuroscience and Cognitive Science Program, University of Maryland College Park, College Park, MD 20742
- Department of Biology, University of Maryland College Park, College Park, MD 20742
- Department of Linguistics, University of Maryland College Park, College Park, MD 20742
13
Wong DK, Grosenick L, Uy ET, Perreau Guimaraes M, Carvalhaes CG, Desain P, Suppes P. Quantifying inter-subject agreement in brain-imaging analyses. Neuroimage 2007; 39:1051-63. [PMID: 18023210] [DOI: 10.1016/j.neuroimage.2007.07.064]
Abstract
In brain-imaging research, we are often interested in making quantitative claims about effects across subjects. Given that most imaging data consist of tens to thousands of spatially correlated time series, inter-subject comparisons are typically accomplished with simple combinations of inter-subject data, for example, methods relying on group means. Further, these data are frequently taken from reduced channel subsets defined either a priori using anatomical considerations, or functionally using p-value thresholding to choose cluster boundaries. While such methods are effective for data reduction, means are sensitive to outliers, and current methods for subset selection can be somewhat arbitrary. Here, we introduce a novel "partial-ranking" approach to test for inter-subject agreement at the channel level. This non-parametric method effectively tests whether channel concordance is present across subjects, how many channels are necessary for maximum concordance, and which channels are responsible for this agreement. We validate the method on two previously published and two simulated EEG data sets.
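One familiar relative of such a channel-level concordance test is Kendall's coefficient of concordance over subjects' channel rankings, sketched below on synthetic data. This is a simplified stand-in, not the authors' partial-ranking method.

```python
# Simplified stand-in: Kendall's W across subjects' rankings of channel
# effect sizes, as a coarse measure of inter-subject channel agreement.
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(9)
n_subjects, n_channels = 8, 32
effect = rng.standard_normal(n_channels)                # shared channel profile
scores = effect + 0.8 * rng.standard_normal((n_subjects, n_channels))

ranks = np.vstack([rankdata(s) for s in scores])        # rank channels per subject
R = ranks.sum(axis=0)                                   # rank sums per channel
S = ((R - R.mean()) ** 2).sum()
W = 12 * S / (n_subjects ** 2 * (n_channels ** 3 - n_channels))  # Kendall's W
print(f"Kendall's W over channels: {W:.2f}")
```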
Affiliation(s)
- Dik Kin Wong
- Center for the Study of Language and Information, Ventura Hall, 200 Panama St., Stanford University, CA, USA.
14
Wong DK, Timothy Uy E, Perreau Guimaraes M, Yang W, Suppes P. Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 2006. [DOI: 10.1016/j.neucom.2006.01.020]
15
Abstract
Recent advances in human neuroimaging have shown that it is possible to accurately decode a person's conscious experience based only on non-invasive measurements of their brain activity. Such 'brain reading' has mostly been studied in the domain of visual perception, where it helps reveal the way in which individual experiences are encoded in the human brain. The same approach can also be extended to other types of mental state, such as covert attitudes and lie detection. Such applications raise important ethical issues concerning the privacy of personal thought.
Affiliation(s)
- John-Dylan Haynes
- Max Planck Institute for Cognitive and Brain Sciences, Stephanstrasse 1a, 04103 Leipzig, Germany.
16
Wong DK, Perreau Guimaraes M, Timothy Uy E, Suppes P. Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 2004. [DOI: 10.1016/j.neucom.2004.06.004]
17
Abstract
Data from three previous experiments were analyzed to test the hypothesis that brain waves of spoken or written words can be represented by the superposition of a few sine waves. First, we averaged the data over trials and a set of subjects, and, in one case, over experimental conditions as well. Next we applied a Fourier transform to the averaged data and selected those frequencies with high energy, in no case more than nine in number. The superpositions of these selected sine waves were taken as prototypes. The averaged unfiltered data were the test samples. The prototypes were used to classify the test samples according to a least-squares criterion of fit. The results were seven of seven correct classifications for the first experiment using only three frequencies, six of eight for the second experiment using nine frequencies, and eight of eight for the third experiment using five frequencies.
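The procedure reads almost like pseudocode, and a small synthetic version is easy to write down: Fourier-transform the averaged wave, keep the few highest-energy components, take their superposition as the prototype, and classify test averages by least squares. The sketch below (assumed sizes, noise level, and k) also covers the scheme of entry 18, which applies the same machinery with filtering and cross-modal prototypes.

```python
# Sketch of the described procedure: prototypes from the superposition of
# the k highest-energy Fourier components, least-squares classification.
import numpy as np

rng = np.random.default_rng(10)
n_words, n_samp, k = 4, 256, 5
true = rng.standard_normal((n_words, n_samp))
avg_train = true + 0.3 * rng.standard_normal((n_words, n_samp))  # averaged trials
avg_test = true + 0.3 * rng.standard_normal((n_words, n_samp))   # held-out averages

def prototype(x, k):
    """Superposition of the k highest-energy sine components of x."""
    spec = np.fft.rfft(x)
    keep = np.argsort(np.abs(spec))[-k:]          # top-k frequencies by energy
    filt = np.zeros_like(spec)
    filt[keep] = spec[keep]
    return np.fft.irfft(filt, n=x.size)

protos = np.array([prototype(x, k) for x in avg_train])
pred = [np.argmin(((protos - x) ** 2).sum(axis=1)) for x in avg_test]  # least squares
print(f"correct: {sum(p == i for i, p in enumerate(pred))}/{n_words}")
```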
Affiliation(s)
- P Suppes
- Center for the Study of Language and Information and Department of Management Science and Engineering, Stanford University, Stanford, CA 94305, USA.
18
Abstract
In two experiments, electric brain waves of 14 subjects were recorded under several different conditions to study the invariance of brain-wave representations of simple patches of colors and simple visual shapes and their names, the words blue, circle, etc. As in our earlier work, the analysis consisted of averaging over trials to create prototypes and test samples, to both of which Fourier transforms were applied, followed by filtering and an inverse transformation to the time domain. A least-squares criterion of fit between prototypes and test samples was used for classification. The most significant results were these. By averaging over different subjects, as well as trials, we created prototypes from brain waves evoked by simple visual images and test samples from brain waves evoked by auditory or visual words naming the visual images. We correctly recognized from 60% to 75% of the test-sample brain waves. The general conclusion is that simple shapes such as circles and single-color displays generate brain waves surprisingly similar to those generated by their verbal names. These results, taken together with extensive psychological studies of auditory and visual memory, strongly support the solution proposed for visual shapes, by Bishop Berkeley and David Hume in the 18th century, to the long-standing problem of how the mind represents simple abstract ideas.
Affiliation(s)
- P Suppes
- Center for the Study of Language and Information, Stanford University, Stanford, CA 94305-4115, USA.