1. Regev J, Relaño-Iborra H, Zaar J, Dau T. Disentangling the effects of hearing loss and age on amplitude modulation frequency selectivity. J Acoust Soc Am 2024; 155:2589-2602. PMID: 38607268. DOI: 10.1121/10.0025541.
Abstract
The processing and perception of amplitude modulation (AM) in the auditory system reflect a frequency-selective process, often described as a modulation filterbank. Previous studies on perceptual AM masking reported similar results for older listeners with hearing impairment (HI listeners) and young listeners with normal hearing (NH listeners), suggesting no effects of age or hearing loss on AM frequency selectivity. However, recent evidence has shown that age, independently of hearing loss, adversely affects AM frequency selectivity. Hence, this study aimed to disentangle the effects of hearing loss and age. A simultaneous AM masking paradigm was employed, using a sinusoidal carrier at 2.8 kHz, narrowband noise modulation maskers, and target modulation frequencies of 4, 16, 64, and 128 Hz. The results obtained from young (n = 3, 24-30 years of age) and older (n = 10, 63-77 years of age) HI listeners were compared to previously obtained data from young and older NH listeners. Notably, the HI listeners generally exhibited lower (unmasked) AM detection thresholds and greater AM frequency selectivity than their NH counterparts in both age groups. Overall, the results suggest that age negatively affects AM frequency selectivity for both NH and HI listeners, whereas hearing loss improves AM detection and AM selectivity, likely due to the loss of peripheral compression.
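The modulation-filterbank concept invoked in this abstract can be illustrated with a minimal sketch: an amplitude envelope is passed through a small bank of bandpass modulation filters centered at the study's target modulation frequencies (4, 16, 64, and 128 Hz). The Gaussian filter shape and the quality factor `q` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def modulation_filterbank(env, fs, centers=(4, 16, 64, 128), q=1.0):
    """Filter an amplitude envelope through bandpass modulation filters.

    Each filter has a Gaussian transfer function centered at `centers` (Hz)
    with bandwidth fc/q (hypothetical parameter choices for illustration).
    Returns an array of shape (len(centers), len(env)).
    """
    n = len(env)
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    spectrum = np.fft.rfft(env - env.mean())  # remove DC before filtering
    out = []
    for fc in centers:
        bw = fc / q
        gain = np.exp(-0.5 * ((freqs - fc) / bw) ** 2)  # Gaussian passband
        out.append(np.fft.irfft(spectrum * gain, n))
    return np.array(out)

# Demo: a 16 Hz AM envelope excites the 16 Hz channel most strongly.
fs = 1000
t = np.arange(fs) / fs
env = 1.0 + 0.5 * np.sin(2 * np.pi * 16 * t)
power = (modulation_filterbank(env, fs) ** 2).mean(axis=1)
best = (4, 16, 64, 128)[int(np.argmax(power))]
```

In such a scheme, AM frequency selectivity corresponds to the bandwidths of these channels: a narrowband modulation masker raises the detection threshold mainly in the channel whose passband it falls into.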
Affiliation(s)
- Jonathan Regev
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
- Helia Relaño-Iborra
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
- Johannes Zaar
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
- Eriksholm Research Centre, Snekkersten, 3070, Denmark
- Torsten Dau
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
- Copenhagen Hearing and Balance Center, Copenhagen University Hospital, Rigshospitalet, Copenhagen, 2100, Denmark
2. Gulati D, Ray S. Auditory and Visual Gratings Elicit Distinct Gamma Responses. eNeuro 2024; 11:ENEURO.0116-24.2024. PMID: 38604776. PMCID: PMC11046261. DOI: 10.1523/eneuro.0116-24.2024.
Abstract
Sensory stimulation is often accompanied by fluctuations at high frequencies (>30 Hz) in brain signals. These can be "narrowband" oscillations in the gamma band (30-70 Hz) or nonoscillatory "broadband" high-gamma (70-150 Hz) activity. Narrowband gamma oscillations, which are induced by certain visual stimuli such as gratings and have been shown to weaken with healthy aging and the onset of Alzheimer's disease, hold promise as potential biomarkers. However, because delivering visual stimuli is cumbersome, requiring head stabilization for eye tracking, an equivalent auditory paradigm could be useful. Although simple auditory stimuli have been shown to produce high-gamma activity, whether specific auditory stimuli can also produce narrowband gamma oscillations is unknown. We tested whether auditory ripple stimuli, which are considered an analog of visual gratings, could elicit narrowband oscillations in auditory areas. We recorded 64-channel electroencephalograms from male and female subjects (18 of each) while they either fixated on the monitor passively viewing static visual gratings or listened to stationary and moving ripples, played over loudspeakers, with their eyes open or closed. We found that while visual gratings induced narrowband gamma oscillations with suppression in the alpha band (8-12 Hz), auditory ripples did not produce narrowband gamma but instead elicited a very strong broadband high-gamma response and suppression in the beta band (14-26 Hz). Even though we used equivalent stimuli in both modalities, our findings indicate that the underlying neuronal circuitry may not share common strategies for stimulus processing.
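The band definitions named in this abstract (gamma 30-70 Hz, high-gamma 70-150 Hz, alpha 8-12 Hz, beta 14-26 Hz) can be sketched with a minimal band-power computation. This is a generic PSD-based estimate, not the authors' analysis pipeline; the synthetic 50 Hz signal merely stands in for a narrowband gamma oscillation.

```python
import numpy as np

def band_power(psd, freqs, lo, hi):
    """Mean power spectral density within [lo, hi) Hz."""
    mask = (freqs >= lo) & (freqs < hi)
    return psd[mask].mean()

# Demo: a synthetic 50 Hz "narrowband gamma" signal in noise
fs = 500
t = np.arange(2 * fs) / fs
rng = np.random.default_rng(0)
sig = np.sin(2 * np.pi * 50 * t) + 0.1 * rng.standard_normal(len(t))
psd = np.abs(np.fft.rfft(sig)) ** 2 / len(sig)   # simple periodogram
freqs = np.fft.rfftfreq(len(sig), 1 / fs)
gamma = band_power(psd, freqs, 30, 70)        # narrowband gamma band
high_gamma = band_power(psd, freqs, 70, 150)  # broadband high-gamma band
```

A narrowband oscillation concentrates power in one band (here gamma), whereas a broadband high-gamma response would instead raise power across the 70-150 Hz range.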
Affiliation(s)
- Divya Gulati
- Centre for Neuroscience, Indian Institute of Science, Bengaluru 560012, India
- Supratim Ray
- Centre for Neuroscience, Indian Institute of Science, Bengaluru 560012, India
3. van der Willigen RF, Versnel H, van Opstal AJ. Spectral-temporal processing of naturalistic sounds in monkeys and humans. J Neurophysiol 2024; 131:38-63. PMID: 37965933. DOI: 10.1152/jn.00129.2023.
Abstract
Human speech and vocalizations in animals are rich in joint spectrotemporal (S-T) modulations, wherein acoustic changes in both frequency and time are functionally related. In principle, the primate auditory system could process these complex dynamic sounds based on either an inseparable representation of S-T features or, alternatively, a separable representation. The separability hypothesis implies independent processing of spectral and temporal modulations. We collected comparative data on the S-T hearing sensitivity of humans and macaque monkeys to a wide range of broadband dynamic spectrotemporal ripple stimuli, employing a yes-no signal-detection task. Ripples were systematically varied in density (spectral modulation frequency), velocity (temporal modulation frequency), and modulation depth to cover each listener's full S-T modulation sensitivity, derived from a total of 87 psychometric ripple-detection curves. Audiograms were measured to control for normal hearing. We determined hearing thresholds, reaction-time distributions, and S-T modulation transfer functions (MTFs), both at the ripple detection thresholds and at suprathreshold modulation depths. Our psychophysically derived MTFs are consistent with the hypothesis that monkeys and humans employ analogous perceptual strategies: S-T acoustic information is primarily processed separably. Singular value decomposition (SVD), however, revealed a small but consistent inseparable spectral-temporal interaction. Finally, SVD analysis of the known visual spatiotemporal contrast sensitivity function (CSF) highlights that human vision is space-time inseparable to a much larger extent than is the case for S-T sensitivity in hearing. Thus, the specificity with which the primate brain encodes natural sounds appears to be less strict than is required to adequately deal with natural images.
NEW & NOTEWORTHY We provide comparative data on primate audition of naturalistic sounds, comprising hearing thresholds, reaction-time distributions, and spectral-temporal modulation transfer functions. Our psychophysical experiments demonstrate that auditory information is primarily processed in a spectral-temporal-independent manner by both monkeys and humans. Singular value decomposition of the known visual spatiotemporal contrast sensitivity, in comparison to our auditory spectral-temporal sensitivity, revealed a striking contrast in how the brain encodes natural sounds as opposed to natural images, as vision appears to be space-time inseparable.
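The SVD-based separability analysis described in this abstract can be sketched numerically: the first singular value of an MTF matrix measures how much of its power is captured by the best rank-1 (i.e., spectrally-temporally separable) approximation. This is a generic SVD illustration under assumed toy profiles, not the authors' exact procedure.

```python
import numpy as np

def separability_index(mtf):
    """Fraction of an MTF's power captured by its best rank-1 (separable)
    approximation: alpha = s1^2 / sum(s_i^2), from the singular values.
    alpha = 1 means fully separable (spectral profile x temporal profile);
    smaller values indicate spectral-temporal interaction.
    """
    s = np.linalg.svd(mtf, compute_uv=False)
    return s[0] ** 2 / np.sum(s ** 2)

# A perfectly separable MTF: outer product of a spectral and a temporal profile
spec = np.hanning(8)    # toy spectral-modulation profile
temp = np.hanning(16)   # toy temporal-modulation profile
sep = separability_index(np.outer(spec, temp))

# Adding structure that is not a product of the two profiles lowers the index
rng = np.random.default_rng(0)
insep = separability_index(np.outer(spec, temp) + 0.3 * rng.random((8, 16)))
```

The same index can be applied to a visual spatiotemporal CSF, which is the comparison the abstract draws between hearing and vision.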
Affiliation(s)
- Robert F van der Willigen
- Section Neurophysics, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- School of Communication, Media and Information Technology, Rotterdam University of Applied Sciences, Rotterdam, The Netherlands
- Research Center Creating 010, Rotterdam University of Applied Sciences, Rotterdam, The Netherlands
- Huib Versnel
- Section Neurophysics, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Department of Otorhinolaryngology and Head & Neck Surgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- A John van Opstal
- Section Neurophysics, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
4. Banno T, Shirley H, Fishman YI, Cohen YE. Changes in neural readout of response magnitude during auditory streaming do not correlate with behavioral choice in the auditory cortex. Cell Rep 2023; 42:113493. PMID: 38039133. PMCID: PMC10784988. DOI: 10.1016/j.celrep.2023.113493.
Abstract
A fundamental goal of the auditory system is to group stimuli from the auditory environment into a perceptual unit (i.e., "stream") or segregate the stimuli into multiple different streams. Although previous studies have clarified the psychophysical and neural mechanisms that may underlie this ability, the relationship between these mechanisms remains elusive. Here, we recorded multiunit activity (MUA) from the auditory cortex of monkeys while they participated in an auditory-streaming task consisting of interleaved low- and high-frequency tone bursts. As the streaming stimulus unfolded over time, MUA amplitude habituated; the magnitude of this habituation was correlated with the frequency difference between the tone bursts. An ideal-observer model could classify these time- and frequency-dependent changes into reports of "one stream" or "two streams" in a manner consistent with the behavioral literature. However, because classification was not modulated by the monkeys' behavioral choices, this MUA habituation may not directly reflect perceptual reports.
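The ideal-observer readout mentioned in this abstract can be sketched in its simplest form: classify each trial's response magnitude as coming from a "one stream" or "two streams" distribution using a midpoint criterion. The distributions and all numbers below are invented for illustration; the study's actual classifier operated on time- and frequency-dependent MUA changes.

```python
import numpy as np

def ideal_observer(resp_a, resp_b):
    """Midpoint-criterion ideal observer for two equal-variance response
    distributions. Returns the criterion and classification accuracy on
    the given samples (A above criterion, B at or below)."""
    crit = 0.5 * (resp_a.mean() + resp_b.mean())
    correct = np.sum(resp_a > crit) + np.sum(resp_b <= crit)
    return crit, correct / (len(resp_a) + len(resp_b))

# Demo: stronger habituation (lower MUA amplitude) for large frequency
# separations, which the observer reads out as "two streams".
rng = np.random.default_rng(0)
small_df = rng.normal(1.0, 0.1, 1000)  # little habituation -> "one stream"
large_df = rng.normal(0.6, 0.1, 1000)  # strong habituation -> "two streams"
crit, acc = ideal_observer(small_df, large_df)
```

The study's key point is that such a readout can track the stimulus (frequency separation) without tracking the animal's trial-by-trial choice.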
Affiliation(s)
- Taku Banno
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA
- Harry Shirley
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA
- Yonatan I Fishman
- Departments of Neurology and Neuroscience, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Yale E Cohen
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA
- Department of Neuroscience, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA
5. López Espejo M, David SV. A sparse code for natural sound context in auditory cortex. Curr Res Neurobiol 2023; 6:100118. PMID: 38152461. PMCID: PMC10749876. DOI: 10.1016/j.crneur.2023.100118.
Abstract
Accurate sound perception can require integrating information over hundreds of milliseconds or even seconds. Spectro-temporal models of sound coding by single neurons in auditory cortex indicate that the majority of sound-evoked activity can be attributed to stimulus features within the preceding few tens of milliseconds. It remains uncertain how the auditory system integrates information about sensory context over longer timescales. Here, we characterized long-lasting contextual effects in auditory cortex (AC) using a diverse set of natural sound stimuli. We measured context effects as the difference in a neuron's response to a single probe sound following two different context sounds. Many AC neurons showed context effects lasting longer than the temporal window of a traditional spectro-temporal receptive field. The duration and magnitude of context effects varied substantially across neurons and stimuli. This diversity of context effects formed a sparse code across the neural population that encoded a wider range of contexts than any constituent neuron. Encoding model analysis indicates that context effects can be explained by activity in the local neural population, suggesting that recurrent local circuits support a long-lasting representation of sensory context in auditory cortex.
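The context-effect measure described here (the difference in a neuron's probe response after two different context sounds) can be sketched on simulated PSTHs. The decaying context difference, the trial counts, and the threshold are all invented for illustration; the authors' actual analysis is more elaborate.

```python
import numpy as np

def context_effect_duration(psth_a, psth_b, thresh):
    """Absolute difference between trial-averaged probe responses after two
    contexts (arrays of shape (n_trials, n_bins)), plus the last time bin
    at which the difference still exceeds `thresh` (a crude stand-in for
    the duration of the context effect; -1 if never exceeded)."""
    diff = np.abs(psth_a.mean(axis=0) - psth_b.mean(axis=0))
    above = np.flatnonzero(diff > thresh)
    return diff, (int(above[-1]) if above.size else -1)

# Demo: context A adds a response offset that decays with a ~20-bin constant
rng = np.random.default_rng(0)
t = np.arange(100)
ctx = 2.0 * np.exp(-t / 20.0)                      # decaying context effect
psth_a = 5.0 + ctx + rng.normal(0, 0.05, (200, 100))
psth_b = 5.0 + rng.normal(0, 0.05, (200, 100))
diff, dur = context_effect_duration(psth_a, psth_b, thresh=0.5)
```

A population sparse code then corresponds to each neuron showing large `diff` values for only a small subset of context-probe pairs.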
Affiliation(s)
- Mateo López Espejo
- Neuroscience Graduate Program, Oregon Health & Science University, Portland, OR, USA
- Stephen V. David
- Otolaryngology, Oregon Health & Science University, Portland, OR, USA
6. Lu S, Ang GW, Steadman M, Kozlov AS. Composite receptive fields in the mouse auditory cortex. J Physiol 2023; 601:4091-4104. PMID: 37578817. PMCID: PMC10952747. DOI: 10.1113/jp285003.
Abstract
A central question in sensory neuroscience is how neurons represent complex natural stimuli. This process involves multiple steps of feature extraction to obtain a condensed, categorical representation useful for classification and behaviour. It has previously been shown that central auditory neurons in the starling have composite receptive fields composed of multiple features. Whether this property is an idiosyncratic characteristic of songbirds, a group of highly specialized vocal learners, or a generic property of sensory processing is unknown. To address this question, we recorded responses from auditory cortical neurons in mice, and characterized their receptive fields using mouse ultrasonic vocalizations (USVs) as a natural and ethologically relevant stimulus and pitch-shifted starling songs as a natural but ethologically irrelevant control stimulus. We found that these neurons display composite receptive fields with multiple excitatory and inhibitory subunits. Moreover, this was the case for both the conspecific and the heterospecific vocalizations. We then trained the sparse filtering algorithm on both classes of natural stimuli to obtain statistically optimal features, and compared the natural and artificial features using UMAP, a dimensionality-reduction algorithm previously used to analyse mouse USVs and birdsongs. We found that the receptive-field features obtained with both types of natural stimuli clustered together, as did the sparse-filtering features; however, the natural and artificial receptive-field features clustered mostly separately. Based on these results, our general conclusion is that composite receptive fields are not a unique characteristic of specialized vocal learners but are likely a generic property of central auditory systems.
KEY POINTS:
- Auditory cortical neurons in the mouse have composite receptive fields with several excitatory and inhibitory features.
- Receptive-field features capture temporal and spectral modulations of natural stimuli.
- Ethological relevance of the stimulus affects the estimation of receptive-field dimensionality.
Affiliation(s)
- Sihao Lu
- Department of BioengineeringImperial College LondonLondonUK
- Grace W.Y. Ang
- Department of BioengineeringImperial College LondonLondonUK
- Mark Steadman
- Department of BioengineeringImperial College LondonLondonUK
7. Maruyama H, Okada K, Motoyoshi I. A two-stage spectral model for sound texture perception: Synthesis and psychophysics. i-Perception 2023; 14:20416695231157349. PMID: 36845027. PMCID: PMC9950610. DOI: 10.1177/20416695231157349.
Abstract
The natural environment is filled with a variety of auditory events such as wind blowing, water flowing, and fire crackling. It has been suggested that the perception of such textural sounds is based on the statistics of the natural auditory events. Inspired by a recent spectral model for visual texture perception, we propose a model that describes perceived sound texture using only the linear spectrum and the energy spectrum. We tested the validity of the model using synthetic noise sounds that preserve the two-stage amplitude spectra of the original sound. A psychophysical experiment showed that our synthetic noises were perceived as similar to the original sounds for 120 real-world auditory events. Performance was comparable to that of synthetic sounds produced by McDermott and Simoncelli's model, which considers various classes of auditory statistics. The results support the notion that the perception of natural sound textures is predictable from the two-stage spectral signals.
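The two stages named in this abstract can be sketched as successive amplitude spectra: the spectrum of the waveform itself (linear spectrum) and the spectrum of its amplitude envelope (energy spectrum). The Hilbert-envelope step and all signal parameters below are illustrative assumptions, not the authors' synthesis procedure.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the frequency-domain Hilbert construction."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1
    h[1:(n + 1) // 2] = 2
    if n % 2 == 0:
        h[n // 2] = 1
    return np.fft.ifft(X * h)

def two_stage_spectra(x):
    """Stage 1: amplitude spectrum of the waveform (linear spectrum).
    Stage 2: amplitude spectrum of the envelope (energy spectrum)."""
    linear = np.abs(np.fft.rfft(x))
    env = np.abs(analytic_signal(x))
    energy = np.abs(np.fft.rfft(env - env.mean()))
    return linear, energy

# Demo: a 100 Hz tone modulated at 8 Hz has a linear-spectrum peak at the
# carrier frequency and an energy-spectrum peak at the modulation rate.
fs = 1000
t = np.arange(2 * fs) / fs
x = (1 + 0.8 * np.sin(2 * np.pi * 8 * t)) * np.sin(2 * np.pi * 100 * t)
linear, energy = two_stage_spectra(x)
freqs = np.fft.rfftfreq(len(x), 1 / fs)
carrier_hz = freqs[np.argmax(linear)]
mod_hz = freqs[np.argmax(energy)]
```

Matching both spectra when synthesizing noise is the core of the two-stage model: the first stage constrains the audio spectrum, the second the envelope (modulation) spectrum.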
Affiliation(s)
- Isamu Motoyoshi
- Department of Life Sciences, The University of Tokyo, Japan
8. Gallun FJ, Coco L, Koerner TK, de Larrea-Mancera ESL, Molis MR, Eddins DA, Seitz AR. Relating Suprathreshold Auditory Processing Abilities to Speech Understanding in Competition. Brain Sci 2022; 12:695. PMID: 35741581. PMCID: PMC9221421. DOI: 10.3390/brainsci12060695.
Abstract
(1) Background: Difficulty hearing in noise is exacerbated in older adults. Older adults are more likely to have audiometric hearing loss, although some individuals with normal pure-tone audiograms also have difficulty perceiving speech in noise. Additional variables likely also account for differences in speech understanding in noise. It has been suggested that one important class of variables is the ability to process auditory information once it has been detected. Here, we tested a set of these "suprathreshold" auditory processing abilities and related them to performance on a two-part test of speech understanding in competition, with and without spatial separation of the target and masking speech. Testing was administered in the Portable Automated Rapid Testing (PART) application developed by our team; PART facilitates psychoacoustic assessments of auditory processing. (2) Methods: Forty-one individuals (average age 51 years) completed assessments of sensitivity to temporal fine structure (TFS) and spectrotemporal modulation (STM) detection via an iPad running the PART application. Statistical models were used to evaluate the strength of associations between performance on the auditory processing tasks and speech understanding in competition. Age and pure-tone average (PTA) were also included as potential predictors. (3) Results: The model providing the best fit also included age and a measure of diotic frequency modulation (FM) detection, but none of the other potential predictors. However, even the best-fitting models accounted for 31% or less of the variance, supporting work suggesting that other variables (e.g., cognitive processing abilities) also contribute significantly to speech understanding in noise. (4) Conclusions: The results of the current study do not provide strong support for previous suggestions that suprathreshold processing abilities alone can explain difficulties in speech understanding in competition among older adults. This discrepancy could be due to the speech tests used, the listeners tested, or the suprathreshold tests chosen. Future work with larger numbers of participants is warranted, including a range of cognitive tests and additional assessments of suprathreshold auditory processing abilities.
Affiliation(s)
- Frederick J. Gallun
- Oregon Hearing Research Center, Oregon Health & Science University, Portland, OR 97239, USA
- VA RR&D National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Portland, OR 97239, USA
- Correspondence: Tel.: +1-503-494-4331
- Laura Coco
- Oregon Hearing Research Center, Oregon Health & Science University, Portland, OR 97239, USA
- VA RR&D National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Portland, OR 97239, USA
- Tess K. Koerner
- Oregon Hearing Research Center, Oregon Health & Science University, Portland, OR 97239, USA
- VA RR&D National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Portland, OR 97239, USA
- Michelle R. Molis
- VA RR&D National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Portland, OR 97239, USA
- David A. Eddins
- Department of Communication Science & Disorders, University of South Florida, Tampa, FL 33620, USA
- Aaron R. Seitz
- Department of Psychology, University of California, Riverside, CA 92521, USA
9. Conroy C, Byrne AJ, Kidd G. Forward masking of spectrotemporal modulation detection. J Acoust Soc Am 2022; 151:1181. PMID: 35232084. PMCID: PMC8865928. DOI: 10.1121/10.0009404.
Abstract
Recent work has suggested that there may be specialized mechanisms in the auditory system for coding spectrotemporal modulations (STMs), tuned to different combinations of spectral modulation frequency, temporal modulation frequency, and STM sweep direction. The current study sought evidence of such mechanisms using a psychophysical forward masking paradigm. The detectability of a target comprising upward sweeping STMs was measured following the presentation of modulated maskers applied to the same carrier. Four maskers were tested, which had either (1) the same spectral modulation frequency as the target but a flat temporal envelope, (2) the same temporal modulation frequency as the target but a flat spectral envelope, (3) the same spectral and temporal modulation frequencies as the target but the opposite sweep direction (downward sweeping STMs), or (4) the same spectral and temporal modulation frequencies as the target and the same sweep direction (upward sweeping STMs). Forward masking was greatest for the masker fully matched to the target (4), intermediate for the masker with the opposite sweep direction (3), and negligible for the other two (1, 2). These findings are consistent with the suggestion that the detectability of the target was mediated by an STM-specific coding mechanism with sweep-direction selectivity.
Affiliation(s)
- Christopher Conroy
- Department of Speech, Language & Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Andrew J Byrne
- Department of Speech, Language & Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Gerald Kidd
- Department of Speech, Language & Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
10. de Larrea-Mancera ESL, Philipp MA, Stavropoulos T, Carrillo AA, Cheung S, Koerner TK, Molis MR, Gallun FJ, Seitz AR. Training with an auditory perceptual learning game transfers to speech in competition. J Cogn Enhanc 2021; 6:47-66. PMID: 34568741. PMCID: PMC8453468. DOI: 10.1007/s41465-021-00224-5.
Abstract
Understanding speech in the presence of acoustical competition is a major complaint of those with hearing difficulties. Here, a novel perceptual learning game was tested for its effectiveness in reducing difficulties with hearing speech in competition. The game was designed to train a mixture of auditory processing skills thought to underlie speech understanding in competition, such as spectral-temporal processing, sound localization, and auditory working memory. Training on these skills occurred both in quiet and in competition with noise. Thirty college-aged participants without any known hearing difficulties were assigned either to this mixed-training condition or to an active control consisting of frequency-discrimination training within the same gamified setting. To assess training effectiveness, tests of speech in competition (primary outcome), as well as basic suprathreshold auditory processing and cognitive processing abilities (secondary outcomes), were administered before and after training. Results suggest modest improvements on speech-in-competition tests in the mixed-training condition compared to the frequency-discrimination control (Cohen's d = 0.68). While the sample was small and consisted of normally hearing individuals, these data suggest promise for future study in populations with hearing difficulties.
Affiliation(s)
- E Sebastian Lelo de Larrea-Mancera
- Psychology Department, University of California, Riverside, Riverside, CA USA
- Brain Game Center, University of California, Riverside, Riverside, CA USA
- Mark A Philipp
- Brain Game Center, University of California, Riverside, Riverside, CA USA
- Sierra Cheung
- Brain Game Center, University of California, Riverside, Riverside, CA USA
- Tess K Koerner
- Oregon Health and Science University, Portland, OR USA
- VA RR&D National Center for Rehabilitative Auditory Research, Portland, OR USA
- Michelle R Molis
- Oregon Health and Science University, Portland, OR USA
- VA RR&D National Center for Rehabilitative Auditory Research, Portland, OR USA
- Frederick J Gallun
- Oregon Health and Science University, Portland, OR USA
- VA RR&D National Center for Rehabilitative Auditory Research, Portland, OR USA
- Aaron R Seitz
- Psychology Department, University of California, Riverside, Riverside, CA USA
- Brain Game Center, University of California, Riverside, Riverside, CA USA
11. Occelli F, Hasselmann F, Bourien J, Puel JL, Desvignes N, Wiszniowski B, Edeline JM, Gourévitch B. Temporal Alterations to Central Auditory Processing without Synaptopathy after Lifetime Exposure to Environmental Noise. Cereb Cortex 2021; 32:1737-1754. PMID: 34494109. DOI: 10.1093/cercor/bhab310.
Abstract
People are increasingly exposed to environmental noise through the accumulation of occupational and recreational activities, an exposure generally considered harmless to the auditory system if the sound intensity remains below 80 dB. However, recent evidence of noise-induced peripheral synaptic damage and central reorganization in the auditory cortex, despite normal audiometry results, has cast doubt on the innocuousness of lifetime exposure to environmental noise. We addressed this issue by exposing adult rats to realistic and nontraumatic environmental noise, within the daily permissible noise exposure limit for humans (80 dB sound pressure level, 8 h/day), for between 3 and 18 months. We found that temporary hearing loss could be detected after 6 months of daily exposure, without leading to permanent hearing loss or to missing synaptic ribbons in cochlear hair cells. The degraded temporal representation of sounds in the auditory cortex after 18 months of exposure was very different from the effects observed after only 3 months of exposure, suggesting that modifications to the neural code continue throughout a lifetime of exposure to noise.
Affiliation(s)
- Florian Occelli
- NeuroScience Paris-Saclay Institute (NeuroPSI), CNRS, University of Paris-Saclay, Orsay F-91405, France
- Florian Hasselmann
- Institute for Neurosciences of Montpellier (INM), INSERM, University of Montpellier, Montpellier F-34091, France
- Jérôme Bourien
- Institute for Neurosciences of Montpellier (INM), INSERM, University of Montpellier, Montpellier F-34091, France
- Jean-Luc Puel
- Institute for Neurosciences of Montpellier (INM), INSERM, University of Montpellier, Montpellier F-34091, France
- Nathalie Desvignes
- NeuroScience Paris-Saclay Institute (NeuroPSI), CNRS, University of Paris-Saclay, Orsay F-91405, France
- Bernadette Wiszniowski
- NeuroScience Paris-Saclay Institute (NeuroPSI), CNRS, University of Paris-Saclay, Orsay F-91405, France
- Jean-Marc Edeline
- NeuroScience Paris-Saclay Institute (NeuroPSI), CNRS, University of Paris-Saclay, Orsay F-91405, France
- Boris Gourévitch
- NeuroScience Paris-Saclay Institute (NeuroPSI), CNRS, University of Paris-Saclay, Orsay F-91405, France
- Institut de l'Audition, Institut Pasteur, INSERM, Paris F-75012, France
- CNRS, France
12. Characteristics of the Deconvolved Transient AEP from 80 Hz Steady-State Responses to Amplitude Modulation Stimulation. J Assoc Res Otolaryngol 2021; 22:741-753. PMID: 34415469. DOI: 10.1007/s10162-021-00806-2.
Abstract
This study aimed to validate the existence, and investigate the characteristics, of the transient responses underlying conventional auditory steady-state responses (ASSRs), using deconvolution methods capable of dealing with amplitude-modulated (AM) stimulation. Conventional ASSRs to seven stimulus rates were recorded from 17 participants. A deconvolution method was selected and modified to accommodate the AM stimulation. The calculated responses were examined in terms of temporal features with respect to different combinations of stimulus rates. Stable transient responses, consisting of early-stage brainstem responses and middle-latency responses, were reconstructed consistently for all rate combinations, indicating that the superposition hypothesis is applicable to the generation of approximately 80 Hz ASSRs evoked by AM tones (AM-ASSRs). The new transient responses are characterized by three peak-trough pairs, named n0p0, n1p1, and n2p2, within 40 ms. Compared with conventional ABR-MLR waveforms, n0p0 indicates the first neural activity, with p0 possibly representing the main ABR components; n1 is the counterpart of N10; p2 corresponds to the robust Pa at about 30 ms; p1 and n2 have no clear counterparts. Peak-to-peak amplitudes decreased slightly with increasing stimulation rate from 75 to 95 Hz, whereas peak latencies changed differently, consistent with the known rate effect on AEPs. This is the first direct evidence of a transient response derived from AM-ASSRs. The characteristic components offer insight into the constitution of AM-ASSRs and may be promising for clinical applications and fundamental studies.
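The superposition hypothesis tested here treats the steady-state recording as the convolution of an unknown transient response with the stimulus event sequence, so the transient can be recovered by deconvolution. The sketch below is a generic regularized frequency-domain deconvolution on synthetic data; the jittered impulse sequence (which makes the sequence spectrum invertible, as in CLAD-type methods) and the regularization constant are illustrative assumptions, not the specific method modified in the study.

```python
import numpy as np

def deconvolve_transient(y, s, reg=1e-3):
    """Recover the transient response h from a steady-state recording y,
    assuming y is the circular convolution of h with the stimulus impulse
    sequence s (the superposition hypothesis). Uses Wiener-style
    regularized frequency-domain division, since the sequence spectrum
    may be near zero at some frequencies.
    """
    Y, S = np.fft.fft(y), np.fft.fft(s)
    H = Y * np.conj(S) / (np.abs(S) ** 2 + reg)
    return np.real(np.fft.ifft(H))

# Demo: synthetic damped-oscillation transient and a jittered ~80 Hz train
fs = 1000
rng = np.random.default_rng(1)
n = 4 * fs
k = np.arange(40)
h_true = np.exp(-k / 8.0) * np.sin(2 * np.pi * 0.05 * k)
onsets = np.cumsum(rng.integers(10, 16, size=300))  # jittered intervals
onsets = onsets[onsets < n]
s = np.zeros(n)
s[onsets] = 1.0
y = np.real(np.fft.ifft(np.fft.fft(s) * np.fft.fft(h_true, n)))
h_est = deconvolve_transient(y, s)[:40]
err = np.linalg.norm(h_est - h_true) / np.linalg.norm(h_true)
```

At a strictly periodic 80 Hz rate the division is ill-posed (the sequence spectrum is zero between harmonics), which is why deconvolution methods for ASSRs rely on rate jitter or multiple rates.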
13.
Natural Statistics as Inference Principles of Auditory Tuning in Biological and Artificial Midbrain Networks. eNeuro 2021; 8:ENEURO.0525-20.2021. PMID: 33947687. PMCID: PMC8211468. DOI: 10.1523/eneuro.0525-20.2021.
Abstract
Bats provide a powerful mammalian model to explore the neural representation of complex sounds, as they rely on hearing to survive in their environment. The inferior colliculus (IC) is a central hub of the auditory system that receives converging projections from the ascending pathway and descending inputs from auditory cortex. In this work, we build an artificial neural network to replicate auditory characteristics in IC neurons of the big brown bat. We first test the hypothesis that spectro-temporal tuning of IC neurons is optimized to represent the natural statistics of conspecific vocalizations. We estimate spectro-temporal receptive fields (STRFs) of IC neurons and compare their tuning characteristics to the statistics of bat calls. The results indicate that the FM tuning of IC neurons is matched to these statistics. We then test this hypothesis on a network optimized to represent natural sound statistics and compare its output with biological responses. We also estimate biomimetic STRFs from the artificial network and correlate their characteristics with those of biological neurons. Tuning properties of both biological and artificial neurons reveal strong agreement along both spectral and temporal dimensions, and suggest the presence of nonlinearity, sparsity, and complexity constraints that underlie the neural representation in the auditory midbrain. Additionally, the artificial neurons replicate IC neural activity in discriminating social calls and provide simulated results for noise-robust discrimination. In this way, the biomimetic network allows us to infer the neural mechanisms by which the bat's IC processes the natural sounds used to construct the auditory scene.
14.
Pennington JR, David SV. Complementary Effects of Adaptation and Gain Control on Sound Encoding in Primary Auditory Cortex. eNeuro 2020; 7:ENEURO.0205-20.2020. PMID: 33109632. PMCID: PMC7675144. DOI: 10.1523/eneuro.0205-20.2020.
Abstract
An important step toward understanding how the brain represents complex natural sounds is to develop accurate models of auditory coding by single neurons. A commonly used model is the linear-nonlinear spectro-temporal receptive field (STRF; LN model). The LN model accounts for many features of auditory tuning, but it cannot account for long-lasting effects of sensory context on sound-evoked activity. Two mechanisms that may support these contextual effects are short-term plasticity (STP) and contrast-dependent gain control (GC), which have inspired expanded versions of the LN model. Both models improve performance over the LN model, but they have never been compared directly. Thus, it is unclear whether they account for distinct processes or describe one phenomenon in different ways. To address this question, we recorded activity of neurons in primary auditory cortex (A1) of awake ferrets during presentation of natural sounds. We then fit models incorporating one nonlinear mechanism (GC or STP) or both (GC+STP) using this single dataset, and measured the correlation between the models' predictions and the recorded neural activity. Both the STP and GC models performed significantly better than the LN model, but the GC+STP model outperformed both individual models. We also quantified the equivalence of STP and GC model predictions and found only modest similarity. Consistent results were observed for a dataset collected in clean and noisy acoustic contexts. These results establish general methods for evaluating the equivalence of arbitrarily complex encoding models and suggest that the STP and GC models describe complementary processes in the auditory system.
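The baseline LN model compared above can be sketched in a few lines. This is a generic illustration under assumed names and a simplified rectifying nonlinearity, not the paper's code: a linear spectro-temporal filtering stage followed by a static output nonlinearity, scored (as in the study) by the correlation between predicted and observed rates.

```python
import numpy as np

def ln_predict(strf, spectrogram, gain=1.0, threshold=0.0):
    """LN model: linear STRF filtering along time, then a static
    rectifying output nonlinearity."""
    n_freq, n_lags = strf.shape
    T = spectrogram.shape[1]
    lin = np.zeros(T)
    for lag in range(n_lags):
        # contribution of the spectrum `lag` time bins in the past
        lin[lag:] += strf[:, lag] @ spectrogram[:, :T - lag]
    return np.maximum(gain * (lin - threshold), 0.0)

rng = np.random.default_rng(0)
strf = rng.normal(size=(8, 5))      # 8 frequency channels, 5 time lags
spec = rng.normal(size=(8, 200))    # toy spectrogram, 200 time bins
rate = ln_predict(strf, spec)
# prediction accuracy is conventionally the correlation between the
# predicted rate and the recorded (here: simulated noisy) response
observed = rate + 0.1 * rng.normal(size=rate.size)
r = np.corrcoef(rate, observed)[0, 1]
```

The GC and STP variants discussed in the abstract extend this skeleton by letting the gain or the filter state depend on recent stimulus context.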
Affiliation(s)
- Jacob R Pennington
- Department of Mathematics, Washington State University, Vancouver, WA, 98686
- Stephen V David
- Department of Otolaryngology, Oregon Health and Science University, Portland, OR, 97239
15.
Lelo de Larrea-Mancera ES, Stavropoulos T, Hoover EC, Eddins DA, Gallun FJ, Seitz AR. Portable Automated Rapid Testing (PART) for auditory assessment: Validation in a young adult normal-hearing population. J Acoust Soc Am 2020; 148:1831. PMID: 33138479. PMCID: PMC7541091. DOI: 10.1121/10.0002108.
Abstract
This study aims to determine the degree to which Portable Automated Rapid Testing (PART), a freely available program running on a tablet computer, is capable of reproducing standard laboratory results. Undergraduate students were assigned to one of three within-subject conditions that examined repeatability of performance on a battery of psychoacoustical tests of temporal fine structure processing, spectro-temporal amplitude modulation, and targets in competition. The repeatability condition examined test/retest with the same system, the headphones condition examined the effects of varying headphones (passive and active noise-attenuating), and the noise condition examined repeatability in the presence of recorded cafeteria noise. In general, performance on the test battery showed high repeatability, even across manipulated conditions, and was similar to that reported in the literature. These data serve as validation that suprathreshold psychoacoustical tests can be made accessible to run on consumer-grade hardware and perform in less controlled settings. This dataset also provides a distribution of thresholds that can be used as a normative baseline against which auditory dysfunction can be identified in future work.
Affiliation(s)
- Trevor Stavropoulos
- Brain Game Center, University of California Riverside, 1201 University Avenue, Riverside, California 92521, USA
- Eric C Hoover
- University of Maryland, College Park, Maryland 20742, USA
- Aaron R Seitz
- Psychology Department, University of California, Riverside, 900 University Avenue, Riverside, California 92521, USA
16.
Kaya EM, Huang N, Elhilali M. Pitch, Timbre and Intensity Interdependently Modulate Neural Responses to Salient Sounds. Neuroscience 2020; 440:1-14. PMID: 32445938. DOI: 10.1016/j.neuroscience.2020.05.018.
Abstract
As we listen to everyday sounds, auditory perception is heavily shaped by interactions between acoustic attributes such as pitch, timbre and intensity, though it is not clear how such interactions affect judgments of acoustic salience in dynamic soundscapes. Salience perception is believed to rely on an internal brain model that tracks the evolution of acoustic characteristics of a scene and flags events that do not fit this model as salient. The current study explores how the interdependency between attributes of dynamic scenes affects the neural representation of this internal model and shapes encoding of salient events. Specifically, the study examines how deviations along combinations of acoustic attributes interact to modulate brain responses, and subsequently guide perception of certain sound events as salient given their context. Human volunteers focused their attention on a visual task and ignored acoustic melodies playing in the background while their brain activity was recorded using electroencephalography. Ambient sounds consisted of musical melodies with probabilistically varying acoustic attributes. Salient notes embedded in these scenes deviated from the melody's statistical distribution along pitch, timbre and/or intensity. Recordings of brain responses to salient notes reveal that neural power in response to the melodic rhythm, as well as cross-trial phase alignment in the theta band, is modulated by the degree of salience of the notes, estimated across all acoustic attributes given their probabilistic context. These nonlinear neural effects across attributes strongly parallel the nonlinear behavioral interactions observed in perceptual judgments of auditory salience using similar dynamic melodies, suggesting a neural underpinning of the nonlinear interactions that underlie salience perception.
Affiliation(s)
- Emine Merve Kaya
- Laboratory for Computational Audio Perception, Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
- Nicolas Huang
- Laboratory for Computational Audio Perception, Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
- Mounya Elhilali
- Laboratory for Computational Audio Perception, Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
17.
Ten Oever S, Sack AT. Interactions Between Rhythmic and Feature Predictions to Create Parallel Time-Content Associations. Front Neurosci 2019; 13:791. PMID: 31427917. PMCID: PMC6688653. DOI: 10.3389/fnins.2019.00791.
Abstract
The brain is inherently proactive, constantly predicting the when (moment) and what (content) of future input in order to optimize information processing. Previous research on such predictions has mainly studied the "when" or "what" domain separately, without investigating the potential integration of both types of predictive information. In the absence of such integration, temporal cues are assumed to enhance any upcoming content at the predicted moment in time (general temporal predictor). However, if the when and what prediction domains were integrated, a much more flexible neural mechanism may be proposed in which temporal-feature interactions would allow for the creation of multiple concurrent time-content predictions (parallel time-content predictor). Here, we used a temporal association paradigm in two experiments in which sound identity was systematically paired with a specific time delay after the offset of a rhythmic visual input stream. In Experiment 1, we revealed that participants associated the time delay of presentation with the identity of the sound. In Experiment 2, we unexpectedly found that the strength of this temporal association was negatively related to the EEG steady-state evoked responses (SSVEPs) in preceding trials, showing that after high neuronal responses participants responded inconsistently with the time-content associations, similar to adaptation mechanisms. In this experiment, time-content associations were present only for low SSVEP responses in previous trials. These results tentatively show that it is possible to represent multiple time-content paired predictions in parallel; however, future research is needed to investigate this interaction further.
Affiliation(s)
- Sanne Ten Oever
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands; Maastricht Brain Imaging Centre, Maastricht, Netherlands
- Alexander T Sack
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands; Maastricht Brain Imaging Centre, Maastricht, Netherlands
18.
Galindo-Leon EE, Stitt I, Pieper F, Stieglitz T, Engler G, Engel AK. Context-specific modulation of intrinsic coupling modes shapes multisensory processing. Sci Adv 2019; 5:eaar7633. PMID: 30989107. PMCID: PMC6457939. DOI: 10.1126/sciadv.aar7633.
Abstract
Intrinsically generated patterns of coupled neuronal activity are associated with the dynamics of specific brain states. Sensory inputs are extrinsic factors that can perturb these intrinsic coupling modes, creating a complex scenario in which forthcoming stimuli are processed. Studying this intrinsic-extrinsic interplay is necessary to better understand perceptual integration and selection. Here, we show that this interplay leads to a reconfiguration of functional cortical connectivity that acts as a mechanism to facilitate stimulus processing. Using audiovisual stimulation in anesthetized ferrets, we found that this reconfiguration of coupling modes is context specific, depending on long-term modulation by repetitive sensory inputs. These reconfigured coupling modes lead to changes in latencies and power of local field potential responses that support multisensory integration. Our study demonstrates that this interplay extends across multiple time scales and involves different types of intrinsic coupling. These results suggest a previously unknown large-scale mechanism that facilitates multisensory integration.
Affiliation(s)
- Edgar E. Galindo-Leon
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, 20246 Hamburg, Germany
- Iain Stitt
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, 20246 Hamburg, Germany
- Florian Pieper
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, 20246 Hamburg, Germany
- Thomas Stieglitz
- Department of Microsystems Engineering, University of Freiburg, 79110 Freiburg, Germany
- Gerhard Engler
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, 20246 Hamburg, Germany
- Andreas K. Engel
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, 20246 Hamburg, Germany
19.
Flinker A, Doyle WK, Mehta AD, Devinsky O, Poeppel D. Spectrotemporal modulation provides a unifying framework for auditory cortical asymmetries. Nat Hum Behav 2019; 3:393-405. PMID: 30971792. PMCID: PMC6650286. DOI: 10.1038/s41562-019-0548-z.
Abstract
The principles underlying functional asymmetries in cortex remain debated. For example, it is accepted that speech is processed bilaterally in auditory cortex, but a left hemisphere dominance emerges when the input is interpreted linguistically. The mechanisms, however, are contested: what sound features or processing principles underlie laterality? Recent findings across species (humans, canines, bats) provide converging evidence that spectrotemporal sound features drive asymmetrical responses. Typically, accounts invoke models wherein the hemispheres differ in time-frequency resolution or integration window size. We develop a framework that builds on and unifies prevailing models, using spectrotemporal modulation space. Using signal processing techniques motivated by neural responses, we test this approach employing behavioral and neurophysiological measures. We show how psychophysical judgments align with spectrotemporal modulations and then characterize the neural sensitivities to temporal and spectral modulations. We demonstrate differential contributions from both hemispheres, with a left lateralization for temporal modulations and a weaker right lateralization for spectral modulations. We argue that representations in the modulation domain provide a more mechanistic basis to account for lateralization in auditory cortex.
Affiliation(s)
- Adeen Flinker
- Department of Psychology, New York University, New York, NY, USA; Department of Neurology, New York University School of Medicine, New York, NY, USA
- Werner K Doyle
- Department of Neurosurgery, New York University School of Medicine, New York, NY, USA
- Ashesh D Mehta
- Department of Neurosurgery, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Manhasset, NY, USA
- Orrin Devinsky
- Department of Neurology, New York University School of Medicine, New York, NY, USA
- David Poeppel
- Department of Psychology, New York University, New York, NY, USA; Max Planck Institute for Empirical Aesthetics, Frankfurt, Germany
20.
Venezia JH, Thurman SM, Richards VM, Hickok G. Hierarchy of speech-driven spectrotemporal receptive fields in human auditory cortex. Neuroimage 2018; 186:647-666. PMID: 30500424. DOI: 10.1016/j.neuroimage.2018.11.049.
Abstract
Existing data indicate that cortical speech processing is hierarchically organized. Numerous studies have shown that early auditory areas encode fine acoustic details while later areas encode abstracted speech patterns. However, it remains unclear precisely what speech information is encoded across these hierarchical levels. Estimation of speech-driven spectrotemporal receptive fields (STRFs) provides a means to explore cortical speech processing in terms of acoustic or linguistic information associated with characteristic spectrotemporal patterns. Here, we estimate STRFs from cortical responses to continuous speech in fMRI. Using a novel approach based on filtering randomly-selected spectrotemporal modulations (STMs) from aurally-presented sentences, STRFs were estimated for a group of listeners and categorized using a data-driven clustering algorithm. 'Behavioral STRFs' highlighting STMs crucial for speech recognition were derived from intelligibility judgments. Clustering revealed that STRFs in the supratemporal plane represented a broad range of STMs, while STRFs in the lateral temporal lobe represented circumscribed STM patterns important to intelligibility. Detailed analysis recovered a bilateral organization with posterior-lateral regions preferentially processing STMs associated with phonological information and anterior-lateral regions preferentially processing STMs associated with word- and phrase-level information. Regions in lateral Heschl's gyrus preferentially processed STMs associated with vocalic information (pitch).
Affiliation(s)
- Jonathan H Venezia
- VA Loma Linda Healthcare System, Loma Linda, CA, USA; Dept. of Otolaryngology, School of Medicine, Loma Linda University, Loma Linda, CA, USA
- Virginia M Richards
- Depts. of Cognitive Sciences and Language Science, University of California, Irvine, Irvine, CA, USA
- Gregory Hickok
- Depts. of Cognitive Sciences and Language Science, University of California, Irvine, Irvine, CA, USA
21.
The effect of presentation level on spectrotemporal modulation detection. Hear Res 2018; 371:11-18. PMID: 30439570. DOI: 10.1016/j.heares.2018.10.017.
Abstract
The understanding of speech in noise relies (at least partially) on spectrotemporal modulation sensitivity. This sensitivity can be measured by spectral ripple tests, which can be administered at different presentation levels. However, it is not known how presentation level affects spectrotemporal modulation thresholds. In this work, we present behavioral data for normal-hearing adults which show that at higher ripple densities (2 and 4 ripples/oct), increasing presentation level led to worse discrimination thresholds. Results of a computational model suggested that the higher thresholds could be explained by a worsening of the spectrotemporal representation in the auditory nerve due to broadening of cochlear filters and neural activity saturation. Our results demonstrate the importance of taking presentation level into account when administering spectrotemporal modulation detection tests.
22.
Ozmeral EJ, Eddins AC, Eddins DA. How Do Age and Hearing Loss Impact Spectral Envelope Perception? J Speech Lang Hear Res 2018; 61:2376-2385. PMID: 30178062. PMCID: PMC6195040. DOI: 10.1044/2018_jslhr-h-18-0056.
Abstract
Purpose: The goal was to evaluate the potential effects of increasing hearing loss and advancing age on spectral envelope perception. Method: Spectral modulation detection was measured as a function of spectral modulation frequency from 0.5 to 8.0 cycles/octave. The spectral modulation task involved discriminating a noise carrier (3 octaves wide, from 400 to 3200 Hz) with a flat spectral envelope from a noise having a sinusoidal spectral envelope across a logarithmic audio frequency scale. Spectral modulation transfer functions (SMTFs; modulation threshold vs. modulation frequency) were computed and compared across four listener groups: young normal hearing, older normal hearing, older with mild hearing loss, and older with moderate hearing loss. Estimates of the internal spectral contrast were obtained by computing excitation patterns. Results: SMTFs for young listeners with normal hearing were bandpass with a minimum modulation detection threshold at 2 cycles/octave, and SMTFs for older listeners with normal hearing were remarkably similar to those of the young listeners. SMTFs for older listeners with mild and moderate hearing loss had a low-pass rather than a bandpass shape. Excitation patterns revealed that limited spectral resolution dictated modulation detection thresholds at high but not low spectral modulation frequencies. Even when factoring out (presumed) differences in frequency resolution among groups, spectral envelope perception was worse for the group with moderate hearing loss than for the other three groups. Conclusions: Spectral envelope perception, as measured by spectral modulation detection thresholds, is compromised by hearing loss at higher spectral modulation frequencies, consistent with predictions of the reduced spectral resolution known to accompany sensorineural hearing loss. Spectral envelope perception is not negatively impacted by advancing age at any spectral modulation frequency between 0.5 and 8.0 cycles/octave.
Affiliation(s)
- Erol J. Ozmeral
- Department of Communication Sciences & Disorders, University of South Florida, Tampa
- Ann C. Eddins
- Department of Communication Sciences & Disorders, University of South Florida, Tampa
- David A. Eddins
- Department of Communication Sciences & Disorders, University of South Florida, Tampa
23.
David SV. Incorporating behavioral and sensory context into spectro-temporal models of auditory encoding. Hear Res 2018; 360:107-123. PMID: 29331232. PMCID: PMC6292525. DOI: 10.1016/j.heares.2017.12.021.
Abstract
For several decades, auditory neuroscientists have used spectro-temporal encoding models to understand how neurons in the auditory system represent sound. Derived from early applications of systems identification tools to the auditory periphery, the spectro-temporal receptive field (STRF) and more sophisticated variants have emerged as an efficient means of characterizing representation throughout the auditory system. Most of these encoding models describe neurons as static sensory filters. However, auditory neural coding is not static. Sensory context, reflecting the acoustic environment, and behavioral context, reflecting the internal state of the listener, can both influence sound-evoked activity, particularly in central auditory areas. This review explores recent efforts to integrate context into spectro-temporal encoding models. It begins with a brief tutorial on the basics of estimating and interpreting STRFs. Then it describes three recent studies that have characterized contextual effects on STRFs, emerging over a range of timescales, from many minutes to tens of milliseconds. An important theme of this work is not simply that context influences auditory coding, but also that contextual effects span a large continuum of internal states. The added complexity of these context-dependent models introduces new experimental and theoretical challenges that must be addressed in order to be used effectively. Several new methodological advances promise to address these limitations and allow the development of more comprehensive context-dependent models in the future.
Affiliation(s)
- Stephen V David
- Oregon Hearing Research Center, Oregon Health & Science University, 3181 SW Sam Jackson Park Rd, MC L335A, Portland, OR 97239, United States
24.
Hoover EC, Eddins AC, Eddins DA. Distribution of spectral modulation transfer functions in a young, normal-hearing population. J Acoust Soc Am 2018; 143:306. PMID: 29390785. PMCID: PMC5777922. DOI: 10.1121/1.5020787.
Abstract
Spectral modulation transfer functions (SMTFs) were measured in 49 young (18-35 years of age) normal-hearing listeners. Noise carriers spanned six octaves from 200 to 12 800 Hz. Sinusoidal (on a log-amplitude scale) spectral modulation with random starting phase was superimposed on the carrier at spectral modulation frequencies of 0.25, 0.5, 1.0, 2.0, 4.0, and 8.0 cycles/octave. Modulation detection thresholds (in dB) yielded SMTFs that were bandpass in nature, consistent with previous investigations reporting data for only a few subjects. Thresholds were notably consistent across subjects despite minimal practice. Population statistics are reported that may serve as reference data for future studies.
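The stimulus construction described above can be sketched compactly. The following is an illustrative reconstruction, not the authors' code (sampling rate, duration, modulation depth, and normalization are assumed values): a noise carrier restricted to 200-12 800 Hz whose log-amplitude spectral envelope is sinusoidal in cycles/octave, with random ripple starting phase and random component phases.

```python
import numpy as np

def spectral_ripple(fs=44100, dur=0.5, f_lo=200.0, f_hi=12800.0,
                    density=2.0, depth_db=10.0, rng=None):
    """Noise carrier (f_lo..f_hi) with a sinusoidal spectral envelope on a
    log-amplitude (dB) scale; `density` is in cycles/octave."""
    rng = np.random.default_rng() if rng is None else rng
    n = int(fs * dur)
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    octaves = np.log2(freqs[band] / f_lo)       # log2 frequency axis
    phase = rng.uniform(0.0, 2.0 * np.pi)       # random ripple starting phase
    env_db = 0.5 * depth_db * np.sin(2.0 * np.pi * density * octaves + phase)
    spec = np.zeros(freqs.size, dtype=complex)
    spec[band] = 10.0 ** (env_db / 20.0) * np.exp(
        1j * rng.uniform(0.0, 2.0 * np.pi, band.sum()))  # random carrier phases
    x = np.fft.irfft(spec, n=n)
    return x / np.max(np.abs(x))                # normalize peak amplitude

x = spectral_ripple(rng=np.random.default_rng(1))
```

In a detection task, density = 0 (flat envelope) gives the standard stimulus and the listener's threshold is the smallest depth_db that remains discriminable.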
Affiliation(s)
- Eric C Hoover
- Department of Communication Sciences and Disorders, University of South Florida, 4202 East Fowler Avenue, PCD 1017, Tampa, Florida 32620, USA
- Ann C Eddins
- Department of Communication Sciences and Disorders, University of South Florida, 4202 East Fowler Avenue, PCD 1017, Tampa, Florida 32620, USA
- David A Eddins
- Department of Communication Sciences and Disorders, University of South Florida, 4202 East Fowler Avenue, PCD 1017, Tampa, Florida 32620, USA
25.
Cluster-based analysis improves predictive validity of spike-triggered receptive field estimates. PLoS One 2017; 12:e0183914. PMID: 28877194. PMCID: PMC5587334. DOI: 10.1371/journal.pone.0183914.
Abstract
Spectrotemporal receptive field (STRF) characterization is a central goal of auditory physiology. STRFs are often approximated by the spike-triggered average (STA), which reflects the average stimulus preceding a spike. In many cases, the raw STA is subjected to a threshold defined by gain values expected by chance. However, such correction methods have not been universally adopted, and the consequences of specific gain-thresholding approaches have not been investigated systematically. Here, we evaluate two classes of statistical correction techniques, using the resulting STRF estimates to predict responses to a novel validation stimulus. The first, more traditional technique eliminated STRF pixels (time-frequency bins) with gain values expected by chance. This correction method yielded significant increases in prediction accuracy, including when the threshold setting was optimized for each unit. The second technique was a two-step thresholding procedure wherein clusters of contiguous pixels surviving an initial gain threshold were then subjected to a cluster mass threshold based on summed pixel values. This approach significantly improved upon even the best gain-thresholding techniques. Additional analyses suggested that allowing threshold settings to vary independently for excitatory and inhibitory subfields of the STRF resulted in only marginal additional gains, at best. In summary, augmenting reverse correlation techniques with principled statistical correction choices increased prediction accuracy by over 80% for multi-unit STRFs and by over 40% for single-unit STRFs, furthering the interpretational relevance of the recovered spectrotemporal filters for auditory systems analysis.
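The reverse-correlation pipeline evaluated above can be sketched as follows. This is a schematic illustration under assumed names and a simplified null distribution (circular shifts of the spike train), not the paper's exact procedure: compute the STA, then zero out time-frequency pixels whose gain does not exceed the chance level.

```python
import numpy as np

def spike_triggered_average(stim, spikes, n_lags):
    """Average the stimulus window preceding each spike.
    stim: (n_freq, T) spectrogram; spikes: length-T spike counts."""
    sta = np.zeros((stim.shape[0], n_lags))
    total = 0
    for t in np.nonzero(spikes)[0]:
        if t >= n_lags:
            sta += spikes[t] * stim[:, t - n_lags:t]
            total += spikes[t]
    return sta / max(total, 1)

def gain_threshold(sta, stim, spikes, n_lags, n_null=50, alpha=0.05, seed=0):
    """Zero out pixels whose magnitude does not exceed chance, estimated
    from STAs of circularly shifted spike trains."""
    rng = np.random.default_rng(seed)
    null = np.stack([
        spike_triggered_average(
            stim, np.roll(spikes, rng.integers(100, spikes.size - 100)), n_lags)
        for _ in range(n_null)])
    crit = np.quantile(np.abs(null), 1.0 - alpha, axis=0)
    out = sta.copy()
    out[np.abs(sta) < crit] = 0.0
    return out

# Toy data: a neuron driven by one time-frequency feature
rng = np.random.default_rng(0)
stim = rng.normal(size=(6, 5000))
drive = np.maximum(stim[2, :], 0.0)          # sensitive to channel 2
spikes = (rng.uniform(size=5000) < 0.05 * drive).astype(float)
spikes = np.roll(spikes, 1)                  # spike follows feature by 1 bin
sta = spike_triggered_average(stim, spikes, n_lags=4)
sta_thr = gain_threshold(sta, stim, spikes, n_lags=4)
```

The cluster-mass step described in the abstract would additionally group contiguous surviving pixels and test their summed values against a cluster-level null.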
26.
Zheng Y, Escabí M, Litovsky RY. Spectro-temporal cues enhance modulation sensitivity in cochlear implant users. Hear Res 2017; 351:45-54. PMID: 28601530. DOI: 10.1016/j.heares.2017.05.009.
Abstract
Although speech understanding is highly variable among cochlear implant (CI) users, the remarkably high speech recognition performance of many CI users is unexpected and not well understood. Numerous factors, including neural health and degradation of the spectral information in the speech signal delivered by CIs, likely contribute to speech understanding. We studied the ability to use spectro-temporal modulations, which may be critical for speech understanding and discrimination, and hypothesized that CI users adopt a different perceptual strategy than normal-hearing (NH) individuals, relying more heavily on joint spectro-temporal cues to enhance detection of auditory cues. Modulation detection sensitivity was studied in CI users and NH subjects using broadband "ripple" stimuli that were modulated spectrally, temporally, or jointly, i.e., spectro-temporally. The spectro-temporal modulation transfer functions of CI users and NH subjects were decomposed into spectral and temporal dimensions and compared with those subjects' spectral-only and temporal-only modulation transfer functions. In CI users, joint spectro-temporal sensitivity was better than that predicted from spectral-only and temporal-only sensitivity, indicating heightened spectro-temporal sensitivity. Such enhancement through the combined integration of spectral and temporal cues was not observed in NH subjects. This unique use of spectro-temporal cues by CI users may benefit the use of cues that are important for speech understanding. The finding has implications for developing sound processing strategies that rely on joint spectro-temporal modulations to improve speech comprehension in CI users, and may be valuable for developing clinical assessment tools to optimize CI processor performance.
Collapse
Affiliation(s)
- Yi Zheng
- Waisman Center, University of Wisconsin Madison, 1500 Highland Avenue, Madison, WI, 53705, USA
| | - Monty Escabí
- Biomedical Engineering, Electrical and Computer Engineering, University of Connecticut, 371 Fairfield Rd., U1157, Storrs, CT, 06269, USA
| | - Ruth Y Litovsky
- Waisman Center, University of Wisconsin Madison, 1500 Highland Avenue, Madison, WI, 53705, USA.
| |
Collapse
|
27
|
Boubenec Y, Lawlor J, Górska U, Shamma S, Englitz B. Detecting changes in dynamic and complex acoustic environments. eLife 2017; 6. [PMID: 28262095 PMCID: PMC5367897 DOI: 10.7554/elife.24910] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2017] [Accepted: 03/04/2017] [Indexed: 01/28/2023] Open
Abstract
Natural sounds, such as wind or rain, are characterized by the statistical occurrence of their constituents. Despite this complexity, listeners readily detect changes in these contexts. Here we address the neural basis of statistical decision-making using a combination of psychophysics, EEG, and modelling. In a texture-based change-detection paradigm, human performance and reaction times improved with longer pre-change exposure, consistent with improved estimation of baseline statistics. Change-locked and decision-related EEG responses were found at a centro-parietal scalp location, whose slope depended on change size, consistent with sensory evidence accumulation. The potential's amplitude scaled with the duration of pre-change exposure, suggesting a time-dependent decision threshold. Auditory cortex-related potentials showed no response to the change. A dual-timescale statistical estimation model accounted for subjects' performance. Furthermore, a decision-augmented auditory cortex model accounted for performance and reaction times, suggesting that the primary cortical representation requires little post-processing to enable change detection in complex acoustic environments.
Collapse
Affiliation(s)
- Yves Boubenec
- Laboratoire des Systèmes Perceptifs, CNRS UMR 8248, Paris, France; Département d'études cognitives, École normale supérieure, PSL Research University, Paris, France
| | - Jennifer Lawlor
- Laboratoire des Systèmes Perceptifs, CNRS UMR 8248, Paris, France; Département d'études cognitives, École normale supérieure, PSL Research University, Paris, France
| | - Urszula Górska
- Department of Neurophysiology, Donders Centre for Neuroscience, Radboud Universiteit, Nijmegen, Netherlands; Psychophysiology Laboratory, Institute of Psychology, Jagiellonian University, Krakow, Poland; Smoluchowski Institute of Physics, Jagiellonian University, Krakow, Poland
| | - Shihab Shamma
- Laboratoire des Systèmes Perceptifs, CNRS UMR 8248, Paris, France; Département d'études cognitives, École normale supérieure, PSL Research University, Paris, France; Department of Electrical and Computer Engineering, University of Maryland, College Park, United States; Institute for Systems Research, University of Maryland, College Park, United States
| | - Bernhard Englitz
- Laboratoire des Systèmes Perceptifs, CNRS UMR 8248, Paris, France; Département d'études cognitives, École normale supérieure, PSL Research University, Paris, France; Department of Neurophysiology, Donders Centre for Neuroscience, Radboud Universiteit, Nijmegen, Netherlands
| |
Collapse
|
28
|
Oetjen A, Verhey JL. Characteristics of spectro-temporal modulation frequency selectivity in humans. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 141:1887. [PMID: 28372116 DOI: 10.1121/1.4976537] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
There is increasing evidence that the auditory system shows frequency selectivity for spectro-temporal modulations. A previous study by the authors reported spectro-temporal modulation masking patterns consistent with the hypothesis of spectro-temporal modulation filters in the human auditory system [Oetjen and Verhey (2015). J. Acoust. Soc. Am. 137(2), 714-723]. In the present study, those experimental data and additional data were used to model this spectro-temporal frequency selectivity. The additional data were collected to investigate to what extent spectro-temporal modulation-frequency selectivity results from a combination of a purely temporal amplitude-modulation filter and a purely spectral amplitude-modulation filter. In contrast to the previous study, thresholds were measured for masker and target modulations with opposite directions, i.e., an upward-pointing target modulation and a downward-pointing masker modulation. Comparison of this data set with the previous corresponding data, in which target and masker modulations had the same direction, indicates that a specific spectro-temporal modulation filter is required to simulate all aspects of spectro-temporal modulation frequency selectivity. A model using a modified Gabor filter, in combination with a purely temporal and a purely spectral filter, predicts the spectro-temporal modulation masking data.
Collapse
Affiliation(s)
- Arne Oetjen
- Acoustics Group, Carl von Ossietzky University Oldenburg, Carl von Ossietzky Strasse 9-11, 26129 Oldenburg, Germany
| | - Jesko L Verhey
- Department of Experimental Audiology, Otto von Guericke University Magdeburg, 39120 Magdeburg, Germany
| |
Collapse
|
29
|
Keine C, Rübsamen R, Englitz B. Inhibition in the auditory brainstem enhances signal representation and regulates gain in complex acoustic environments. eLife 2016; 5. [PMID: 27855778 PMCID: PMC5148601 DOI: 10.7554/elife.19295] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2016] [Accepted: 11/17/2016] [Indexed: 12/30/2022] Open
Abstract
Inhibition plays a crucial role in neural signal processing, shaping and limiting responses. In the auditory system, inhibition already modulates second-order neurons in the cochlear nucleus, e.g., spherical bushy cells (SBCs). While the physiological basis of inhibition and excitation is well described, their functional interaction in signal processing remains elusive. Using a combination of in vivo loose-patch recordings, iontophoretic drug application, and detailed signal analysis in the Mongolian gerbil, we demonstrate that inhibition is widely co-tuned with excitation and leads to only minor sharpening of the spectral response properties. Combinations of complex stimuli and neuronal input-output analysis based on spectrotemporal receptive fields revealed that inhibition renders the neuronal output temporally sparser and more reproducible than the input. Overall, inhibition plays a central role in improving the temporal response fidelity of SBCs across a wide range of input intensities and thereby provides the basis for high-fidelity signal processing.
Collapse
Affiliation(s)
- Christian Keine
- Faculty of Bioscience, Pharmacy and Psychology, University of Leipzig, Leipzig, Germany
| | - Rudolf Rübsamen
- Faculty of Bioscience, Pharmacy and Psychology, University of Leipzig, Leipzig, Germany
| | - Bernhard Englitz
- Department of Neurophysiology, Donders Center for Neuroscience, Radboud University, Nijmegen, Netherlands
| |
Collapse
|
30
|
Deneux T, Kempf A, Daret A, Ponsot E, Bathellier B. Temporal asymmetries in auditory coding and perception reflect multi-layered nonlinearities. Nat Commun 2016; 7:12682. [PMID: 27580932 PMCID: PMC5025791 DOI: 10.1038/ncomms12682] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2016] [Accepted: 07/22/2016] [Indexed: 11/10/2022] Open
Abstract
Sound recognition relies not only on spectral cues, but also on temporal cues, as demonstrated by the profound impact of time reversals on perception of common sounds. To address the coding principles underlying such auditory asymmetries, we recorded a large sample of auditory cortex neurons using two-photon calcium imaging in awake mice, while playing sounds ramping up or down in intensity. We observed clear asymmetries in cortical population responses, including stronger cortical activity for up-ramping sounds, which matches perceptual saliency assessments in mice and previous measures in humans. Analysis of cortical activity patterns revealed that auditory cortex implements a map of spatially clustered neuronal ensembles, detecting specific combinations of spectral and intensity modulation features. Comparing different models, we show that cortical responses result from multi-layered nonlinearities, which, contrary to standard receptive field models of auditory cortex function, build divergent representations of sounds with similar spectral content, but different temporal structure. In humans, sounds that increase in intensity over time (up-ramp) are perceived as louder than down-ramping sounds. Here the authors show that in mice this bias also exists and is reflected in the complex nonlinearities of auditory cortex activity.
Collapse
Affiliation(s)
- Thomas Deneux
- Unité de Neuroscience, Information et Complexité (UNIC), Centre National de la Recherche Scientifique, UPR 3293, F-91198 Gif-sur-Yvette, France
| | - Alexandre Kempf
- Unité de Neuroscience, Information et Complexité (UNIC), Centre National de la Recherche Scientifique, UPR 3293, F-91198 Gif-sur-Yvette, France
| | - Aurélie Daret
- Unité de Neuroscience, Information et Complexité (UNIC), Centre National de la Recherche Scientifique, UPR 3293, F-91198 Gif-sur-Yvette, France
| | - Emmanuel Ponsot
- Institut de Recherche et de Coordination Acoustique/Musique (IRCAM), Centre National de la Recherche Scientifique, UMR 9912, F-75004 Paris, France
| | - Brice Bathellier
- Unité de Neuroscience, Information et Complexité (UNIC), Centre National de la Recherche Scientifique, UPR 3293, F-91198 Gif-sur-Yvette, France
| |
Collapse
|
31
|
Venezia JH, Hickok G, Richards VM. Auditory "bubbles": Efficient classification of the spectrotemporal modulations essential for speech intelligibility. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2016; 140:1072. [PMID: 27586738 PMCID: PMC5848825 DOI: 10.1121/1.4960544] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/11/2023]
Abstract
Speech intelligibility depends on the integrity of spectrotemporal patterns in the signal. The current study is concerned with the speech modulation power spectrum (MPS), which is a two-dimensional representation of energy at different combinations of temporal and spectral (i.e., spectrotemporal) modulation rates. A psychophysical procedure was developed to identify the regions of the MPS that contribute to successful reception of auditory sentences. The procedure, based on the two-dimensional image classification technique known as "bubbles" (Gosselin and Schyns (2001). Vision Res. 41, 2261-2271), involves filtering (i.e., degrading) the speech signal by removing parts of the MPS at random, and relating filter patterns to observer performance (keywords identified) over a number of trials. The result is a classification image (CImg) or "perceptual map" that emphasizes regions of the MPS essential for speech intelligibility. This procedure was tested using normal-rate and 2×-time-compressed sentences. The results indicated: (a) CImgs could be reliably estimated in individual listeners in relatively few trials, (b) CImgs tracked changes in spectrotemporal modulation energy induced by time compression, though not completely, indicating that "perceptual maps" deviated from physical stimulus energy, and
Collapse
Affiliation(s)
- Jonathan H Venezia
- Department of Cognitive Sciences, University of California, Irvine, 3151 Social Science Plaza, Irvine, California 92697-5100, USA
| | - Gregory Hickok
- Department of Cognitive Sciences, University of California, Irvine, 3151 Social Science Plaza, Irvine, California 92697-5100, USA
| | - Virginia M Richards
- Department of Cognitive Sciences, University of California, Irvine, 3151 Social Science Plaza, Irvine, California 92697-5100, USA
| |
Collapse
|
32
|
Thorson IL, Liénard J, David SV. The Essential Complexity of Auditory Receptive Fields. PLoS Comput Biol 2015; 11:e1004628. [PMID: 26683490 PMCID: PMC4684325 DOI: 10.1371/journal.pcbi.1004628] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2015] [Accepted: 10/26/2015] [Indexed: 12/05/2022] Open
Abstract
Encoding properties of sensory neurons are commonly modeled using linear finite impulse response (FIR) filters. For the auditory system, the FIR filter is instantiated in the spectro-temporal receptive field (STRF), often in the framework of the generalized linear model. Despite widespread use of the FIR STRF, numerous formulations for linear filters are possible that require many fewer parameters, potentially permitting more efficient and accurate model estimates. To explore these alternative STRF architectures, we recorded single-unit neural activity from auditory cortex of awake ferrets during presentation of natural sound stimuli. We compared performance of > 1000 linear STRF architectures, evaluating their ability to predict neural responses to a novel natural stimulus. Many were able to outperform the FIR filter. Two basic constraints on the architecture led to the improved performance: (1) factorization of the STRF matrix into a small number of spectral and temporal filters, and (2) low-dimensional parameterization of the factorized filters. The best parameterized model was able to outperform the full FIR filter in both primary and secondary auditory cortex, despite requiring fewer than 30 parameters, about 10% of the number required by the FIR filter. After accounting for noise from finite data sampling, these STRFs were able to explain an average of 40% of A1 response variance. The simpler models permitted more straightforward interpretation of sensory tuning properties. They also showed greater benefit from incorporating nonlinear terms, such as short-term plasticity, that provide theoretical advances over the linear model. Architectures that minimize parameter count while maintaining maximum predictive power provide insight into the essential degrees of freedom governing auditory cortical function. They also maximize the statistical power available for characterizing additional nonlinear properties that limit current auditory models.
Understanding how the brain solves sensory problems can provide useful insight for the development of automated systems such as speech recognizers and image classifiers. Recent developments in nonlinear regression and machine learning have produced powerful algorithms for characterizing the input-output relationship of complex systems. However, the complexity of sensory neural systems, combined with practical limitations on experimental data, make it difficult to apply arbitrarily complex analyses to neural data. In this study we pushed analysis in the opposite direction, toward simpler models. We asked how simple a model can be while still capturing the essential sensory properties of neurons in auditory cortex. We found that substantially simpler formulations of the widely-used spectro-temporal receptive field are able to perform as well as the best current models. These simpler formulations define new basis sets that can be incorporated into state-of-the-art machine learning algorithms for a more exhaustive exploration of sensory processing.
Collapse
Affiliation(s)
- Ivar L. Thorson
- Oregon Hearing Research Center, Oregon Health & Science University, Portland, Oregon, United States of America
| | - Jean Liénard
- Department of Mathematics, Washington State University, Vancouver, Washington, United States of America
| | - Stephen V. David
- Oregon Hearing Research Center, Oregon Health & Science University, Portland, Oregon, United States of America
| |
Collapse
|
33
|
Thakur CS, Wang RM, Afshar S, Hamilton TJ, Tapson JC, Shamma SA, van Schaik A. Sound stream segregation: a neuromorphic approach to solve the "cocktail party problem" in real-time. Front Neurosci 2015; 9:309. [PMID: 26388721 PMCID: PMC4557082 DOI: 10.3389/fnins.2015.00309] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2015] [Accepted: 08/18/2015] [Indexed: 11/13/2022] Open
Abstract
The human auditory system has the ability to segregate complex auditory scenes into a foreground component and a background, allowing us to listen to specific speech sounds from a mixture of sounds. Selective attention plays a crucial role in this process, colloquially known as the "cocktail party effect." It has not been possible to build a machine that can emulate this human ability in real-time. Here, we have developed a framework for the implementation of a neuromorphic sound segregation algorithm in a Field Programmable Gate Array (FPGA). This algorithm is based on the principles of temporal coherence and uses an attention signal to separate a target sound stream from background noise. Temporal coherence implies that auditory features belonging to the same sound source are coherently modulated and evoke highly correlated neural response patterns. The basis for this form of sound segregation is that responses from pairs of channels that are strongly positively correlated belong to the same stream, while channels that are uncorrelated or anti-correlated belong to different streams. In our framework, we have used a neuromorphic cochlea as a frontend sound analyser to extract spatial information of the sound input, which then passes through band pass filters that extract the sound envelope at various modulation rates. Further stages include feature extraction and mask generation, which is finally used to reconstruct the targeted sound. Using sample tonal and speech mixtures, we show that our FPGA architecture is able to segregate sound sources in real-time. The accuracy of segregation is indicated by the high signal-to-noise ratio (SNR) of the segregated stream (90, 77, and 55 dB for simple tone, complex tone, and speech, respectively) as compared to the SNR of the mixture waveform (0 dB). 
This system may easily be extended to the segregation of complex speech signals and may thus find applications in electronic devices for sound segregation and speech recognition.
Collapse
Affiliation(s)
- Chetan Singh Thakur
- Biomedical Engineering and Neuroscience, The MARCS Institute, University of Western Sydney, Sydney, NSW, Australia
| | - Runchun M. Wang
- Biomedical Engineering and Neuroscience, The MARCS Institute, University of Western Sydney, Sydney, NSW, Australia
| | - Saeed Afshar
- Biomedical Engineering and Neuroscience, The MARCS Institute, University of Western Sydney, Sydney, NSW, Australia
| | - Tara J. Hamilton
- Biomedical Engineering and Neuroscience, The MARCS Institute, University of Western Sydney, Sydney, NSW, Australia
| | - Jonathan C. Tapson
- Biomedical Engineering and Neuroscience, The MARCS Institute, University of Western Sydney, Sydney, NSW, Australia
| | - Shihab A. Shamma
- Department of Electrical and Computer Engineering and Institute for Systems Research, University of Maryland, College Park, MD, USA
| | - André van Schaik
- Biomedical Engineering and Neuroscience, The MARCS Institute, University of Western Sydney, Sydney, NSW, Australia
Collapse
|
34
|
Todd NPM, Lee CS. Source analysis of electrophysiological correlates of beat induction as sensory-guided action. Front Psychol 2015; 6:1178. [PMID: 26321991 PMCID: PMC4536380 DOI: 10.3389/fpsyg.2015.01178] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2014] [Accepted: 07/27/2015] [Indexed: 11/13/2022] Open
Abstract
In this paper we present a reanalysis of electrophysiological data originally collected to test a sensory-motor theory of beat induction (Todd et al., 2002; Todd and Seiss, 2004; Todd and Lee, 2015). The reanalysis is conducted in the light of more recent findings, in particular the demonstration that auditory evoked potentials contain a vestibular dependency. At the core of the analysis is a model which predicts brain dipole source current activity over time in temporal and frontal lobe areas during passive listening to a rhythm or active synchronization, and which dissociates the frontal activity into distinct sources that can be identified as pre-motor and motor in origin, respectively. The model successfully captures the main features of the rhythm, showing that the metrical structure is manifest as an increase in source current activity during strong compared to weak beats. In addition, the outcomes of the modeling suggest that: (1) activity in both temporal and frontal areas contributes to the metrical percept, and this activity is distributed over time; (2) transient, time-locked activity associated with anticipated beats is increased when a temporal expectation is confirmed following a previous violation, such as a syncopation; and (3) two distinct processes are involved in auditory cortex, corresponding to tangential and radial (possibly vestibular-dependent) current sources. We discuss the implications of these outcomes for the insights they give into the origin of metrical structure and the power of syncopation to induce movement and create a sense of groove.
Collapse
Affiliation(s)
- Neil P. M. Todd
- Faculty of Life Science, University of Manchester, Manchester, UK
| | | |
Collapse
|
35
|
Bizley JK, Bajo VM, Nodal FR, King AJ. Cortico-Cortical Connectivity Within Ferret Auditory Cortex. J Comp Neurol 2015; 523:2187-210. [PMID: 25845831 PMCID: PMC4737260 DOI: 10.1002/cne.23784] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2014] [Revised: 03/29/2015] [Accepted: 04/01/2015] [Indexed: 12/29/2022]
Abstract
Despite numerous studies of auditory cortical processing in the ferret (Mustela putorius), very little is known about the connections between the different regions of the auditory cortex that have been characterized cytoarchitectonically and physiologically. We examined the distribution of retrograde and anterograde labeling after injecting tracers into one or more regions of ferret auditory cortex. Injections of different tracers at frequency-matched locations in the core areas, the primary auditory cortex (A1) and anterior auditory field (AAF), of the same animal revealed the presence of reciprocal connections with overlapping projections to and from discrete regions within the posterior pseudosylvian and suprasylvian fields (PPF and PSF), suggesting that these connections are frequency specific. In contrast, projections from the primary areas to the anterior dorsal field (ADF) on the anterior ectosylvian gyrus were scattered and non-overlapping, consistent with the non-tonotopic organization of this field. The relative strength of the projections originating in each of the primary fields differed, with A1 predominantly targeting the posterior bank fields PPF and PSF, which in turn project to the ventral posterior field, whereas AAF projects more heavily to the ADF, which then projects to the anteroventral field and the pseudosylvian sulcal cortex. These findings suggest that parallel anterior and posterior processing networks may exist, although the connections between different areas often overlap and interactions were present at all levels.
Collapse
Affiliation(s)
- Jennifer K Bizley
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, OX1 3PT, United Kingdom; Ear Institute, University College London, London, WC1X 8EE, United Kingdom
| | - Victoria M Bajo
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, OX1 3PT, United Kingdom
| | | | - Andrew J King
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, OX1 3PT, United Kingdom
| |
Collapse
|
36
|
Spectrotemporal response properties of core auditory cortex neurons in awake monkey. PLoS One 2015; 10:e0116118. [PMID: 25680187 PMCID: PMC4332665 DOI: 10.1371/journal.pone.0116118] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/06/2014] [Accepted: 12/03/2014] [Indexed: 11/19/2022] Open
Abstract
So far, most studies of core auditory cortex (AC) have characterized the spectral and temporal tuning properties of cells in anesthetized preparations. As experiments in awake animals are scarce, here we used dynamic spectral-temporal broadband ripples to study the properties of the spectrotemporal receptive fields (STRFs) of AC cells in awake monkeys. We show that AC neurons were typically most sensitive to low ripple densities (spectral) and low velocities (temporal), and that most cells were not selective for a particular spectrotemporal sweep direction. A substantial proportion of neurons preferred amplitude-modulated sounds (at zero ripple density) to dynamic ripples (at non-zero densities). The vast majority (>93%) of modulation transfer functions were separable with respect to spectral and temporal modulations, indicating that time and spectrum are independently processed in AC neurons. We also analyzed the linear predictability of AC responses to natural vocalizations on the basis of the STRF. We discuss our findings in the light of results obtained from the monkey midbrain inferior colliculus by comparing the spectrotemporal tuning properties and linear predictability of these two important auditory stages.
Collapse
|
37
|
Oetjen A, Verhey JL. Spectro-temporal modulation masking patterns reveal frequency selectivity. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2015; 137:714-723. [PMID: 25698006 DOI: 10.1121/1.4906171] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]
Abstract
The present study investigated the possibility that the human auditory system demonstrates frequency selectivity for spectro-temporal amplitude modulations. Threshold modulation depth for detecting sinusoidal spectro-temporal modulations was measured using a generalized masked threshold pattern paradigm with narrowband masker modulations. Four target spectro-temporal modulations were examined, differing in their temporal and spectral modulation frequencies: a temporal modulation of -8, 8, or 16 Hz combined with a spectral modulation of 1 cycle/octave, and a temporal modulation of 4 Hz combined with a spectral modulation of 0.5 cycles/octave. The temporal center frequencies of the masker modulation ranged from 0.25 to 4 times the target temporal modulation. The spectral center frequencies of the masker modulation were 0, 0.5, 1, 1.5, and 2 times the target spectral modulation. For all target modulations, the pattern of average thresholds for the eight normal-hearing listeners was consistent with the hypothesis of a spectro-temporal modulation filter. Such a pattern of modulation-frequency sensitivity was predicted on the basis of psychoacoustical data for purely temporal amplitude modulations and purely spectral amplitude modulations. An analysis of separability indicates that, for the present data set, selectivity in the spectro-temporal modulation domain can be described by a combination of a purely spectral and a purely temporal modulation filter function.
Collapse
Affiliation(s)
- Arne Oetjen
- Acoustics Group, Carl von Ossietzky University Oldenburg, Carl von Ossietzky Str. 9-11, 26129 Oldenburg, Germany
| | - Jesko L Verhey
- Department of Experimental Audiology, Otto von Guericke University Magdeburg, 39120 Magdeburg, Germany
| |
Collapse
|
38
|
Koka K, Tollin DJ. Linear coding of complex sound spectra by discharge rate in neurons of the medial nucleus of the trapezoid body (MNTB) and its inputs. Front Neural Circuits 2014; 8:144. [PMID: 25565971 PMCID: PMC4267272 DOI: 10.3389/fncir.2014.00144] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2014] [Accepted: 11/25/2014] [Indexed: 11/25/2022] Open
Abstract
The interaural level difference (ILD) cue to sound location is first encoded in the lateral superior olive (LSO). ILD sensitivity results because the LSO receives excitatory input from the ipsilateral cochlear nucleus and inhibitory input indirectly from the contralateral cochlear nucleus via glycinergic neurons of the ipsilateral medial nucleus of the trapezoid body (MNTB). It is hypothesized that, in order for LSO neurons to encode ILDs, the sound spectra at both ears must be accurately encoded via spike rate by their afferents. This spectral-coding hypothesis has not been directly tested in the MNTB, likely because MNTB neurons have mostly been described and studied with regard to their ability to encode temporal, rather than spectral, aspects of sounds. Here, we test the hypothesis that MNTB neurons and their inputs from the cochlear nucleus and auditory nerve encode sound spectra via discharge rate. The Random Spectral Shape (RSS) method was used to estimate how the levels of 100-ms-duration, spectrally stationary stimuli were weighted, both linearly and non-linearly, across a wide band of frequencies. In general, MNTB neurons and their globular bushy cell inputs were found to be well modeled by a linear weighting of spectra, demonstrating that the pathways through the MNTB can accurately encode sound spectra, including those resulting from the acoustical cues to sound location provided by head-related directional transfer functions (DTFs). Together with the anatomical and biophysical specializations for timing in the MNTB-LSO complex, these mechanisms may allow ILDs to be computed for complex stimuli with rapid spectrotemporally modulated envelopes, such as speech, animal vocalizations, and moving sound sources.
Collapse
Affiliation(s)
- Kanthaiah Koka
- Department of Physiology and Biophysics, University of Colorado School of Medicine, Aurora, CO, USA
| | - Daniel J Tollin
- Department of Physiology and Biophysics, University of Colorado School of Medicine, Aurora, CO, USA
| |
Collapse
|
39
|
Akram S, Englitz B, Elhilali M, Simon JZ, Shamma SA. Investigating the neural correlates of a streaming percept in an informational-masking paradigm. PLoS One 2014; 9:e114427. [PMID: 25490720 PMCID: PMC4260833 DOI: 10.1371/journal.pone.0114427] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2014] [Accepted: 11/10/2014] [Indexed: 11/19/2022] Open
Abstract
Humans routinely segregate a complex acoustic scene into different auditory streams, through the extraction of bottom-up perceptual cues and the use of top-down selective attention. To determine the neural mechanisms underlying this process, neural responses obtained through magnetoencephalography (MEG) were correlated with behavioral performance in the context of an informational-masking paradigm. In half the trials, subjects were asked to detect frequency deviants in a target stream, consisting of a rhythmic tone sequence, embedded in a separate masker stream composed of a random cloud of tones. In the other half of the trials, subjects were exposed to identical stimuli but asked to perform a different task: to detect tone-length changes in the random cloud of tones. To verify that the normalized neural response to the target sequence served as an indicator of streaming, we correlated neural responses with behavioral performance under a variety of stimulus parameters (target tone rate, target tone frequency, and the "protection zone", that is, the spectral region with no tones around the target frequency) and attentional states (changing the task objective while maintaining the same stimuli). In all conditions that facilitated target/masker streaming behaviorally, the normalized MEG responses also changed in a manner consistent with the behavior. Thus, attending to the target stream caused a significant increase in the power and phase coherence of the responses in the recording channels, which correlated with an increase in the listeners' behavioral performance. Normalized neural target responses also increased as the protection zone widened and as the frequency of the target tones increased. Finally, when the target sequence rate increased, the buildup of the normalized neural responses was significantly faster, mirroring the accelerated buildup of the streaming percept. Our data thus support close links between the perceptual and neural consequences of auditory stream segregation.
Affiliation(s)
- Sahar Akram
- The Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
- Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland, United States of America
- Bernhard Englitz
- The Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
- Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland, United States of America
- Département d'Etudes Cognitives, Ecole normale supérieure, Paris, France
- Department of Neurophysiology, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- Mounya Elhilali
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
- Jonathan Z. Simon
- Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland, United States of America
- Department of Biology, University of Maryland, College Park, Maryland, United States of America
- Shihab A. Shamma
- The Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
- Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland, United States of America
- Département d'Etudes Cognitives, Ecole normale supérieure, Paris, France

40
Lazar AA, Slutskiy YB. Channel identification machines for multidimensional receptive fields. Front Comput Neurosci 2014; 8:117. [PMID: 25309413] [PMCID: PMC4176398] [DOI: 10.3389/fncom.2014.00117]
Abstract
We present algorithms for identifying multidimensional receptive fields directly from spike trains produced by biophysically-grounded neuron models. We demonstrate that only the projection of a receptive field onto the input stimulus space may be perfectly identified and derive conditions under which this identification is possible. We also provide detailed examples of identification of neural circuits incorporating spatiotemporal and spectrotemporal receptive fields.
Affiliation(s)
- Aurel A Lazar
- Bionet Group, Department of Electrical Engineering, Columbia University in the City of New York, New York, NY, USA
- Yevgeniy B Slutskiy
- Bionet Group, Department of Electrical Engineering, Columbia University in the City of New York, New York, NY, USA

41
Divenyi P. Decreased ability in the segregation of dynamically changing vowel-analog streams: a factor in the age-related cocktail-party deficit? Front Neurosci 2014; 8:144. [PMID: 24971047] [PMCID: PMC4054799] [DOI: 10.3389/fnins.2014.00144]
Abstract
Pairs of harmonic complexes with different fundamental frequencies f0 (105 and 189 Hz or 105 and 136 Hz) but identical bandwidth (0.25–3 kHz) were band-pass filtered using a filter with the same 1-kHz center frequency. The filter's center frequency was modulated with a triangular wave at a 5-Hz modulation frequency fmod to obtain a pair of vowel-analog waveforms with dynamically varying single-formant transitions. The target signal S contained a single modulation cycle starting either at a phase of −π/2 (up-down) or π/2 (down-up), whereas the longer distracter N contained several cycles of the modulating triangular wave starting at a random phase. The level at which the target formant's modulation phase could be correctly identified was adaptively determined for several distracter levels and several extents of frequency swing (10–55%) in a group of experienced young normal-hearing listeners and a group of experienced elderly listeners with no more than moderate hearing loss. The most important result was that, for the two f0 differences, all distracter levels, and all frequency-swing extents tested, the elderly listeners needed S/N ratios about 20 dB larger than the young listeners did. Results also indicate that identification thresholds of both the elderly and the young listeners are between 4 and 12 dB higher than similarly determined detection thresholds and that, contrary to detection, identification is not a linear function of distracter level. Since formant transitions represent potent cues for speech intelligibility, the large S/N ratios required by the elderly for correct discrimination of single-formant transition dynamics may at least partially explain the well-documented loss of intelligibility of speech in babble noise by the elderly.
Affiliation(s)
- Pierre Divenyi
- Department of Music, Center for Computer Research in Music and Acoustics, Stanford University, Stanford, CA, USA; Speech and Hearing Research, Veterans Affairs Northern California Health Care System, Martinez, CA, USA

42
Chabot-Leclerc A, Jørgensen S, Dau T. The role of auditory spectro-temporal modulation filtering and the decision metric for speech intelligibility prediction. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 135:3502-12. [PMID: 24907813] [DOI: 10.1121/1.4873517]
Abstract
Speech intelligibility models typically consist of a preprocessing part that transforms stimuli into some internal (auditory) representation and a decision metric that relates the internal representation to speech intelligibility. The present study analyzed the role of modulation filtering in the preprocessing of different speech intelligibility models by comparing predictions from models that either assume a spectro-temporal (i.e., two-dimensional) or a temporal-only (i.e., one-dimensional) modulation filterbank. Furthermore, the role of the decision metric for speech intelligibility was investigated by comparing predictions from models based on the signal-to-noise envelope power ratio, SNRenv, and the modulation transfer function, MTF. The models were evaluated in conditions of noisy speech (1) subjected to reverberation, (2) distorted by phase jitter, or (3) processed by noise reduction via spectral subtraction. The results suggested that a decision metric based on the SNRenv may provide a more general basis for predicting speech intelligibility than a metric based on the MTF. Moreover, the one-dimensional modulation filtering process was found to be sufficient to account for the data when combined with a measure of across (audio) frequency variability at the output of the auditory preprocessing. A complex spectro-temporal modulation filterbank might therefore not be required for speech intelligibility prediction.
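The signal-to-noise envelope power ratio (SNRenv) metric discussed above can be sketched in a toy form: compare the envelope power of a modulated target and a flat-envelope masker in a single modulation band. The actual sEPSM front end uses a gammatone filterbank and a modulation filterbank; the signals, bandwidths, and parameters below are hypothetical simplifications.

```python
import numpy as np

fs = 8000
t = np.arange(0, 1.0, 1 / fs)
rng = np.random.default_rng(1)

# Toy "speech": a 1-kHz carrier with a strong 4-Hz envelope; toy masker:
# flat-envelope Gaussian noise. One audio band, one modulation band only.
speech = (1 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)
noise = rng.normal(0, 0.5, t.size)

def env_power(x, fmod=4.0, bw=2.0):
    """AC envelope power of x in a band around fmod (Hz), via the FFT."""
    env = np.abs(x) - np.abs(x).mean()        # crude envelope, DC removed
    spec = np.fft.rfft(env)
    freqs = np.fft.rfftfreq(env.size, 1 / fs)
    band = (freqs >= fmod - bw) & (freqs <= fmod + bw)
    return np.sum(np.abs(spec[band]) ** 2) / env.size ** 2

# SNRenv: target-to-masker ratio of envelope power in the modulation band.
snr_env_db = 10 * np.log10(env_power(speech) / env_power(noise))
print(snr_env_db)   # well above 0 dB: the modulated target dominates
```

The design intuition is the one the abstract tests: a metric built on envelope power ratios stays sensitive to processing (e.g. spectral subtraction) that changes envelope fluctuations of the noise, which a plain MTF-based metric can miss.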
Affiliation(s)
- Alexandre Chabot-Leclerc
- Department of Electrical Engineering, Centre for Applied Hearing Research, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
- Søren Jørgensen
- Department of Electrical Engineering, Centre for Applied Hearing Research, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
- Torsten Dau
- Department of Electrical Engineering, Centre for Applied Hearing Research, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark

43
A new and fast characterization of multiple encoding properties of auditory neurons. Brain Topogr 2014; 28:379-400. [PMID: 24869676] [DOI: 10.1007/s10548-014-0375-5]
Abstract
The functional properties of auditory cortex neurons are most often investigated separately, through spectrotemporal receptive fields (STRFs) for frequency tuning and through frequency-sweep sounds for selectivity to sweep velocity and direction. In fact, auditory neurons are sensitive to a multidimensional space of acoustic parameters in which spectral, temporal, and spatial dimensions interact. We designed a multi-parameter stimulus, the random double sweep (RDS), composed of two uncorrelated random sweeps, which gives easy, fast, and simultaneous access to frequency tuning as well as frequency-modulation sweep direction and velocity selectivity, frequency interactions, and temporal properties of neurons. Reverse-correlation techniques applied to recordings from the primary auditory cortex of guinea pigs and rats in response to RDS stimulation revealed the variety of temporal dynamics of acoustic patterns evoking an enhanced or suppressed firing rate. Group results for these two species revealed less frequent suppression areas in frequency-tuning STRFs, the absence of downward-sweep selectivity, and lower phase-locking abilities in the auditory cortex of rats compared to guinea pigs.
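The reverse-correlation technique this abstract relies on can be illustrated in one dimension with a spike-triggered average (STA) on white noise. The model neuron, its filter, and all parameters below are hypothetical stand-ins, not the guinea-pig or rat data.

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical model neuron spikes according to a linear filter applied
# to Gaussian white noise (a stand-in for the random sweeps); the STA
# recovers the filter up to a scale factor.
n_t, n_lags = 20000, 30
stim = rng.normal(size=n_t)
true_filter = np.exp(-np.arange(n_lags) / 5.0) * np.sin(np.arange(n_lags) / 3.0)

# Drive spiking with the rectified, filtered stimulus (Poisson counts).
drive = np.convolve(stim, true_filter)[:n_t]
spikes = rng.poisson(0.1 * np.clip(drive, 0, None))

# STA: spike-weighted mean of the stimulus segment preceding each spike.
sta = np.zeros(n_lags)
for i in range(n_lags, n_t):
    sta += spikes[i] * stim[i - n_lags + 1:i + 1][::-1]
sta /= spikes[n_lags:].sum()

# For Gaussian white noise, the STA is proportional to the linear filter.
print(np.corrcoef(sta, true_filter)[0, 1])   # close to 1
```

The RDS analysis generalizes this idea to two simultaneous sweep dimensions, which is what exposes frequency interactions and direction selectivity in one recording session.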
44
Abstract
Complex natural and environmental sounds, such as speech and music, convey information along both spectral and temporal dimensions. The cortical representation of such stimuli rapidly adapts when animals become actively engaged in discriminating them. In this study, we examine the nature of these changes using simplified spectrotemporal versions (upward- vs downward-shifting tone sequences) with domestic ferrets (Mustela putorius). Cortical processing rapidly adapted to enhance the contrast between the two discriminated stimulus categories, by changing spectrotemporal receptive field properties to encode both the spectral and temporal structure of the tone sequences. Furthermore, the valence of the changes was closely linked to the task reward structure: stimuli associated with negative reward became enhanced relative to those associated with positive reward. These task- and stimulus-related spectrotemporal receptive field changes occurred only in trained animals during, and immediately following, behavior. This plasticity was independently confirmed by parallel changes in a directionality function measured from the responses to the transitions of tone sequences during task performance. The results demonstrate that induced patterns of rapid plasticity closely reflect the spectrotemporal structure of the task stimuli, thus extending the functional relevance of rapid task-related plasticity to the perception and learning of natural sounds such as speech and animal vocalizations.
45
Abstract
Speech and other natural vocalizations are characterized by large modulations in their sound envelope. The timing of these modulations contains critical information for discrimination of important features, such as phonemes. We studied how depression of synaptic inputs, a mechanism frequently reported in cortex, can contribute to the encoding of envelope dynamics. Using a nonlinear stimulus-response model that accounted for synaptic depression, we predicted responses of neurons in ferret primary auditory cortex (A1) to stimuli with natural temporal modulations. The depression model consistently performed better than linear and second-order models previously used to characterize A1 neurons, and it produced more biologically plausible fits. To test how synaptic depression can contribute to temporal stimulus integration, we used nonparametric maximum a posteriori decoding to compare the ability of neurons showing and not showing depression to reconstruct the stimulus envelope. Neurons showing evidence for depression reconstructed stimuli over a longer range of latencies. These findings suggest that variation in depression across the cortical population supports a rich code for representing the temporal dynamics of natural sounds.
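The contribution of synaptic depression to envelope coding described above can be sketched with a simple resource-depletion model in the spirit of Tsodyks-Markram dynamics. The parameters and input below are illustrative assumptions, not values fitted to the ferret A1 recordings.

```python
import numpy as np

# Resource-depletion sketch: available resources x recover with time
# constant tau_rec and a fraction u is consumed per unit of presynaptic
# drive, so sustained input is depressed relative to onsets.
dt = 0.001                                                 # 1-ms step (s)
t = np.arange(0, 1.0, dt)
envelope = (np.sin(2 * np.pi * 4 * t) >= 0).astype(float)  # 4-Hz on/off
rate_in = 100.0 * envelope                                 # presyn. rate (Hz)

tau_rec, u = 0.2, 0.5        # recovery time constant (s), release fraction
x = 1.0                      # available synaptic resources (0..1)
drive = np.zeros(t.size)
for i in range(t.size):
    drive[i] = u * x * rate_in[i]                  # postsynaptic drive
    x += dt * ((1.0 - x) / tau_rec - u * x * rate_in[i])

# The response decays within each envelope plateau: onsets dominate,
# which is why depression extends the latencies a neuron integrates over.
print(drive[1], drive[100])   # onset response >> sustained response
```

A stimulus-response model augmented with a state variable like `x` can capture history dependence that linear and second-order models miss, which is the comparison the abstract reports.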
46
Santoro R, Moerel M, De Martino F, Goebel R, Ugurbil K, Yacoub E, Formisano E. Encoding of natural sounds at multiple spectral and temporal resolutions in the human auditory cortex. PLoS Comput Biol 2014; 10:e1003412. [PMID: 24391486] [PMCID: PMC3879146] [DOI: 10.1371/journal.pcbi.1003412]
Abstract
Functional neuroimaging research provides detailed observations of the response patterns that natural sounds (e.g. human voices and speech, animal cries, environmental sounds) evoke in the human brain. The computational and representational mechanisms underlying these observations, however, remain largely unknown. Here we combine high spatial resolution (3 and 7 Tesla) functional magnetic resonance imaging (fMRI) with computational modeling to reveal how natural sounds are represented in the human brain. We compare competing models of sound representations and select the model that most accurately predicts fMRI response patterns to natural sounds. Our results show that the cortical encoding of natural sounds entails the formation of multiple representations of sound spectrograms with different degrees of spectral and temporal resolution. The cortex derives these multi-resolution representations through frequency-specific neural processing channels and through the combined analysis of the spectral and temporal modulations in the spectrogram. Furthermore, our findings suggest that a spectral-temporal resolution trade-off may govern the modulation tuning of neuronal populations throughout the auditory cortex. Specifically, our fMRI results suggest that neuronal populations in posterior/dorsal auditory regions preferably encode coarse spectral information with high temporal precision. Vice versa, neuronal populations in anterior/ventral auditory regions preferably encode fine-grained spectral information with low temporal precision. We propose that such a multi-resolution analysis may be crucially relevant for flexible and behaviorally relevant sound processing and may constitute one of the computational underpinnings of functional specialization in auditory cortex.

How does the human brain analyze natural sounds? Previous functional neuroimaging research could only describe the response patterns that sounds evoke in the human brain at the level of preferential regional activations. A comprehensive account of the neural basis of human hearing, however, requires deriving computational models that are able to provide quantitative predictions of brain responses to natural sounds. Here, we make a significant step in this direction by combining functional magnetic resonance imaging (fMRI) with computational modeling. We compare competing computational models of sound representations and select the model that most accurately predicts the measured fMRI response patterns. The computational models describe the processing of three relevant properties of natural sounds: frequency, temporal modulations, and spectral modulations. We find that a model that represents spectral and temporal modulations jointly and in a frequency-dependent fashion provides the best account of fMRI responses and that the functional specialization of auditory cortical fields can be partially accounted for by their modulation tuning. Our results provide insights into how natural sounds are encoded in human auditory cortex, and our methodological approach constitutes an advance in the way this question can be addressed in future studies.
Affiliation(s)
- Roberta Santoro
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands
- Maastricht Brain Imaging Center (MBIC), Maastricht, The Netherlands
- Michelle Moerel
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands
- Maastricht Brain Imaging Center (MBIC), Maastricht, The Netherlands
- Federico De Martino
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands
- Maastricht Brain Imaging Center (MBIC), Maastricht, The Netherlands
- Center for Magnetic Resonance Research, Department of Radiology, University of Minnesota, Minneapolis, Minnesota, United States of America
- Rainer Goebel
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands
- Maastricht Brain Imaging Center (MBIC), Maastricht, The Netherlands
- Department of Neuroimaging and Neuromodeling, Netherlands Institute for Neuroscience, Royal Netherlands Academy of Arts and Sciences (KNAW), Amsterdam, The Netherlands
- Kamil Ugurbil
- Center for Magnetic Resonance Research, Department of Radiology, University of Minnesota, Minneapolis, Minnesota, United States of America
- Essa Yacoub
- Center for Magnetic Resonance Research, Department of Radiology, University of Minnesota, Minneapolis, Minnesota, United States of America
- Elia Formisano
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands
- Maastricht Brain Imaging Center (MBIC), Maastricht, The Netherlands

47
Lazar AA, Slutskiy YB. Functional identification of spike-processing neural circuits. Neural Comput 2013; 26:264-305. [PMID: 24206386] [DOI: 10.1162/neco_a_00543]
Abstract
We introduce a novel approach for a complete functional identification of biophysical spike-processing neural circuits. The circuits considered accept multidimensional spike trains as their input and comprise a multitude of temporal receptive fields and conductance-based models of action potential generation. Each temporal receptive field describes the spatiotemporal contribution of all synapses between any two neurons and incorporates the (passive) processing carried out by the dendritic tree. The aggregate dendritic current produced by a multitude of temporal receptive fields is encoded into a sequence of action potentials by a spike generator modeled as a nonlinear dynamical system. Our approach builds on the observation that during any experiment, an entire neural circuit, including its receptive fields and biophysical spike generators, is projected onto the space of stimuli used to identify the circuit. Employing the reproducing kernel Hilbert space (RKHS) of trigonometric polynomials to describe input stimuli, we quantitatively describe the relationship between underlying circuit parameters and their projections. We also derive experimental conditions under which these projections converge to the true parameters. In doing so, we achieve the mathematical tractability needed to characterize the biophysical spike generator and identify the multitude of receptive fields. The algorithms obviate the need to repeat experiments in order to compute the neurons' rate of response, rendering our methodology of interest to both experimental and theoretical neuroscientists.
Affiliation(s)
- Aurel A Lazar
- Department of Electrical Engineering, Columbia University, New York, NY 10027, U.S.A.

48
Understanding the neurophysiological basis of auditory abilities for social communication: a perspective on the value of ethological paradigms. Hear Res 2013; 305:3-9. [PMID: 23994815] [DOI: 10.1016/j.heares.2013.08.008]
Abstract
Acoustic communication between animals requires them to detect, discriminate, and categorize conspecific or heterospecific vocalizations in their natural environment. Laboratory studies of the auditory-processing abilities that facilitate these tasks have typically employed a broad range of acoustic stimuli, ranging from natural sounds like vocalizations to "artificial" sounds like pure tones and noise bursts. However, even when using vocalizations, laboratory studies often test abilities like categorization in relatively artificial contexts. Consequently, it is not clear whether the neural and behavioral correlates of these tasks (1) reflect extensive operant training, which drives plastic changes in auditory pathways, or (2) reflect the innate capacity of the animal and its auditory system. Here, we review a number of recent studies which suggest that adopting more ethological paradigms utilizing natural communication contexts is scientifically important for elucidating how the auditory system normally processes and learns communication sounds. Additionally, since learning the meaning of communication sounds generally involves social interactions that engage neuromodulatory systems differently than laboratory-based conditioning paradigms, we argue that scientists need to pursue more ethological approaches to more fully inform our understanding of how the auditory system is engaged during acoustic communication. This article is part of a Special Issue entitled "Communication Sounds and the Brain: New Directions and Perspectives".
49
Abstract
Receptive fields (RFs) of neurons in primary visual cortex have traditionally been subdivided into two major classes: "simple" and "complex" cells. Simple cells were originally defined by the existence of segregated subregions within their RF that respond to either the on- or offset of a light bar and by spatial summation within each of these regions, whereas complex cells had ON and OFF regions that were coextensive in space [Hubel DH, et al. (1962) J Physiol 160:106-154]. Although other definitions based on the linearity of response modulation have been proposed later [Movshon JA, et al. (1978) J Physiol 283:53-77; Skottun BC, et al. (1991) Vision Res 31(7-8):1079-1086], the segregation of ON and OFF subregions has remained an important criterion for the distinction between simple and complex cells. Here we report that response profiles of neurons in primary auditory cortex of monkeys show a similar distinction: one group of cells has segregated ON and OFF subregions in frequency space; and another group shows ON and OFF responses within largely overlapping response profiles. This observation is intriguing for two reasons: (i) spectrotemporal dissociation in the auditory domain provides a basic neural mechanism for the segregation of sounds, a fundamental prerequisite for auditory figure-ground discrimination; and (ii) the existence of similar types of RF organization in visual and auditory cortex would support the existence of a common canonical processing algorithm within cortical columns.
50
Functional localization of the auditory thalamus in individual human subjects. Neuroimage 2013; 78:295-304. [PMID: 23603350] [DOI: 10.1016/j.neuroimage.2013.04.035]
Abstract
Here we describe an easily implemented protocol, based on sparse MR acquisition and a scrambled 'music' auditory stimulus, that allows reliable measurement of functional activity within the medial geniculate body (MGB, the primary auditory thalamic nucleus) in individual subjects. We find that our method is as accurate and reliable as previously developed structural methods, and that it identifies the MGB significantly more accurately than group-based methods. We also find that lateralization and binaural summation within the MGB resemble those found in the auditory cortex.