1
Mackey C, Tarabillo A, Ramachandran R. Three psychophysical metrics of auditory temporal integration in macaques. J Acoust Soc Am 2021; 150:3176. [PMID: 34717465] [PMCID: PMC8556002] [DOI: 10.1121/10.0006658]
Abstract
The relationship between sound duration and detection threshold has long been thought to reflect temporal integration. Reports of species differences in this relationship are equivocal: some meta-analyses report no species differences, whereas others report substantial differences, particularly between humans and their close phylogenetic relatives, macaques. This renders translational work in macaques problematic. To reevaluate this difference, tone detection performance was measured in macaques using a go/no-go reaction time (RT) task at various tone durations and in the presence of broadband noise (BBN). Detection thresholds, RTs, and the dynamic range (DR) of the psychometric function decreased as the tone duration increased. The threshold by duration trends suggest macaques integrate at a similar rate to humans. The RT trends also resemble human data and are the first reported in animals. Whereas the BBN did not affect how the threshold or RT changed with the duration, it substantially reduced the DR at short durations. A probabilistic Poisson model replicated the effects of duration on threshold and DR and required integration from multiple simulated auditory nerve fibers to explain the performance at shorter durations. These data suggest that, contrary to previous studies, macaques are uniquely well-suited to model human temporal integration and form the baseline for future neurophysiological studies.
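The probabilistic Poisson model is only summarized in the abstract above. As a rough illustration of the mechanism it describes (spike counts integrated over the tone duration and pooled across simulated fibers, so that longer tones become easier to detect), a sketch with entirely invented rates and parameters, not the authors' fitted values, might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def detect_prob(driven_rate, spont_rate=60.0, dur=0.05, n_fibers=5, n_trials=2000):
    """Monte Carlo estimate of the probability that the summed Poisson spike
    count on tone trials exceeds a criterion fixed from noise-alone trials.
    Counts are pooled across n_fibers simulated auditory nerve fibers."""
    noise = rng.poisson(spont_rate * dur * n_fibers, n_trials)
    tone = rng.poisson((spont_rate + driven_rate) * dur * n_fibers, n_trials)
    criterion = np.quantile(noise, 0.95)       # cap false alarms near 5%
    return float(np.mean(tone > criterion))

# Detectability at a fixed driven rate grows with duration, the qualitative
# threshold-by-duration effect the model is said to replicate.
for dur in (0.01, 0.05, 0.2):
    print(dur, detect_prob(driven_rate=40.0, dur=dur))
```

The same structure makes the role of fiber pooling easy to probe: reducing `n_fibers` lowers detectability most at the shortest durations, consistent with the abstract's claim that multi-fiber integration is needed there.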
Affiliation(s)
- Chase Mackey
- Neuroscience Graduate Program, Vanderbilt University, Nashville, Tennessee 37240, USA
- Alejandro Tarabillo
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA
- Ramnarayan Ramachandran
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA
2
Pannese A, Grandjean D, Frühholz S. Subcortical processing in auditory communication. Hear Res 2015; 328:67-77. [DOI: 10.1016/j.heares.2015.07.003]
3
Marmel F, Rodríguez-Mendoza MA, Lopez-Poveda EA. Stochastic undersampling steepens auditory threshold/duration functions: implications for understanding auditory deafferentation and aging. Front Aging Neurosci 2015; 7:63. [PMID: 26029098] [PMCID: PMC4432715] [DOI: 10.3389/fnagi.2015.00063]
Abstract
It has long been known that some listeners experience hearing difficulties out of proportion with their audiometric losses. Notably, some older adults as well as auditory neuropathy patients have temporal-processing and speech-in-noise intelligibility deficits that elevated audiometric thresholds cannot account for. The study of these hearing deficits has been revitalized by recent studies showing that auditory deafferentation comes with aging and can occur even in the absence of an audiometric loss. The present study builds on the stochastic undersampling principle proposed by Lopez-Poveda and Barrios (2013) to account for the perceptual effects of auditory deafferentation. Auditory threshold/duration functions were measured for broadband noises that were stochastically undersampled to various degrees. Stimuli with and without undersampling were equated for overall energy in order to isolate the changes that undersampling elicited in the stimulus waveforms from its effects on overall stimulus energy. Stochastic undersampling impaired the detection of short sounds (<20 ms). The detection of long sounds (>50 ms) did not change or improved, depending on the degree of undersampling. The results for short sounds show that stochastic undersampling, and hence presumably deafferentation, can account for the steeper threshold/duration functions observed in auditory neuropathy patients and older adults with (near) normal audiometry. This suggests that deafferentation might be diagnosed using pure-tone audiometry with short tones. It further suggests that the auditory system of audiometrically normal older listeners might not be “slower than normal”, as is commonly thought, but simply less well afferented. Finally, the results for both short and long sounds support the probabilistic theories of detectability that challenge the idea that detection at threshold occurs by integration of sound energy over time.
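The stochastic undersampling manipulation described above lends itself to a compact illustration. The sketch below follows only the abstract's description (random sample deletion followed by energy equation); the keep-probability, sampling rate, and stimulus duration are illustrative assumptions, not the study's actual parameters:

```python
import numpy as np

rng = np.random.default_rng(1)

def stochastic_undersample(x, p_keep):
    """Zero each waveform sample independently with probability 1 - p_keep,
    then rescale so overall energy matches the input, mirroring the energy
    equation described in the abstract."""
    y = x * (rng.random(x.size) < p_keep)
    energy_out = np.sum(y**2)
    if energy_out > 0:
        y = y * np.sqrt(np.sum(x**2) / energy_out)
    return y

fs = 16000
x = rng.standard_normal(int(0.02 * fs))        # a 20-ms broadband noise burst
y = stochastic_undersample(x, p_keep=0.3)
print(np.isclose(np.sum(x**2), np.sum(y**2)))  # energies are equated
```

Because energy is held constant, any detection differences between `x` and `y` in a model observer must come from the waveform disruption itself, which is the logic of the experiment.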
Affiliation(s)
- Frédéric Marmel
- Audición Computacional y Psicoacústica, Instituto de Neurociencias de Castilla y León, Universidad de Salamanca, Salamanca, Spain; Grupo de Audiología, Instituto de Investigación Biomédica de Salamanca, Universidad de Salamanca, Salamanca, Spain
- Medardo A Rodríguez-Mendoza
- Audición Computacional y Psicoacústica, Instituto de Neurociencias de Castilla y León, Universidad de Salamanca, Salamanca, Spain
- Enrique A Lopez-Poveda
- Audición Computacional y Psicoacústica, Instituto de Neurociencias de Castilla y León, Universidad de Salamanca, Salamanca, Spain; Grupo de Audiología, Instituto de Investigación Biomédica de Salamanca, Universidad de Salamanca, Salamanca, Spain; Facultad de Medicina, Departamento de Cirugía, Universidad de Salamanca, Salamanca, Spain
4
Lopez-Poveda EA. Why do I hear but not understand? Stochastic undersampling as a model of degraded neural encoding of speech. Front Neurosci 2014; 8:348. [PMID: 25400543] [PMCID: PMC4214224] [DOI: 10.3389/fnins.2014.00348]
Abstract
Hearing impairment is a serious condition with increasing prevalence. It is defined by elevated audiometric thresholds, but elevated thresholds are only partly responsible for the greater difficulty understanding speech in noisy environments experienced by some older listeners or by hearing-impaired listeners. Identifying the additional factors and mechanisms that impair intelligibility is fundamental to understanding hearing impairment, but these factors remain uncertain. Traditionally, these additional factors have been sought in the way the speech spectrum is encoded in the pattern of impaired mechanical cochlear responses. Recent studies, however, are steering the focus toward impaired encoding of the speech waveform in the auditory nerve. In our recent work, we presented evidence that a significant factor might be the loss of afferent auditory nerve fibers, a pathology that comes with aging or noise overexposure. Our approach was based on a signal-processing analogy whereby the auditory nerve may be regarded as a stochastic sampler of the sound waveform and deafferentation may be described in terms of waveform undersampling. We showed that stochastic undersampling simultaneously degrades the encoding of soft and rapid waveform features, and that this degrades speech intelligibility in noise more than in quiet without significant increases in audiometric thresholds. Here, we review our recent work in a broader context and argue that the stochastic undersampling analogy may be extended to study the perceptual consequences of various hearing pathologies and their treatment.
Affiliation(s)
- Enrique A. Lopez-Poveda
- Audición Computacional y Psicoacústica, Instituto de Neurociencias de Castilla y León, Universidad de Salamanca, Salamanca, Spain
- Grupo de Audiología, Instituto de Investigación Biomédica de Salamanca, Universidad de Salamanca, Salamanca, Spain
- Departamento de Cirugía, Facultad de Medicina, Universidad de Salamanca, Salamanca, Spain
5
Trevino A, Coleman TP, Allen J. A dynamical point process model of auditory nerve spiking in response to complex sounds. J Comput Neurosci 2009; 29:193-201. [PMID: 19353258] [DOI: 10.1007/s10827-009-0146-6]
Abstract
In this paper, we develop a dynamical point process model for how complex sounds are represented by neural spiking in auditory nerve fibers. Although many models have been proposed, our point process model is the first to capture elements of spontaneous rate, refractory effects, frequency selectivity, phase locking at low frequencies, and short-term adaptation, all within a compact parametric approach. Using a generalized linear model for the point process conditional intensity, driven by extrinsic covariates, previous spiking, and an input-dependent charging/discharging capacitor model, our approach robustly captures the aforementioned features on datasets taken at the auditory nerve of chinchilla in response to speech inputs. We confirm the goodness of fit of our approach using the Time-Rescaling Theorem for point processes.
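The full model above couples a GLM conditional intensity to a charging/discharging capacitor component and is fit to chinchilla data. As a much-reduced sketch of just the point-process GLM idea (baseline plus stimulus drive plus spike-history suppression), with every weight invented for illustration rather than taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_glm(stim, base=3.5, stim_w=1.0, hist_w=(-8.0, -4.0, -2.0), dt=1e-3):
    """Discrete-time point-process GLM: the log conditional intensity is a
    baseline plus a stimulus term plus a spike-history term. The negative
    history weights give refractory-like suppression after each spike."""
    spikes = np.zeros(stim.size, dtype=int)
    for t in range(stim.size):
        hist = sum(w * spikes[t - 1 - k]
                   for k, w in enumerate(hist_w) if t - 1 - k >= 0)
        lam = np.exp(base + stim_w * stim[t] + hist)    # intensity in spikes/s
        spikes[t] = rng.random() < min(lam * dt, 1.0)   # Bernoulli approximation
    return spikes

stim = np.sin(2 * np.pi * 5 * np.arange(2000) * 1e-3)   # slow sinusoidal drive
spk = simulate_glm(stim)
print(spk.sum())
```

In the paper the fitted intensity is evaluated with the Time-Rescaling Theorem: rescaling inter-spike intervals by the integrated intensity should yield unit-rate exponential intervals if the model is correct.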
Affiliation(s)
- Andrea Trevino
- Department of Electrical & Computer Engineering Neuroscience Program, University of Illinois, Urbana, USA.
- Todd P Coleman
- Department of Electrical & Computer Engineering Neuroscience Program, University of Illinois, Urbana, USA
- Jont Allen
- Department of Electrical & Computer Engineering Neuroscience Program, University of Illinois, Urbana, USA
6
The psychoacoustics of noise vocoded speech: A physiological means to a perceptual end. Hear Res 2008; 241:87-96. [PMID: 18556159] [DOI: 10.1016/j.heares.2008.05.002]
7
Woolley SMN, Gill PR, Theunissen FE. Stimulus-dependent auditory tuning results in synchronous population coding of vocalizations in the songbird midbrain. J Neurosci 2006; 26:2499-512. [PMID: 16510728] [PMCID: PMC6793651] [DOI: 10.1523/jneurosci.3731-05.2006]
Abstract
Physiological studies in vocal animals such as songbirds indicate that vocalizations drive auditory neurons particularly well. But the neural mechanisms whereby vocalizations are encoded differently from other sounds in the auditory system are unknown. We used spectrotemporal receptive fields (STRFs) to study the neural encoding of song versus the encoding of a generic sound, modulation-limited noise, by single neurons and the neuronal population in the zebra finch auditory midbrain. The noise was designed to match song in frequency, spectrotemporal modulation boundaries, and power. STRF calculations were balanced between the two stimulus types by forcing a common stimulus subspace. We found that 91% of midbrain neurons showed significant differences in spectral and temporal tuning properties when birds heard song and when birds heard modulation-limited noise. During the processing of noise, spectrotemporal tuning was highly variable across cells. During song processing, the tuning of individual cells became more similar; frequency tuning bandwidth increased, best temporal modulation frequency increased, and spike timing became more precise. The outcome was a population response to song that encoded rapidly changing sounds with power and precision, resulting in a faithful neural representation of the temporal pattern of a song. Modeling responses to song using the tuning to modulation-limited noise showed that the population response would not encode song as precisely or robustly. We conclude that stimulus-dependent changes in auditory tuning during song processing facilitate the high-fidelity encoding of the temporal pattern of a song.
Affiliation(s)
- Sarah M N Woolley
- Helen Wills Neuroscience Institute, Department of Psychology, University of California, Berkeley, California 94720, USA.
8
Loebach JL, Wickesberg RE. The representation of noise vocoded speech in the auditory nerve of the chinchilla: physiological correlates of the perception of spectrally reduced speech. Hear Res 2006; 213:130-44. [PMID: 16497455] [DOI: 10.1016/j.heares.2006.01.011]
Abstract
This study investigated the neural representation of naturally produced and noise vocoded speech signals in the auditory nerve of the chinchilla. The syllables [see text] produced by male speakers were used to synthesize noise vocoded speech stimuli containing one, two, three and four bands of envelope modulated noise. The ensemble response of the auditory nerve, computed by pooling the PST histograms across many auditory nerve fibers, revealed temporal patterns in the responses to the natural tokens that uniquely identified the stop consonants. The responses to the 3- and 4-band noise vocoded tokens contained temporal patterns that were nearly identical to those observed for the natural tokens, while the responses to the 1- and 2-band tokens were significantly different (p<0.0001). The ALSR, ALIR and autocorrelation of the pooled PST histograms represented the detail of the frequency spectrum for a naturally produced vowel, while the driven rate was unreliable. Each of these spectral analyses failed to reveal significant information about the noise vocoded vowels. These results suggest that temporal patterns in the responses of the auditory nerve can provide the cues necessary for the recognition of noise vocoded stop consonants.
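Noise vocoding, the stimulus manipulation used above, is straightforward to sketch. The following is a generic channel vocoder, not the study's actual synthesis procedure: the band edges, FFT-mask filtering, and 10-ms envelope smoothing are all illustrative assumptions.

```python
import numpy as np

def noise_vocode(x, fs, n_bands=4, lo=100.0, hi=6000.0, seed=3):
    """Split the signal into log-spaced bands, extract each band's envelope
    by rectification and smoothing, and use it to modulate bandpass noise
    in the same band; the sum over bands is the vocoded signal."""
    rng = np.random.default_rng(seed)
    freqs = np.fft.rfftfreq(x.size, 1.0 / fs)
    edges = np.geomspace(lo, hi, n_bands + 1)

    def bandpass(sig, f1, f2):
        spec = np.fft.rfft(sig)
        spec[(freqs < f1) | (freqs >= f2)] = 0.0
        return np.fft.irfft(spec, n=sig.size)

    n_win = int(0.01 * fs)                      # ~10-ms envelope smoother
    win = np.ones(n_win) / n_win
    out = np.zeros_like(x, dtype=float)
    for f1, f2 in zip(edges[:-1], edges[1:]):
        env = np.convolve(np.abs(bandpass(x, f1, f2)), win, mode="same")
        carrier = bandpass(rng.standard_normal(x.size), f1, f2)
        out += env * carrier
    return out

fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t) * np.exp(-3.0 * t)   # decaying-tone stand-in
y = noise_vocode(x, fs)
```

Lowering `n_bands` toward one discards progressively more spectral detail while preserving the temporal envelope, which is the continuum the study probes physiologically.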
Affiliation(s)
- Jeremy L Loebach
- Department of Psychology, University of Illinois at Urbana-Champaign, 603 East Daniel Street, Champaign, IL 61820, USA.
9
Abstract
This investigation compared the encoding of naturally produced whispered and normally voiced vowels by auditory nerve fibers. Speech syllables containing the vowels /ɔ/ and /æ/ were produced by two female speakers and presented at three intensities to ketamine-anesthetized chinchillas. Six different representations of the spectral components of the vowels in the responses of the auditory nerve fibers were evaluated. For both normal and whispered vowels over a 30 dB range, the formant peaks in the vowel were best displayed using rate-place representations. The spectral detail in the vowel was revealed by average localized synchronized rates (ALSR) and autocorrelations of individual peristimulus time histograms. The average localized interval rates (ALIR), autocorrelations of ensemble responses, and autocorrelations of individual spike trains provided poor representations of vowel spectra, although the frequency components of normally voiced vowels were represented better than those of whispered vowels. These analyses suggest that rate-based and synchronization-based measures yield two very different kinds of information, but only a normalized rate-based measure consistently identified the formants of both the whispered and normally voiced vowels.
Affiliation(s)
- Hanna E Stevens
- Neuroscience Program, University of Illinois, Urbana-Champaign, Urbana, IL 61801, USA.
11
Clarey JC, Paolini AG, Grayden DB, Burkitt AN, Clark GM. Ventral cochlear nucleus coding of voice onset time in naturally spoken syllables. Hear Res 2004; 190:37-59. [PMID: 15051129] [DOI: 10.1016/s0378-5955(04)00017-6]
Abstract
These experiments examined the coding of the voice onset time (VOT) of six naturally spoken syllables, presented at a number of intensities, by ventral cochlear nucleus (VCN) neurons in rats anesthetized with urethane. VOT is one of the cues for the identification of a stop consonant, and is defined by the interval between stop release and the first glottal pulse that marks the onset of voicing associated with a vowel. The syllables presented (/bot/, /dot/, /got/, /pot/, /tot/, /kot/) each had a different VOT, ranging between 10 and 108 ms. Extracellular recordings were made from single neurons (N=202) with a wide range of best frequencies (BFs; 0.66-10 kHz) that represented the major VCN response types - primary-like (67.8% of sample), chopper (19.8%), and onset (12.4%) neurons. The different VOTs of the syllables were accurately reflected in sharp, precisely timed, and statistically significant changes in average discharge rate in all cell types, as well as the entire VCN sample. The prominence of the response to stop release and voice onset, and the level of activity prior to the VOT, were influenced by syllable intensity and the spectrum of stop release, as well as cell BF and type. Our results suggest that the responses of VCN cells with BFs above the first formant frequency are dominated by their sensitivity to the onsets of broadband events in speech, which allows them to convey accurate information about a syllable's VOT.
Affiliation(s)
- Janine C Clarey
- The Bionic Ear Institute, 384-388 Albert St., East Melbourne, Vic. 3002, Australia.
12
Abstract
Recent physiological results from the auditory nerve suggest that specific response patterns for word-initial /d/ and /t/ are present across acoustic variations. In this study, single cell recordings from the auditory nerve of anesthetized chinchillas in response to the stop consonants /d/ and /t/ presented in a variety of acoustic contexts were analyzed. Consonants had variable word positions, vowel contexts, types of phonation, and speakers. The response patterns from individual auditory nerve fibers did not reliably differentiate the consonants /d/ and /t/. Global average peristimulus time histograms (GAPSTs) contained invariant patterns for all tokens of each word-final consonant, regardless of context. Ensemble responses to word-final consonants had similarities in their temporal patterns to those in GAPSTs for word-initial consonants. The similar representations in the ensemble auditory nerve response for consonants with different acoustic content suggest a possible substrate for perceptual normalization. Both invariant and variable elements of speech can be computed from the ensemble response of the auditory nerve.
Affiliation(s)
- Hanna E Stevens
- 538 Psychology, Neuroscience Program, University of Illinois at Urbana-Champaign, 603 E. Daniel St., 61820, USA.
13
Ascending Pathways Through Ventral Nuclei of the Lateral Lemniscus and Their Possible Role in Pattern Recognition in Natural Sounds. 2002. [DOI: 10.1007/978-1-4757-3654-0_6]
14
Role of intrinsic conductances underlying responses to transients in octopus cells of the cochlear nucleus. J Neurosci 1999. [PMID: 10191307] [DOI: 10.1523/jneurosci.19-08-02897.1999]
Abstract
Recognition of acoustic patterns in natural sounds depends on the transmission of temporal information. Octopus cells of the mammalian ventral cochlear nucleus form a pathway that encodes the timing of firing of groups of auditory nerve fibers with exceptional precision. Whole-cell patch recordings from octopus cells were used to examine how the brevity and precision of firing are shaped by intrinsic conductances. Octopus cells responded to steps of current with small, rapid voltage changes. Input resistances and membrane time constants averaged 2.4 MΩ and 210 µs, respectively (n = 15). As a result of the low input resistances of octopus cells, action potential initiation required currents of at least 2 nA and never occurred repetitively. Backpropagated action potentials recorded at the soma were small (10-30 mV), brief (0.24-0.54 ms), and tetrodotoxin-sensitive. The low input resistance arose in part from an inwardly rectifying mixed cationic conductance blocked by cesium and from potassium conductances blocked by 4-aminopyridine (4-AP). Conductances blocked by 4-AP also contributed to the repolarization of the action potentials and suppressed the generation of calcium spikes. In the face of the high membrane conductance of octopus cells, sodium and calcium conductances amplified depolarizations produced by intracellular current injection over a time course similar to that of EPSPs. We suggest that this transient amplification works in concert with the shunting influence of potassium and mixed cationic conductances to enhance the encoding of the onset of synchronous auditory nerve fiber activity.
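The passive properties reported above imply a membrane capacitance of roughly tau/R ≈ 87 pF, and a passive RC integration sketch reproduces the small, fast voltage responses the abstract describes. The step timing and Euler integration step below are arbitrary choices for illustration:

```python
import numpy as np

# Passive membrane sketch using the reported octopus-cell values:
# R = 2.4 MΩ, tau = 210 µs, so C = tau / R ≈ 87.5 pF.
R = 2.4e6            # input resistance (ohm)
tau = 210e-6         # membrane time constant (s)

dt = 1e-6
t = np.arange(0.0, 2e-3, dt)
I = np.where(t >= 0.5e-3, 2e-9, 0.0)      # 2-nA step, the reported rheobase scale

V = np.zeros_like(t)                       # voltage relative to rest
for i in range(1, t.size):
    # forward-Euler step of dV/dt = (I * R - V) / tau
    V[i] = V[i - 1] + dt * (I[i - 1] * R - V[i - 1]) / tau

print(round(V[-1] * 1e3, 2))               # settles near I * R = 4.8 mV
```

The ~5-mV steady-state response to a 2-nA step shows why such large currents are needed to reach spike threshold: the low input resistance shunts slow inputs, leaving the cell responsive mainly to fast, synchronous drive.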