1
Serrao DS, Theruvan N, Fathima H, Pitchaimuthu AN. Contribution of Temporal Fine Structure Cues to Concurrent Vowel Identification and Perception of Zebra Speech. Int Arch Otorhinolaryngol 2024; 28:e492-e501. PMID: 38974629; PMCID: PMC11226255; DOI: 10.1055/s-0044-1785456.
Abstract
Introduction: Limited access to temporal fine structure (TFS) cues is one reason for reduced speech-in-noise recognition in cochlear implant (CI) users. CI signal-processing schemes such as electroacoustic stimulation (EAS) and fine structure processing (FSP) encode TFS in the low frequencies, whereas theoretical strategies such as the frequency amplitude modulation encoder (FAME) encode TFS in all bands. Objective: The present study compared the effect of simulated CI signal-processing schemes that encode no TFS, TFS in all bands, or TFS only in low-frequency bands on concurrent vowel identification (CVI) and Zebra speech perception (ZSP). Methods: TFS information was systematically manipulated using a 30-band sine-wave vocoder (SV). The TFS was either absent (SV), present in all bands as frequency modulations simulating the FAME algorithm, or present only in bands below 525 Hz to simulate EAS. Concurrent vowel identification and ZSP were measured under each condition in 15 adults with normal hearing. Results: The CVI scores did not differ between the three schemes (F(2, 28) = 0.62, p = 0.55, ηp² = 0.04). An effect of encoding TFS was observed for ZSP (F(2, 28) = 5.73, p = 0.008, ηp² = 0.29). Perception of Zebra speech was significantly better with EAS and FAME than with SV; there was no significant difference between ZSP scores obtained with EAS and FAME (p = 1.00). Conclusion: For ZSP, the TFS cues from FAME and EAS resulted in equivalent improvements in performance compared to the SV scheme. The presence or absence of TFS did not affect the CVI scores.
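The sine-wave vocoder manipulation described above can be sketched in a few lines. This is an illustrative approximation only: the band edges, filter order, and Butterworth analysis filters are assumptions, and the study used 30 bands (with FM reinstated on the carriers in the FAME/EAS conditions).

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def sine_vocode(x, fs, edges):
    """Tone-excited (SV) vocoder: per analysis band, keep the temporal
    envelope and replace the TFS with a fixed tone at the band centre."""
    t = np.arange(len(x)) / fs
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)
        env = np.abs(hilbert(band))              # envelope only, TFS discarded
        fc = np.sqrt(lo * hi)                    # geometric band centre
        out += env * np.sin(2 * np.pi * fc * t)  # sine carrier
    return out

fs = 16000
t = np.arange(0, 0.3, 1 / fs)
x = np.sin(2 * np.pi * 440 * t)                     # test tone
y = sine_vocode(x, fs, np.geomspace(100, 6000, 7))  # 6 bands (study: 30)
```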
Affiliation(s)
- Hasna Fathima
- Department of Audiology and Speech-Language Pathology, Kasturba Medical College, Mangalore, Manipal Academy of Higher Education, Manipal, Karnataka, India
- Department of Audiology and Speech Language Pathology, National Institute of Speech and Hearing, Trivandrum, Kerala, India
- Arivudai Nambi Pitchaimuthu
- Department of Audiology and Speech-Language Pathology, Kasturba Medical College, Mangalore, Manipal Academy of Higher Education, Manipal, Karnataka, India
- Department of Audiology, Centre for Hearing Science, All India Institute of Speech & Hearing, Mysuru, India
2
Çolak M, Bayramoğlu İ, Tutar H, Altınyay Ş. Benefits of using a contralateral hearing aid in cochlear implanted children with bilateral pre-lingual profound sensorineural hearing loss on language development and auditory perception performance. ENT Updates 2019. DOI: 10.32448/entupdates.601175.
3
King A, Varnet L, Lorenzi C. Accounting for masking of frequency modulation by amplitude modulation with the modulation filter-bank concept. J Acoust Soc Am 2019; 145:2277. PMID: 31046322; DOI: 10.1121/1.5094344.
Abstract
Frequency modulation (FM) is assumed to be detected through amplitude modulation (AM) created by cochlear filtering for modulation rates above 10 Hz and carrier frequencies (fc) above 4 kHz. If this is the case, a model of modulation perception based on the concept of AM filters should predict masking effects between AM and FM. To test this, masking effects of sinusoidal AM on sinusoidal FM detection thresholds were assessed in normal-hearing listeners as a function of FM rate, fc, duration, AM rate, AM depth, and phase difference between FM and AM. The data were compared to predictions of a computational model implementing an AM filter-bank. Consistent with model predictions, AM masked FM with some AM-masking-AM features (broad tuning and effect of AM-masker depth). Similar masking was predicted and observed at fc = 0.5 and 5 kHz for a 2 Hz AM masker, inconsistent with the notion that additional (e.g., temporal fine-structure) cues drive slow-rate FM detection at low fc. However, masking was lower than predicted and, unlike model predictions, did not show beating or phase effects. Broadly, the modulation filter-bank concept successfully explained some AM-masking-FM effects, but could not give a complete account of both AM and FM detection.
Affiliation(s)
- Andrew King
- Laboratoire des systèmes perceptifs, UMR CNRS 8248, Département d'Etudes Cognitives, École normale supérieure, Université Paris Sciences & Lettres, 29 rue d'Ulm, 75005 Paris, France
- Léo Varnet
- Laboratoire des systèmes perceptifs, UMR CNRS 8248, Département d'Etudes Cognitives, École normale supérieure, Université Paris Sciences & Lettres, 29 rue d'Ulm, 75005 Paris, France
- Christian Lorenzi
- Laboratoire des systèmes perceptifs, UMR CNRS 8248, Département d'Etudes Cognitives, École normale supérieure, Université Paris Sciences & Lettres, 29 rue d'Ulm, 75005 Paris, France
4
Apoux F, Youngdahl CL, Yoho SE, Healy EW. Dual-carrier processing to convey temporal fine structure cues: Implications for cochlear implants. J Acoust Soc Am 2015; 138:1469-80. PMID: 26428784; PMCID: PMC4575322; DOI: 10.1121/1.4928136.
Abstract
Speech intelligibility in noise can be degraded by using vocoder processing to alter the temporal fine structure (TFS). Here it is argued that this degradation is not attributable to the loss of speech information potentially present in the TFS. Instead it is proposed that the degradation results from the loss of sound-source segregation information when two or more carriers (i.e., TFS) are substituted with only one as a consequence of vocoder processing. To demonstrate this segregation role, vocoder processing involving two carriers, one for the target and one for the background, was implemented. Because this approach does not preserve the speech TFS, it may be assumed that any improvement in intelligibility can only be a consequence of the preserved carrier duality and associated segregation cues. Three experiments were conducted using this "dual-carrier" approach. All experiments showed substantial sentence intelligibility in noise improvements compared to traditional single-carrier conditions. In several conditions, the improvement was so substantial that intelligibility approximated that for unprocessed speech in noise. A foreseeable and potentially promising implication for the dual-carrier approach involves implementation into cochlear implant speech processors, where it may provide the TFS cues necessary to segregate speech from noise.
Affiliation(s)
- Frédéric Apoux
- Speech Psychoacoustics Laboratory, Department of Speech and Hearing Science, The Ohio State University, Columbus, Ohio 43210, USA
- Carla L Youngdahl
- Speech Psychoacoustics Laboratory, Department of Speech and Hearing Science, The Ohio State University, Columbus, Ohio 43210, USA
- Sarah E Yoho
- Speech Psychoacoustics Laboratory, Department of Speech and Hearing Science, The Ohio State University, Columbus, Ohio 43210, USA
- Eric W Healy
- Speech Psychoacoustics Laboratory, Department of Speech and Hearing Science, The Ohio State University, Columbus, Ohio 43210, USA
5
Abstract
Frequency modulated (FM) sweeps are common in species-specific vocalizations, including human speech. Auditory neurons selective for the direction and rate of frequency change in FM sweeps are present across species, but the synaptic mechanisms underlying such selectivity are only beginning to be understood. Even less is known about mechanisms of experience-dependent changes in FM sweep selectivity. We present three network models of synaptic mechanisms of FM sweep direction and rate selectivity that explain experimental data: (1) The 'facilitation' model contains frequency-selective cells operating as coincidence detectors, summing multiple excitatory inputs with different time delays. (2) The 'duration tuned' model depends on interactions between delayed excitation and early inhibition. The strength of delayed excitation determines the preferred duration. Inhibitory rebound can reinforce the delayed excitation. (3) The 'inhibitory sideband' model uses frequency-selective inputs to a network of excitatory and inhibitory cells. The strength and asymmetry of these connections result in neurons responsive to sweeps in a single direction at sufficient sweep rates. Variations of these properties can explain the diversity of rate-dependent direction selectivity seen across species. We show that the inhibitory sideband model can be trained using spike-timing-dependent plasticity (STDP) to develop direction selectivity from a non-selective network. These models provide a means to compare the proposed synaptic and spectrotemporal mechanisms of FM sweep processing and can be utilized to explore cellular mechanisms underlying experience- or training-dependent changes in spectrotemporal processing across animal models. Given the analogy between FM sweeps and visual motion, these models can serve a broader function in studying stimulus movement across sensory epithelia.
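The coincidence-detection idea behind the 'facilitation' model can be reduced to a toy sketch. The two-channel reduction and all parameter values below are illustrative assumptions, not the paper's network implementation:

```python
def fm_direction_cell(direction, sweep_dt=5.0, axon_delay=5.0, window=2.0):
    """Toy facilitation model of FM sweep selectivity. An upward sweep
    reaches channel A (low frequency) sweep_dt ms before channel B (high
    frequency); A projects through a slower pathway adding axon_delay ms.
    The cell fires only if both inputs coincide within `window` ms."""
    t_a = 0.0 if direction == "up" else sweep_dt   # sweep hits channel A
    t_b = sweep_dt if direction == "up" else 0.0   # sweep hits channel B
    return abs((t_a + axon_delay) - t_b) <= window

print(fm_direction_cell("up"))                # True: delays match, cell fires
print(fm_direction_cell("down"))              # False: wrong direction
print(fm_direction_cell("up", sweep_dt=1.0))  # False: sweep too fast
```

The last call shows how the same delay-matching mechanism yields rate tuning as well as direction selectivity.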
6
Malone BJ, Scott BH, Semple MN. Encoding frequency contrast in primate auditory cortex. J Neurophysiol 2014; 111:2244-63. PMID: 24598525; DOI: 10.1152/jn.00878.2013.
Abstract
Changes in amplitude and frequency jointly determine much of the communicative significance of complex acoustic signals, including human speech. We have previously described responses of neurons in the core auditory cortex of awake rhesus macaques to sinusoidal amplitude modulation (SAM) signals. Here we report a complementary study of sinusoidal frequency modulation (SFM) in the same neurons. Responses to SFM were analogous to SAM responses in that changes in multiple parameters defining SFM stimuli (e.g., modulation frequency, modulation depth, carrier frequency) were robustly encoded in the temporal dynamics of the spike trains. For example, changes in the carrier frequency produced highly reproducible changes in shapes of the modulation period histogram, consistent with the notion that the instantaneous probability of discharge mirrors the moment-by-moment spectrum at low modulation rates. The upper limit for phase locking was similar across SAM and SFM within neurons, suggesting shared biophysical constraints on temporal processing. Using spike train classification methods, we found that neural thresholds for modulation depth discrimination are typically far lower than would be predicted from frequency tuning to static tones. This "dynamic hyperacuity" suggests a substantial central enhancement of the neural representation of frequency changes relative to the auditory periphery. Spike timing information was superior to average rate information when discriminating among SFM signals, and even when discriminating among static tones varying in frequency. This finding held even when differences in total spike count across stimuli were normalized, indicating both the primacy and generality of temporal response dynamics in cortical auditory processing.
Affiliation(s)
- Brian J Malone
- Department of Otolaryngology-Head and Neck Surgery, University of California, San Francisco, California
- Brian H Scott
- Laboratory of Neuropsychology, National Institute of Mental Health/National Institutes of Health, Bethesda, Maryland
- Malcolm N Semple
- Center for Neural Science, New York University, New York, New York
7
Won JH, Shim HJ, Lorenzi C, Rubinstein JT. Use of amplitude modulation cues recovered from frequency modulation for cochlear implant users when original speech cues are severely degraded. J Assoc Res Otolaryngol 2014; 15:423-39. PMID: 24532186; DOI: 10.1007/s10162-014-0444-1.
Abstract
Won et al. (J Acoust Soc Am 132:1113-1119, 2012) reported that cochlear implant (CI) speech processors generate amplitude-modulation (AM) cues recovered from broadband speech frequency modulation (FM) and that CI users can use these cues for speech identification in quiet. The present study was designed to extend this finding for a wide range of listening conditions, where the original speech cues were severely degraded by manipulating either the acoustic signals or the speech processor. The manipulation of the acoustic signals included the presentation of background noise, simulation of reverberation, and amplitude compression. The manipulation of the speech processor included changing the input dynamic range and the number of channels. For each of these conditions, multiple levels of speech degradation were tested. Speech identification was measured for CI users and compared for stimuli having both AM and FM information (intact condition) or FM information only (FM condition). Each manipulation degraded speech identification performance for both intact and FM conditions. Performance for the intact and FM conditions became similar for stimuli having the most severe degradations. Identification performance generally overlapped for the intact and FM conditions. Moreover, identification performance for the FM condition was better than chance performance even at the maximum level of distortion. Finally, significant correlations were found between speech identification scores for the intact and FM conditions. Altogether, these results suggest that despite poor frequency selectivity, CI users can make efficient use of AM cues recovered from speech FM in difficult listening situations.
Affiliation(s)
- Jong Ho Won
- Department of Audiology and Speech Pathology, University of Tennessee Health Science Center, Knoxville, TN, 37996, USA
8
Trujillo M, Razak KA. Altered cortical spectrotemporal processing with age-related hearing loss. J Neurophysiol 2013; 110:2873-86. DOI: 10.1152/jn.00423.2013.
Abstract
Presbycusis (age-related hearing loss) is a prevalent disability associated with aging that impairs spectrotemporal processing, but the mechanisms of such changes remain unclear. The goal of this study was to quantify cortical responses to frequency-modulated (FM) sweeps in a mouse model of presbycusis. Previous studies showed that cortical neurons in young mice are selective for the rate of frequency change in FM sweeps. Here single-unit data on cortical selectivity and response variability to FM sweeps of either direction and different rates (0.08–20 kHz/ms) were compared across young (1–3 mo), middle-aged (6–8 mo), and old (14–20 mo) groups. Three main findings are reported. First, there is a reduction in FM rate selectivity in the old group. Second, there is a slowing of the sweep rates at which neurons likely provide best detection and discrimination of sweep rates. Third, there is an increase in trial-to-trial variability in the magnitude and timing of spikes in response to sweeps. These changes were only observed in neurons that were selective for the fast or intermediate range of sweep rates and not in neurons that preferred slow sweeps or were nonselective. Increased variability of response magnitude, but not changes in temporal fidelity or selectivity, was seen even in the middle-aged group. The results show that spectrotemporal processing becomes slow and noisy with presbycusis in specific types of neurons, suggesting which receptive-field mechanisms are altered. These data suggest neural correlates of presbycusis-related reduction in the ability of humans to process rapid spectrotemporal changes.
Affiliation(s)
- Michael Trujillo
- Graduate Neuroscience Program and Department of Psychology, University of California, Riverside, California
- Khaleel A. Razak
- Graduate Neuroscience Program and Department of Psychology, University of California, Riverside, California
9
Apoux F, Yoho SE, Youngdahl CL, Healy EW. Role and relative contribution of temporal envelope and fine structure cues in sentence recognition by normal-hearing listeners. J Acoust Soc Am 2013; 134:2205-12. PMID: 23967950; PMCID: PMC3765279; DOI: 10.1121/1.4816413.
Abstract
The present study investigated the role and relative contribution of envelope and temporal fine structure (TFS) to sentence recognition in noise. Target and masker stimuli were added at five different signal-to-noise ratios (SNRs) and filtered into 30 contiguous frequency bands. The envelope and TFS were extracted from each band by Hilbert decomposition. The final stimuli consisted of the envelope of the target/masker sound mixture at x dB SNR and the TFS of the same sound mixture at y dB SNR. A first experiment showed a very limited contribution of TFS cues, indicating that sentence recognition in noise relies almost exclusively on temporal envelope cues. A second experiment showed that replacing the carrier of a sound mixture with noise (vocoder processing) cannot be considered equivalent to disrupting the TFS of the target signal by adding a background noise. Accordingly, a re-evaluation of the vocoder approach as a model to further understand the role of TFS cues in noisy situations may be necessary. Overall, these data are consistent with the view that speech information is primarily extracted from the envelope while TFS cues are primarily used to detect glimpses of the target.
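The Hilbert decomposition described above can be illustrated for a single band. This is a minimal sketch; the study applied it to 30 bands of a target/masker mixture at different SNRs:

```python
import numpy as np
from scipy.signal import hilbert

def envelope_tfs(band):
    """Split a band-limited signal into temporal envelope and TFS
    using the analytic signal from the Hilbert transform."""
    analytic = hilbert(band)
    env = np.abs(analytic)            # temporal envelope
    tfs = np.cos(np.angle(analytic))  # unit-amplitude fine structure
    return env, tfs

fs = 16000
t = np.arange(0, 0.5, 1 / fs)
# 1 kHz carrier with a 10 Hz amplitude modulation
x = (1 + 0.8 * np.sin(2 * np.pi * 10 * t)) * np.sin(2 * np.pi * 1000 * t)
env, tfs = envelope_tfs(x)
# env * tfs = Re{analytic signal}, i.e. the original band is recovered
print(np.allclose(env * tfs, x))  # True
```

Recombining the envelope of one mixture with the TFS of another, as in the study, amounts to multiplying `env` from one decomposition by `tfs` from the other.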
Affiliation(s)
- Frédéric Apoux
- Speech Psychoacoustics Laboratory, Department of Speech and Hearing Science, The Ohio State University, Columbus, Ohio 43210, USA.
10
Won JH, Lorenzi C, Nie K, Li X, Jameyson EM, Drennan WR, Rubinstein JT. The ability of cochlear implant users to use temporal envelope cues recovered from speech frequency modulation. J Acoust Soc Am 2012; 132:1113-1119. PMID: 22894230; PMCID: PMC3427369; DOI: 10.1121/1.4726013.
Abstract
Previous studies have demonstrated that normal-hearing listeners can understand speech using the recovered "temporal envelopes," i.e., amplitude modulation (AM) cues from frequency modulation (FM). This study evaluated this mechanism in cochlear implant (CI) users for consonant identification. Stimuli containing only FM cues were created using 1, 2, 4, and 8-band FM-vocoders to determine if consonant identification performance would improve as the recovered AM cues become more available. A consistent improvement was observed as the band number decreased from 8 to 1, supporting the hypothesis that (1) the CI sound processor generates recovered AM cues from broadband FM, and (2) CI users can use the recovered AM cues to recognize speech. The correlation between the intact and the recovered AM components at the output of the sound processor was also generally higher when the band number was low, supporting the consonant identification results. Moreover, CI subjects who were better at using recovered AM cues from broadband FM cues showed better identification performance with intact (unprocessed) speech stimuli. This suggests that speech perception performance variability in CI users may be partly caused by differences in their ability to use AM cues recovered from FM speech cues.
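The recovery of AM cues from FM arises because a narrow analysis filter converts frequency excursions into output-level fluctuations. A toy demonstration follows; the Butterworth band used here is an arbitrary assumption, not a CI processor channel:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

fs = 16000
t = np.arange(0, 0.5, 1 / fs)

# constant-amplitude FM tone: 1 kHz carrier, +/-300 Hz excursion at 4 Hz
phase = 2 * np.pi * 1000 * t + (300 / 4) * np.sin(2 * np.pi * 4 * t)
fm = np.sin(phase)  # no AM at the input by construction

# narrow band below the carrier: gain rises and falls as the
# instantaneous frequency sweeps through the band edge -> recovered AM
sos = butter(4, [700, 950], btype="bandpass", fs=fs, output="sos")
env = np.abs(hilbert(sosfiltfilt(sos, fm)))

# coefficient of variation of the output envelope: large, i.e. strong
# recovered AM, even though the input envelope was flat
cv = env.std() / env.mean()
```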
Affiliation(s)
- Jong Ho Won
- Department of Audiology and Speech Pathology, University of Tennessee Health Science Center, Knoxville, Tennessee 37996, USA.
11
Helms Tillery K, Brown CA, Bacon SP. Comparing the effects of reverberation and of noise on speech recognition in simulated electric-acoustic listening. J Acoust Soc Am 2012; 131:416-423. PMID: 22280603; PMCID: PMC3283901; DOI: 10.1121/1.3664101.
Abstract
Cochlear implant users report difficulty understanding speech in both noisy and reverberant environments. Electric-acoustic stimulation (EAS) is known to improve speech intelligibility in noise. However, little is known about the potential benefits of EAS in reverberation, or about how such benefits relate to those observed in noise. The present study used EAS simulations to examine these questions. Sentences were convolved with impulse responses from a model of a room whose estimated reverberation times were varied from 0 to 1 sec. These reverberated stimuli were then vocoded to simulate electric stimulation, or presented as a combination of vocoder plus low-pass filtered speech to simulate EAS. Monaural sentence recognition scores were measured in two conditions: reverberated speech and speech in a reverberated noise. The long-term spectrum and amplitude modulations of the noise were equated to the reverberant energy, allowing a comparison of the effects of the interferer (speech vs noise). Results indicate that, at least in simulation, (1) EAS provides significant benefit in reverberation; (2) the benefits of EAS in reverberation may be underestimated by those in a comparable noise; and (3) the EAS benefit in reverberation likely arises from partially preserved cues in this background accessible via the low-frequency acoustic component.
Affiliation(s)
- Kate Helms Tillery
- Psychoacoustics Laboratory, Department of Speech and Hearing Science, Arizona State University, PO Box 870102, Tempe, Arizona 85287-0102, USA
12
Apoux F, Healy EW. Relative contribution of target and masker temporal fine structure to the unmasking of consonants in noise. J Acoust Soc Am 2011; 130:4044-4052. PMID: 22225058; PMCID: PMC3253603; DOI: 10.1121/1.3652888.
Abstract
The present study assessed the relative contribution of the "target" and "masker" temporal fine structure (TFS) when identifying consonants. Accordingly, the TFS of the target and that of the masker were manipulated simultaneously or independently. A 30-band vocoder was used to replace the original TFS of the stimuli with tones. Four masker types were used. They included a speech-shaped noise, a speech-shaped noise modulated by a speech envelope, a sentence, or a sentence played backward. When the TFS of the target and that of the masker were disrupted simultaneously, consonant recognition dropped significantly compared to the unprocessed condition for all masker types, except the speech-shaped noise. Disruption of only the target TFS led to a significant drop in performance with all masker types. In contrast, disruption of only the masker TFS had no effect on recognition. Overall, the present data are consistent with previous work showing that TFS information plays a significant role in speech recognition in noise, especially when the noise fluctuates over time. However, the present study indicates that listeners rely primarily on TFS information in the target and that the nature of the masker TFS has a very limited influence on the outcome of the unmasking process.
Affiliation(s)
- Frédéric Apoux
- Department of Speech and Hearing Science, The Ohio State University, Columbus, Ohio 43210, USA.
13
Banai K, Sabin AT, Wright BA. Separable developmental trajectories for the abilities to detect auditory amplitude and frequency modulation. Hear Res 2011; 280:219-27. PMID: 21664958; DOI: 10.1016/j.heares.2011.05.019.
Abstract
Amplitude modulation (AM) and frequency modulation (FM) are inherent components of most natural sounds. The ability to detect these modulations, considered critical for normal auditory and speech perception, improves over the course of development. However, the extent to which the development of AM and FM detection skills follow different trajectories, and therefore can be attributed to the maturation of separate processes, remains unclear. Here we explored the relationship between the developmental trajectories for the detection of sinusoidal AM and FM in a cross-sectional design employing children aged 8-10 and 11-12 years and adults. For FM of tonal carriers, both average performance (mean) and performance consistency (within-listener standard deviation) were adult-like in the 8-10 y/o. In contrast, in the same listeners, average performance for AM of wideband noise carriers was still not adult-like in the 11-12 y/o, though performance consistency was already mature in the 8-10 y/o. Among the children there were no significant correlations for either measure between the degrees of maturity for AM and FM detection. These differences in developmental trajectory between the two modulation cues and between average detection thresholds and performance consistency suggest that at least partially distinct processes may underlie the development of AM and FM detection as well as the abilities to detect modulation and to do so consistently.
Affiliation(s)
- Karen Banai
- Department of Communication Sciences and Disorders, University of Haifa, Haifa 31905, Israel.
14
Apoux F, Healy EW. Relative contribution of off- and on-frequency spectral components of background noise to the masking of unprocessed and vocoded speech. J Acoust Soc Am 2010; 128:2075-84. PMID: 20968378; PMCID: PMC2981119; DOI: 10.1121/1.3478845.
Abstract
The present study examined the relative influence of the off- and on-frequency spectral components of modulated and unmodulated maskers on consonant recognition. Stimuli were divided into 30 contiguous equivalent rectangular bandwidths. The temporal fine structure (TFS) in each "target" band was either left intact or replaced with tones using vocoder processing. Recognition scores for 10, 15 and 20 target bands randomly located in frequency were obtained in quiet and in the presence of all 30 masker bands, only the off-frequency masker bands, or only the on-frequency masker bands. The amount of masking produced by the on-frequency bands was generally comparable to that produced by the broadband masker. However, the difference between these two conditions was often significant, indicating an influence of the off-frequency masker bands, likely through modulation interference or spectral restoration. Although vocoder processing systematically led to poorer consonant recognition scores, the deficit observed in noise could often be attributed to that observed in quiet. These data indicate that (i) speech recognition is affected by the off-frequency components of the background and (ii) the nature of the target TFS does not systematically affect speech recognition in noise, especially when energetic masking and/or the number of target bands is limited.
Affiliation(s)
- Frédéric Apoux
- Department of Speech and Hearing Science, The Ohio State University, Columbus, Ohio 43210, USA.
15
Drgas S, Blaszak MA. Perception of speech in reverberant conditions using AM-FM cochlear implant simulation. Hear Res 2010; 269:162-8. PMID: 20603206; DOI: 10.1016/j.heares.2010.06.016.
Abstract
This study assessed the effects of speech misidentification and cognitive processing errors in normal-hearing adults listening to degraded auditory input signals simulating cochlear implants in reverberation conditions. Three variables were controlled: number of vocoder channels (six and twelve), instantaneous frequency change rate (none, 50, 400 Hz), and enclosures (different reverberation conditions). The analyses were made on the basis of: (a) nonsense word recognition scores for eight young normal-hearing listeners, (b) 'ease of listening' based on the time of response, and (c) the subjective measure of difficulty. The maximum score of speech intelligibility in cochlear implant simulation was 70% for non-reverberant conditions with a 12-channel vocoder and changes of instantaneous frequency limited to 400 Hz. In the presence of reflections, word misidentification was about 10-20 percentage points higher. There was little difference between the 50 and 400 Hz frequency modulation cut-off for the 12-channel vocoder; however, in the case of six channels this difference was more significant. The results of the experiment suggest that information other than F0 that is carried by FM can be sufficient to improve speech intelligibility in real-world conditions.
Affiliation(s)
- Szymon Drgas
- Adam Mickiewicz University, Institute of Acoustics, Poznan, Umultowska 85, Poland.
16
Carlyon RP, Deeks JM, McKay CM. The upper limit of temporal pitch for cochlear-implant listeners: stimulus duration, conditioner pulses, and the number of electrodes stimulated. J Acoust Soc Am 2010; 127:1469-78. PMID: 20329847; DOI: 10.1121/1.3291981.
Abstract
Three experiments studied discrimination of changes in the rate of electrical pulse trains by cochlear-implant (CI) users and investigated the effect of manipulations that would be expected to substantially affect the pattern of auditory nerve (AN) activity. Experiment 1 used single-electrode stimulation and tested discrimination at baseline rates between 100 and 500 pps. Performance was generally similar for stimulus durations of 200 and 800 ms, and, for the longer duration, for stimuli that were gated on abruptly or with 300-ms ramps. Experiment 2 used a similar procedure and found that no substantial benefit was obtained by the addition of background 5000-pps "conditioning" pulses. Experiment 3 used a pitch-ranking procedure and found that the range of rates over which pitch increased with increasing rate was not greater for multiple-electrode than for single-electrode stimulation. The results indicate that the limitation on pulse-rate discrimination by CI users, at high baseline rates, is not specific to a particular temporal pattern of the AN response.
Affiliation(s)
- Robert P Carlyon
- MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge CB2 7EF, United Kingdom
17
Ardoint M, Lorenzi C. Effects of lowpass and highpass filtering on the intelligibility of speech based on temporal fine structure or envelope cues. Hear Res 2009; 260:89-95. [PMID: 19963053] [DOI: 10.1016/j.heares.2009.12.002]
Abstract
This study aimed to assess whether or not temporal envelope (E) and fine structure (TFS) cues in speech convey distinct phonetic information. Syllables uttered by a male and female speaker were (i) processed to retain either E or TFS within 16 frequency bands, (ii) lowpass or highpass filtered at different cut-off frequencies, and (iii) presented for identification to seven listeners. Psychometric functions were fitted using a sigmoid function, and used to determine crossover frequencies (cut-off frequencies at which lowpass and highpass filtering yielded equivalent performance), and gradients at each point of the psychometric functions (change in performance with respect to cut-off frequency). Crossover frequencies and gradients were not significantly different across speakers. Crossover frequencies were not significantly different between E and TFS speech (approximately 1.5 kHz). Gradients were significantly different between E and TFS speech in various filtering conditions. When stimuli were highpass filtered above 2.5 kHz, performance was significantly above chance level and gradients were significantly different from 0 for E speech only. These findings suggest that E and TFS convey important but distinct phonetic cues between 1 and 2 kHz. Unlike TFS, E conveys information up to 6 kHz, consistent with the characteristics of neural phase locking to E and TFS.
Affiliation(s)
- Marine Ardoint
- Laboratoire de Psychologie de la Perception, CNRS, Universite Paris Descartes, DEC, Ecole Normale Supérieure, 29 rue d'Ulm, 75005 Paris, France.
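The E versus TFS split used throughout this literature is conventionally computed from the analytic signal of each filterbank band: the magnitude gives the temporal envelope (E) and the cosine of the instantaneous phase gives a unit-amplitude fine-structure (TFS) carrier. A minimal single-band sketch, assuming an FFT-based Hilbert transform and an exact-bin test tone (all names here are illustrative, not taken from the paper):

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal, equivalent to scipy.signal.hilbert."""
    n = len(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[1:n // 2] = 2.0
        h[n // 2] = 1.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * h)

fs = 16000
t = np.arange(int(0.1 * fs)) / fs          # 100 ms
x = 0.5 * np.sin(2 * np.pi * 1000 * t)     # one "analysis band": a 1 kHz tone

a = analytic_signal(x)
envelope = np.abs(a)                       # E cue: slowly varying amplitude
tfs = np.cos(np.angle(a))                  # TFS cue: unit-amplitude carrier
recon = envelope * tfs                     # E x TFS recovers the band signal
```

In an E-only (vocoder) condition the envelope would instead modulate a noise or fixed sine carrier; in a TFS-only condition the envelope is discarded or flattened before resynthesis.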
18
Kim J, Davis C, Groot C. Speech identification in noise: Contribution of temporal, spectral, and visual speech cues. J Acoust Soc Am 2009; 126:3246-57. [PMID: 20000938] [DOI: 10.1121/1.3250425]
Abstract
This study investigated the degree to which two types of reduced auditory signals (cochlear implant simulations) and visual speech cues combined for speech identification. The auditory speech stimuli were filtered to have only amplitude envelope cues or both amplitude envelope and spectral cues and were presented with/without visual speech. In Experiment 1, IEEE sentences were presented in quiet and noise. For in-quiet presentation, speech identification was enhanced by the addition of both spectral and visual speech cues. Due to a ceiling effect, the degree to which these effects combined could not be determined. In noise, these facilitation effects were more marked and were additive. Experiment 2 examined consonant and vowel identification in the context of CVC or VCV syllables presented in noise. For consonants, both spectral and visual speech cues facilitated identification and these effects were additive. For vowels, the effect of combined cues was underadditive, with the effect of spectral cues reduced when presented with visual speech cues. Analysis indicated that without visual speech, spectral cues facilitated the transmission of place information and vowel height, whereas with visual speech, they facilitated lip rounding, with little impact on the transmission of place information.
Affiliation(s)
- Jeesun Kim
- MARCS Auditory Laboratories, University of Western Sydney, NSW 1797, Australia
19
Garadat SN, Litovsky RY, Yu G, Zeng FG. Role of binaural hearing in speech intelligibility and spatial release from masking using vocoded speech. J Acoust Soc Am 2009; 126:2522-35. [PMID: 19894832] [PMCID: PMC2787072] [DOI: 10.1121/1.3238242]
Abstract
A cochlear implant vocoder was used to evaluate relative contributions of spectral and binaural temporal fine-structure cues to speech intelligibility. In Study I, stimuli were vocoded and then convolved with head-related transfer functions (HRTFs) to remove speech temporal fine structure but preserve the binaural temporal fine-structure cues. In Study II, the order of processing was reversed to remove both speech and binaural temporal fine-structure cues. Speech reception thresholds (SRTs) were measured adaptively in quiet, and with interfering speech, for unprocessed and vocoded speech (16, 8, and 4 frequency bands), under binaural or monaural (right-ear) conditions. Under binaural conditions, as the number of bands decreased, SRTs increased. With decreasing number of frequency bands, greater benefit from spatial separation of target and interferer was observed, especially in the 8-band condition. The present results demonstrate a strong role of the binaural cues in spectrally degraded speech, when the target and interfering speech are more likely to be confused. The nearly normal binaural benefits under present simulation conditions and the lack of an order-of-processing effect further suggest that preservation of binaural cues is likely to improve performance in bilaterally implanted recipients.
Affiliation(s)
- Soha N Garadat
- Waisman Center, University of Wisconsin, Madison, WI 53705, USA
20

21
Moore BCJ, Sek A. Development of a fast method for determining sensitivity to temporal fine structure. Int J Audiol 2009; 48:161-71. [PMID: 19085395] [DOI: 10.1080/14992020802475235]
Abstract
Recent evidence suggests that sensitivity to the temporal fine structure (TFS) of sounds is adversely affected by cochlear hearing loss. This may partly explain the difficulties experienced by people with cochlear hearing loss in understanding speech when background sounds, especially fluctuating backgrounds, are present. We describe a test for assessing sensitivity to TFS. The test can be run using any PC with a sound card. The test involves discrimination of a harmonic complex tone (H), with a fundamental frequency F0, from a tone in which all harmonics are shifted upwards by the same amount in Hertz, resulting in an inharmonic tone (I). The phases of the components are selected randomly for every stimulus. Both tones have an envelope repetition rate equal to F0, but the tones differ in their TFS. To prevent discrimination based on spectral cues, all tones are passed through a fixed bandpass filter, usually centred at 11F0. A background noise is used to mask combination tones. The results show that, for normal-hearing subjects, learning effects are small, and the effect of the level of testing is also small. The test provides a simple, quick, and robust way to measure sensitivity to TFS.
Affiliation(s)
- Brian C J Moore
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge, UK.
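The H/I manipulation at the core of this test can be illustrated numerically: shifting every component of a harmonic complex upward by the same amount ΔF in hertz leaves the Hilbert envelope, and its F0 repetition rate, unchanged, while altering the temporal fine structure. A sketch under assumed parameters (F0 = 100 Hz, components 9-13, ΔF = 25 Hz); it omits the bandpass filtering and background noise of the actual test, and the numbers are illustrative rather than the paper's:

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal, equivalent to scipy.signal.hilbert."""
    n = len(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[1:n // 2] = 2.0
        h[n // 2] = 1.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * h)

fs, f0, dF = 16000, 100.0, 25.0
t = np.arange(fs) / fs                    # 1 s, so every component sits on an FFT bin
rng = np.random.default_rng(0)
phases = rng.uniform(0, 2 * np.pi, 5)     # random phases, as in the test
comps = f0 * np.arange(9, 14)             # harmonics 9-13, centred near 11*F0

# harmonic (H) tone and frequency-shifted inharmonic (I) tone
h_tone = sum(np.sin(2 * np.pi * f * t + p) for f, p in zip(comps, phases))
i_tone = sum(np.sin(2 * np.pi * (f + dF) * t + p) for f, p in zip(comps, phases))

env_h = np.abs(analytic_signal(h_tone))   # envelope repeats at F0
env_i = np.abs(analytic_signal(i_tone))   # identical envelope, different TFS
```

Because the shift multiplies the analytic signal by a pure phase term, the two envelopes are mathematically identical, so only a TFS-sensitive listener can tell H from I.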
22
Gnansia D, Péan V, Meyer B, Lorenzi C. Effects of spectral smearing and temporal fine structure degradation on speech masking release. J Acoust Soc Am 2009; 125:4023-33. [PMID: 19507983] [DOI: 10.1121/1.3126344]
Abstract
This study assessed the effects of spectral smearing and temporal fine structure (TFS) degradation on masking release (MR) (the improvement in speech identification in amplitude-modulated compared to steady noise observed for normal-hearing listeners). Syllables and noise stimuli were processed using either a spectral-smearing algorithm or a tone-excited vocoder. The two processing schemes simulated broadening of the auditory filters by factors of 2 and 4. Simulations of the early stages of auditory processing showed that the two schemes produced comparable excitation patterns; however, fundamental frequency (F0) information conveyed by TFS was degraded more severely by the vocoder than by the spectral-smearing algorithm. Both schemes reduced MR but, for each amount of spectral smearing, the vocoder produced a greater reduction in MR than the spectral-smearing algorithm, consistent with the effects of each scheme on F0 representation. Moreover, the effects of spectral smearing on MR produced by the two schemes were different for manner and voicing. Finally, MR data for listeners with moderate hearing loss were well matched by MR data obtained for normal-hearing listeners with vocoded stimuli, suggesting that impaired frequency selectivity alone may not be sufficient to account for the reduced MR observed for hearing-impaired listeners.
Affiliation(s)
- Dan Gnansia
- Laboratoire de Psychologie de la Perception, Universite Paris Descartes, UMR CNRS 8158, Ecole Normale Superieure, Paris, France.
23
Lee SH, Lee KY, Huh MJ, Jang HS. Effect of bimodal hearing in Korean children with profound hearing loss. Acta Otolaryngol 2008; 128:1227-32. [PMID: 19241597] [DOI: 10.1080/00016480801901758]
Abstract
CONCLUSION Bimodal hearing, combining acoustic and electric stimulation, could enhance speech performance in deaf patients with residual hearing, even when that residual hearing is insufficient for communication through amplification alone. OBJECTIVES The cochlear implant (CI) is a well-known therapeutic option for patients with profound hearing loss. However, deaf patients with a CI still have trouble localizing sounds and understanding speech in a noisy environment. The aim of this study was to evaluate the benefits of bimodal hearing with a CI in one ear and a hearing aid in the contralateral ear in Korean children with profound hearing loss. SUBJECTS AND METHODS Fourteen deaf children with residual hearing participated in this study. There were eight male and six female patients, with an age range of 4.6-13.8 years at the time of testing. The test was conducted between 3 months and 4.2 years after cochlear implantation. Speech performance was examined in a noisy environment using Korean word lists. A speech sound and the noise were presented to the child from the front loudspeaker. RESULTS The results showed that speech performance in a noisy environment was significantly better with bimodal hearing than with a CI alone.
24
Sheft S, Ardoint M, Lorenzi C. Speech identification based on temporal fine structure cues. J Acoust Soc Am 2008; 124:562-75. [PMID: 18646999] [PMCID: PMC2809700] [DOI: 10.1121/1.2918540]
Abstract
The contribution of temporal fine structure (TFS) cues to consonant identification was assessed in normal-hearing listeners with two speech-processing schemes designed to remove temporal envelope (E) cues. Stimuli were processed vowel-consonant-vowel speech tokens. Derived from the analytic signal, carrier signals were extracted from the output of a bank of analysis filters. The "PM" and "FM" processing schemes estimated a phase- and frequency-modulation function, respectively, of each carrier signal and applied them to a sinusoidal carrier at the analysis-filter center frequency. In the FM scheme, processed signals were further restricted to the analysis-filter bandwidth. A third scheme retaining only E cues from each band was used for comparison. Stimuli processed with the PM and FM schemes were found to be highly intelligible (50-80% correct identification) over a variety of experimental conditions designed to affect the putative reconstruction of E cues subsequent to peripheral auditory filtering. Analysis of confusions between consonants showed that the contribution of TFS cues was greater for place than manner of articulation, whereas the converse was observed for E cues. Taken together, these results indicate that TFS cues convey important phonetic information that is not solely a consequence of E reconstruction.
Affiliation(s)
- Stanley Sheft
- Parmly Hearing Institute, Loyola University Chicago, 6525 North Sheridan Road, Chicago, Illinois 60626, USA.
25
Effect of masker modulation depth on speech masking release. Hear Res 2008; 239:60-8. [DOI: 10.1016/j.heares.2008.01.012]
26
Bonham BH, Litvak LM. Current focusing and steering: modeling, physiology, and psychophysics. Hear Res 2008; 242:141-53. [PMID: 18501539] [DOI: 10.1016/j.heares.2008.03.006]
Abstract
Current steering and current focusing are stimulation techniques designed to increase the number of distinct perceptual channels available to cochlear implant (CI) users by adjusting currents applied simultaneously to multiple CI electrodes. Previous studies exploring current steering and current focusing stimulation strategies are reviewed, including results of research using computational models, animal neurophysiology, and human psychophysics. Preliminary results of additional neurophysiological and human psychophysical studies are presented that demonstrate the success of current steering strategies in stimulating auditory nerve regions lying between physical CI electrodes, as well as current focusing strategies that excite regions narrower than those stimulated using monopolar configurations. These results are interpreted in the context of perception and speech reception by CI users. Disparities between results of physiological and psychophysical studies are discussed. The differences in stimulation used for physiological and psychophysical studies are hypothesized to contribute to these disparities. Finally, application of current steering and focusing strategies to other types of auditory prostheses is also discussed.
Affiliation(s)
- Ben H Bonham
- Saul and Ida Epstein Laboratory, Department of Otolaryngology-HNS, 533 Parnassus Avenue, Box 0526, University of California, San Francisco, CA 94143-0526, USA.
27
Carlyon RP, Long CJ, Deeks JM. Pulse-rate discrimination by cochlear-implant and normal-hearing listeners with and without binaural cues. J Acoust Soc Am 2008; 123:2276-86. [PMID: 18397032] [PMCID: PMC2376257] [DOI: 10.1121/1.2874796]
Abstract
Experiment 1 measured rate discrimination of electric pulse trains by bilateral cochlear implant (CI) users, for standard rates of 100, 200, and 300 pps. In the diotic condition the pulses were presented simultaneously to the two ears. Consistent with previous results with unilateral stimulation, performance deteriorated at higher standard rates. In the signal interval of each trial in the dichotic condition, the standard rate was presented to the left ear and the (higher) signal rate was presented to the right ear; the non-signal intervals were the same as in the diotic condition. Performance in the dichotic condition was better for some listeners than in the diotic condition for standard rates of 100 and 200 pps, but not at 300 pps. It is concluded that the deterioration in rate discrimination observed for CI users at high rates cannot be alleviated by the introduction of a binaural cue, and is unlikely to be limited solely by central pitch processes. Experiment 2 performed an analogous experiment in which 300-pps acoustic pulse trains were bandpass filtered (3900-5400 Hz) and presented in a noise background to normal-hearing listeners. Unlike the results of experiment 1, performance was superior in the dichotic than in the diotic condition.
Affiliation(s)
- Robert P Carlyon
- MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge CB2 7EF, United Kingdom.
28
Stickney GS, Assmann PF, Chang J, Zeng FG. Effects of cochlear implant processing and fundamental frequency on the intelligibility of competing sentences. J Acoust Soc Am 2007; 122:1069-78. [PMID: 17672654] [DOI: 10.1121/1.2750159]
Abstract
Speech perception in the presence of another competing voice is one of the most challenging tasks for cochlear implant users. Several studies have shown that (1) the fundamental frequency (F0) is a useful cue for segregating competing speech sounds and (2) the F0 is better represented by the temporal fine structure than by the temporal envelope. However, current cochlear implant speech processing algorithms emphasize temporal envelope information and discard the temporal fine structure. In this study, speech recognition was measured as a function of the F0 separation of the target and competing sentence in normal-hearing and cochlear implant listeners. For the normal-hearing listeners, the combined sentences were processed through either a standard implant simulation or a new algorithm which additionally extracts a slowed-down version of the temporal fine structure (called Frequency-Amplitude-Modulation-Encoding). The results showed no benefit of increasing F0 separation for the cochlear implant or simulation groups. In contrast, the new algorithm resulted in gradual improvements with increasing F0 separation, similar to that found with unprocessed sentences. These results emphasize the importance of temporal fine structure for speech perception and demonstrate a potential remedy for difficulty in the perceptual segregation of competing speech sounds.
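The "slowed-down version of the temporal fine structure" that FAME-style processing extracts is, per band, the instantaneous frequency: the derivative of the unwrapped analytic phase, which is then rate-limited and re-imposed on a band-centre carrier. A sketch of the extraction step alone, on a synthetic FM tone (parameters are illustrative, not taken from this paper):

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal, equivalent to scipy.signal.hilbert."""
    n = len(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[1:n // 2] = 2.0
        h[n // 2] = 1.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * h)

fs = 16000
t = np.arange(int(0.2 * fs)) / fs
f_inst = 1000.0 + 100.0 * np.sin(2 * np.pi * 5.0 * t)  # slow FM around 1 kHz
phase = 2 * np.pi * np.cumsum(f_inst) / fs             # FM synthesis
x = np.sin(phase)

a = analytic_signal(x)
# instantaneous frequency in Hz, one value per sample interval
est_freq = np.diff(np.unwrap(np.angle(a))) * fs / (2 * np.pi)
```

A FAME-like scheme would then lowpass and limit the est_freq trajectory before using it to frequency-modulate a sinusoid at the band's centre frequency, alongside the usual envelope modulation.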
29
Sit JJ, Simonson AM, Oxenham AJ, Faltys MA, Sarpeshkar R. A low-power asynchronous interleaved sampling algorithm for cochlear implants that encodes envelope and phase information. IEEE Trans Biomed Eng 2007; 54:138-49. [PMID: 17260865] [DOI: 10.1109/tbme.2006.883819]
Abstract
Cochlear implants currently fail to convey phase information, which is important for perceiving music, tonal languages, and for hearing in noisy environments. We propose a bio-inspired asynchronous interleaved sampling (AIS) algorithm that encodes both envelope and phase information, in a manner that may be suitable for delivery to cochlear implant users. Like standard continuous interleaved sampling (CIS) strategies, AIS naturally meets the interleaved-firing requirement, which is to stimulate only one electrode at a time, minimizing electrode interactions. The majority of interspike intervals are distributed over 1-4 ms, thus staying within the absolute refractory limit of neurons, and form a more natural, pseudostochastic pattern of firing due to complex channel interactions. Stronger channels are selected to fire more often but the strategy ensures that weaker channels are selected to fire in proportion to their signal strength as well. The resulting stimulation rates are considerably lower than those of most modern implants, saving power yet delivering higher potential performance. Correlations with original sounds were found to be significantly higher in AIS reconstructions than in signal reconstructions using only envelope information. Two perceptual tests on normal-hearing listeners verified that the reconstructed signals enabled better melody and speech recognition in noise than those processed using tone-excited envelope-vocoder simulations of cochlear implant processing. Thus, our strategy could potentially save power and improve hearing performance in cochlear implant users.
Affiliation(s)
- Ji-Jon Sit
- Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA
30
Kong YY, Zeng FG. Temporal and spectral cues in Mandarin tone recognition. J Acoust Soc Am 2006; 120:2830-40. [PMID: 17139741] [DOI: 10.1121/1.2346009]
Abstract
This study evaluates the relative contributions of envelope and fine structure cues in both temporal and spectral domains to Mandarin tone recognition in quiet and in noise. Four sets of stimuli were created. Noise-excited vocoder speech was used to evaluate the temporal envelope. Frequency modulation was then added to evaluate the temporal fine structure. Whispered speech was used to evaluate the spectral envelope. Finally, equal-amplitude harmonics were used to evaluate the spectral fine structure. Results showed that normal-hearing listeners achieved nearly perfect tone recognition with either spectral or temporal fine structure in quiet, but only 70%-80% correct with the envelope cues. With the temporal envelope, 32 spectral bands were needed to achieve performance similar to that obtained with the original stimuli, but only four bands were necessary with the additional temporal fine structure. Envelope cues were more susceptible to noise than fine structure cues, with the envelope cues producing significantly lower performance in noise. These findings suggest that tonal pattern recognition is a robust process that can make use of both spectral and temporal cues. Unlike speech recognition, the fine structure is more important than the envelope for tone recognition in both temporal and spectral domains, particularly in noise.
Affiliation(s)
- Ying-Yee Kong
- Hearing and Speech Research Laboratory, Department of Cognitive Sciences, University of California-Irvine, Irvine, CA 92697, USA.
31
Abstract
This study evaluated functional benefits of bilateral stimulation in 20 children aged 4-14 years, 10 of whom used two cochlear implants (CIs) and 10 of whom used one CI and one hearing aid (HA). Localization acuity was measured with the minimum audible angle (MAA). Speech intelligibility was measured in quiet, and in the presence of 2-talker competing speech, using the CRISP forced-choice test. Results show that both groups perform similarly when speech reception thresholds are evaluated. However, there appears to be a benefit (improved MAA and speech thresholds) from wearing two devices compared with a single device, and this benefit is significantly greater in the group with two CIs than in the bimodal group. Individual variability also suggests that some children perform similarly to normal-hearing children, while others clearly do not. Future advances in binaural fitting strategies and improved speech-processing schemes that maximize binaural sensitivity will no doubt contribute to increasing the binaurally-driven advantages in persons with bilateral CIs.
Affiliation(s)
- Ruth Y Litovsky
- Binaural Hearing and Speech Lab, Waisman Center, University of Wisconsin-Madison, Madison, WI 53705-1103, USA.