1
Saba JN, Ali H, Hansen JHL. The effects of estimation accuracy, estimation approach, and number of selected channels using formant-priority channel selection for an "n-of-m" sound processing strategy for cochlear implants. J Acoust Soc Am 2023; 153:3100. [PMID: 37227411] [PMCID: PMC10219683] [DOI: 10.1121/10.0019416]
Abstract
Previously, selection of l channels was prioritized according to formant frequency locations in an l-of-n-of-m-based signal processing strategy to provide important voicing information independent of the listening environment for cochlear implant (CI) users. In this study, ideal (ground-truth) formants were incorporated into the selection stage to determine the effect of accuracy on (1) subjective speech intelligibility, (2) objective channel selection patterns, and (3) objective stimulation patterns (current). An average +11% improvement (p < 0.05) was observed across six CI users in quiet, but not in noise or reverberation conditions. Analogous increases in channel selection and current for the upper range of F1, and a decrease across mid-frequencies with higher corresponding current, were observed at the expense of noise-dominant channels. Objective channel selection patterns were analyzed a second time to determine the effects of estimation approach and number of selected channels (n). A significant effect of estimation approach was observed only in the noise and reverberation condition, with minor differences in channel selection and significantly decreased stimulated current. Results suggest that estimation method, accuracy, and number of channels in the proposed strategy using ideal formants may improve intelligibility when the corresponding stimulated current of formant channels is not masked by noise-dominant channels.
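The formant-priority selection stage described above can be sketched in a few lines (a minimal illustration, not the authors' implementation; the function name, the fixed channel count, and the rule of always retaining formant-bearing channels before filling the remaining slots by envelope energy are assumptions):

```python
import numpy as np

def select_channels(envelopes, n, formant_channels=()):
    """Pick the n of m channels to stimulate in one frame.

    envelopes: length-m array of channel envelope energies.
    formant_channels: indices forced into the selection first
    (the formant-priority step); remaining slots go to the
    highest-energy channels, as in a standard n-of-m rule.
    """
    selected = list(dict.fromkeys(formant_channels))[:n]
    # Fill remaining slots with the largest envelopes not yet chosen.
    for ch in np.argsort(envelopes)[::-1]:
        if len(selected) == n:
            break
        if ch not in selected:
            selected.append(int(ch))
    return sorted(selected)

# Example frame: 8 channels, pick 4, with channels 2 and 5 treated
# as formant-bearing and therefore always retained.
env = np.array([0.1, 0.9, 0.2, 0.8, 0.7, 0.05, 0.6, 0.3])
picked = select_channels(env, n=4, formant_channels=(2, 5))
```

In a real strategy the formant channel indices would come from a per-frame formant estimator, which is exactly where the estimation accuracy studied here enters.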
Affiliation(s)
- Juliana N Saba
- University of Texas at Dallas, Center for Robust Speech Systems, Cochlear Implant Laboratory, 800 W. Campbell Rd, EC 33, Richardson, Texas 75080, USA
- Hussnain Ali
- University of Texas at Dallas, Center for Robust Speech Systems, Cochlear Implant Laboratory, 800 W. Campbell Rd, EC 33, Richardson, Texas 75080, USA
- John H L Hansen
- University of Texas at Dallas, Center for Robust Speech Systems, Cochlear Implant Laboratory, 800 W. Campbell Rd, EC 33, Richardson, Texas 75080, USA
2
Zhang C, Li M, Yu J, Liu C. Development of Mandarin Chinese Vowel Perception in Young Children With Normal Hearing and Cochlear Implants. J Speech Lang Hear Res 2021; 64:4485-4494. [PMID: 34554847] [DOI: 10.1044/2021_jslhr-20-00669]
Abstract
Purpose Depicting the development pattern of vowel perception for children with normal hearing (NH) and cochlear implants (CIs) would be useful for clinicians and school teachers to monitor children's auditory rehabilitation. This study investigated the development of Mandarin Chinese vowel perception in native Mandarin-speaking children aged 4-6 years. Method Vowel identification was tested in children with NH and children with CIs; all children with CIs had received their implants before the age of 4 years. In a picture identification task with Mandarin Chinese speech stimuli, listeners identified the target consonant-vowel word among two to four contrastive words that differed only in vowels. Each target word represented a concrete object and was spoken by a young female native Mandarin Chinese talker. The target words included 16 monophthongs, 22 diphthongs, and nine triphthongs. Results Children with NH showed significantly better identification of monophthongs and diphthongs than children with CIs at the age of 6 years, whereas the two groups performed comparably at ages 4 and 5 years. Children with NH significantly outperformed children with CIs in triphthong identification across all three age groups. For children with NH, perception of all three vowel types developed rapidly between ages 4 and 5 years, with rapid development only for monophthong perception between ages 5 and 6 years. For children with CIs, diphthong and triphthong perception, but not monophthong perception, developed rapidly between ages 4 and 5 years, with no significant development between ages 5 and 6 years for any of the three vowel types. Overall, Mandarin-speaking children with NH reached ceiling performance in vowel perception at or before the age of 6 years, whereas children with CIs may need more time to reach the typical level of their NH peers.
Conclusions The development of Mandarin vowel perception differed between preschool-age children with NH and those with CIs, likely because of deficits in spectral processing with CIs. The results supplement what is known about the development of speech recognition in Mandarin-native children with NH and CIs.
Affiliation(s)
- Changxin Zhang
- Faculty of Education, East China Normal University, Shanghai
- Mingying Li
- Faculty of Education, East China Normal University, Shanghai
- Qihui Special Education School, Shanghai, China
- Jie Yu
- Faculty of Education, East China Normal University, Shanghai
- Chang Liu
- Department of Speech, Language, and Hearing Sciences, The University of Texas at Austin
3
Undurraga JA, Van Yper L, Bance M, McAlpine D, Vickers D. Neural encoding of spectro-temporal cues at slow and near speech-rate in cochlear implant users. Hear Res 2020; 403:108160. [PMID: 33461048] [DOI: 10.1016/j.heares.2020.108160]
Abstract
The ability to process rapid modulations in the spectro-temporal structure of sounds is critical for speech comprehension. For users of cochlear implants (CIs), spectral cues in speech are conveyed by differential stimulation of electrode contacts along the cochlea, and temporal cues by the amplitude of the stimulating electrical pulses, which tracks the amplitude-modulated (AM'ed) envelope of speech sounds. Whilst survival of inner-ear neurons and spread of electrical current are known factors that limit the representation of speech information in CI listeners, limitations in the neural representation of the dynamic spectro-temporal cues common to speech are also likely to play a role. We assessed the ability of CI listeners to process spectro-temporal cues varying at rates typically present in human speech. Employing an auditory change complex (ACC) paradigm, with a slow (0.5 Hz) alternation between stimulating electrodes or between AM frequencies to evoke a transient cortical ACC, we demonstrate that CI listeners, like normal-hearing listeners, are sensitive to transitions in both the spectral and temporal domains. However, CI listeners showed impaired cortical responses when either spectral or temporal cues alternated at faster, speech-like (6-7 Hz), rates. Specifically, auditory change following responses, reliably obtained in normal-hearing listeners, were small or absent in CI users, indicating that cortical adaptation to alternating cues at speech-like rates is stronger under electrical stimulation. In CI listeners, temporal processing was also influenced by the polarity of the electrical pulses (behaviourally) and by their presentation rate (both neurally and behaviourally). Limitations in the ability to process dynamic spectro-temporal cues will likely impact speech comprehension in CI users.
Affiliation(s)
- Jaime A Undurraga
- Department of Linguistics, 16 University Avenue, Macquarie University, NSW 2109, Australia
- Lindsey Van Yper
- Department of Linguistics, 16 University Avenue, Macquarie University, NSW 2109, Australia
- Manohar Bance
- Cambridge Hearing Group, Department of Clinical Neurosciences, Cambridge Biomedical Campus, University of Cambridge, CB2 0QQ, UK
- David McAlpine
- Department of Linguistics, 16 University Avenue, Macquarie University, NSW 2109, Australia
- Deborah Vickers
- Cambridge Hearing Group, Department of Clinical Neurosciences, Cambridge Biomedical Campus, University of Cambridge, CB2 0QQ, UK
4
Whiteford KL, Kreft HA, Oxenham AJ. The role of cochlear place coding in the perception of frequency modulation. eLife 2020; 9:58468. [PMID: 32996463] [PMCID: PMC7556860] [DOI: 10.7554/elife.58468]
Abstract
Natural sounds convey information via frequency and amplitude modulations (FM and AM). Humans are acutely sensitive to the slow rates of FM that are crucial for speech and music. This sensitivity has long been thought to rely on precise stimulus-driven auditory-nerve spike timing (time code), whereas a coarser code, based on variations in the cochlear place of stimulation (place code), represents faster FM rates. We tested this theory in listeners with normal and impaired hearing, spanning a wide range of place-coding fidelity. Contrary to predictions, sensitivity to both slow and fast FM correlated with place-coding fidelity. We also used incoherent AM on two carriers to simulate place coding of FM and observed poorer sensitivity at high carrier frequencies and fast rates, two properties of FM detection previously ascribed to the limits of time coding. The results suggest a unitary place-based neural code for FM across all rates and carrier frequencies.
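The two cue types discussed above can be made concrete with a short synthesis sketch (illustrative parameter values; the 1-kHz carrier and the modulation rates are assumptions, not the study's stimuli):

```python
import numpy as np

fs = 16000                            # sample rate (Hz)
t = np.arange(int(0.5 * fs)) / fs     # 500-ms stimulus
fc = 1000.0                           # carrier frequency (Hz)
fm_rate, fm_depth = 2.0, 50.0         # slow FM: 2-Hz rate, +/-50-Hz excursion
am_rate, am_depth = 2.0, 0.5          # AM: 2-Hz rate, 50% modulation depth

# FM tone: integrate the instantaneous frequency to obtain the phase.
inst_freq = fc + fm_depth * np.sin(2 * np.pi * fm_rate * t)
fm_tone = np.sin(2 * np.pi * np.cumsum(inst_freq) / fs)

# AM tone: sinusoidal envelope applied to a steady carrier.
am_tone = (1 + am_depth * np.sin(2 * np.pi * am_rate * t)) * np.sin(2 * np.pi * fc * t)
```

Slow FM sweeps the place of peak cochlear excitation back and forth, whereas AM leaves the place fixed and varies only the level, which is why incoherent AM on two carriers can be used to simulate a place code for FM.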
Affiliation(s)
- Kelly L Whiteford
- Department of Psychology, University of Minnesota, Minneapolis, United States
- Heather A Kreft
- Department of Psychology, University of Minnesota, Minneapolis, United States
- Andrew J Oxenham
- Department of Psychology, University of Minnesota, Minneapolis, United States
5
Kung SJ, Wu DH, Hsu CH, Hsieh IH. A Minimum Temporal Window for Direction Detection of Frequency-Modulated Sweeps: A Magnetoencephalography Study. Front Psychol 2020; 11:389. [PMID: 32218758] [PMCID: PMC7078663] [DOI: 10.3389/fpsyg.2020.00389]
Abstract
The ability to rapidly encode the direction of the frequency contour in frequency-modulated (FM) sweeps is essential for speech processing, music appreciation, and conspecific communication. Psychophysical evidence points to a common temporal window threshold for human listeners in processing rapid changes in frequency glides, but no neural evidence has been provided for a cortical temporal window threshold underlying the encoding of rapid transitions in frequency glides. The present magnetoencephalography study used the magnetic mismatch negativity (MMNm) response to investigate the minimum temporal window required for detecting different magnitudes of directional change in frequency-modulated sweeps. An oddball paradigm was used in which an upward or downward frequency sweep served as the standard and the same type of sweep in the opposite direction served as the deviant. Stimuli consisted of unidirectional linear frequency-sweep complexes that swept across speech-relevant frequency bands over durations of 10, 20, 40, 80, 160, and 320 ms (corresponding rates of 50, 25, 12.5, 6.2, 3.1, and 1.5 oct/s). The data revealed significant magnetic mismatch field responses across all sweep durations, with slower-rate sweeps eliciting larger MMNm responses. A greater temporally related enhancement in the MMNm response was obtained for rising but not falling frequency contours, and a hemispheric asymmetry in the MMNm response pattern was observed corresponding to the directionality of the sweeps. Contrary to psychophysical findings, we report that a temporal window as short as 10 ms is sufficient to elicit a robust MMNm response to a directional change in speech-relevant frequency contours. The results suggest that the auditory cortex requires only an extremely brief temporal window to implicitly differentiate a dynamic change in the frequency of linguistically relevant pitch contours. That the brain is extremely sensitive to fine spectral changes in speech-relevant glides provides cortical evidence for the ecological importance of FM sweeps in speech processing.
Affiliation(s)
- Shu-Jen Kung
- Institute of Cognitive Neuroscience, National Central University, Taoyuan City, Taiwan
- Denise H Wu
- Institute of Cognitive Neuroscience, National Central University, Taoyuan City, Taiwan
- Chun-Hsien Hsu
- Institute of Cognitive Neuroscience, National Central University, Taoyuan City, Taiwan
- Institute of Linguistics, Academia Sinica, Taipei, Taiwan
- I-Hui Hsieh
- Institute of Cognitive Neuroscience, National Central University, Taoyuan City, Taiwan
6
Todd AE, Goupell MJ, Litovsky RY. Binaural unmasking with temporal envelope and fine structure in listeners with cochlear implants. J Acoust Soc Am 2019; 145:2982. [PMID: 31153315] [PMCID: PMC6525004] [DOI: 10.1121/1.5102158]
Abstract
For normal-hearing (NH) listeners, interaural information in both the temporal envelope and the temporal fine structure contributes to binaural unmasking of target signals in background noise; in many conditions, however, low-frequency interaural information in the temporal fine structure produces greater binaural unmasking. For bilateral cochlear-implant (CI) listeners, interaural information in the temporal envelope contributes to binaural unmasking; however, the effect of encoding temporal fine structure information in electrical pulse timing (PT) is not fully understood. In this study, diotic and dichotic signal detection thresholds were measured in CI listeners using bilaterally synchronized single-electrode stimulation for conditions in which the temporal envelope was presented without temporal fine structure encoded (constant-rate pulses) or with temporal fine structure encoded (pulses timed to peaks of the temporal fine structure). CI listeners showed greater binaural unmasking at 125 pps with temporal fine structure encoded than without. There was no significant effect of encoding temporal fine structure at 250 pps. A similar pattern of performance was shown by NH listeners presented with acoustic pulse trains designed to simulate CI stimulation. The results suggest a trade-off across low rates between interaural information obtained from the temporal envelope and that obtained from temporal fine structure encoded in PT.
Affiliation(s)
- Ann E Todd
- Waisman Center, University of Wisconsin-Madison, 1500 Highland Avenue, Madison, Wisconsin 53705, USA
- Matthew J Goupell
- Department of Hearing and Speech Sciences, University of Maryland at College Park, College Park, Maryland 20742, USA
- Ruth Y Litovsky
- Waisman Center, University of Wisconsin-Madison, 1500 Highland Avenue, Madison, Wisconsin 53705, USA
7
Brochier T, McKay C, McDermott H. Rate modulation detection thresholds for cochlear implant users. J Acoust Soc Am 2018; 143:1214. [PMID: 29495682] [DOI: 10.1121/1.5025048]
Abstract
The perception of temporal amplitude modulations is critical for speech understanding by cochlear implant (CI) users. The present study compared the ability of CI users to detect sinusoidal modulations of the electrical stimulation rate and current level, at different presentation levels (80% and 40% of the dynamic range) and modulation frequencies (10 and 100 Hz). Rate modulation detection thresholds (RMDTs) and amplitude modulation detection thresholds (AMDTs) were measured and compared to assess whether there was a perceptual advantage to either modulation method. Both RMDTs and AMDTs improved with increasing presentation level and decreasing modulation frequency. RMDTs and AMDTs were correlated, indicating that a common processing mechanism may underlie the perception of rate modulation and amplitude modulation, or that some subject-dependent factors affect both types of modulation detection.
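The two modulation methods compared here can be sketched as follows (a simplified illustration with assumed parameter values, not the study's stimulus code): rate modulation varies the timing of pulses around a base rate, while amplitude modulation varies the level of pulses delivered at a constant rate.

```python
import numpy as np

def rate_modulated_pulse_times(base_rate, mod_freq, mod_depth, dur, fs=100000):
    """Pulse times (s) of a train whose instantaneous rate (pps) is
    sinusoidally modulated around base_rate."""
    t = np.arange(int(dur * fs)) / fs
    inst_rate = base_rate * (1 + mod_depth * np.sin(2 * np.pi * mod_freq * t))
    phase = np.cumsum(inst_rate) / fs       # expected pulse count up to t
    n_pulses = int(phase[-1])
    idx = np.searchsorted(phase, np.arange(1, n_pulses + 1))
    return t[idx]

# 1-s train: 500-pps base rate, 10-Hz rate modulation, 20% depth.
times = rate_modulated_pulse_times(500.0, 10.0, 0.2, 1.0)

# For comparison, amplitude modulation keeps a constant 500-pps rate
# and varies the current level of each pulse instead.
fixed_times = np.arange(0, 1.0, 1 / 500.0)
levels = 1.0 + 0.2 * np.sin(2 * np.pi * 10.0 * fixed_times)
```

Both trains convey the same 10-Hz modulator, one in the timing of pulses and one in their level, which is what makes the RMDT-versus-AMDT comparison meaningful.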
Affiliation(s)
- Tim Brochier
- Department of Medical Bionics, University of Melbourne, 384-388 Albert Street, East Melbourne, Victoria 3002, Australia
- Colette McKay
- The Bionics Institute, 384-388 Albert Street, East Melbourne, Victoria 3002, Australia
- Hugh McDermott
- The Bionics Institute, 384-388 Albert Street, East Melbourne, Victoria 3002, Australia
8
Sheft S, Cheng MY, Shafiro V. Discrimination of Stochastic Frequency Modulation by Cochlear Implant Users. J Am Acad Audiol 2018; 26:572-81. [PMID: 26134724] [DOI: 10.3766/jaaa.14067]
Abstract
BACKGROUND Past work has shown that low-rate frequency modulation (FM) may help preserve signal coherence, aid segmentation at word and syllable boundaries, and benefit speech intelligibility in the presence of a masker. PURPOSE This study evaluated whether difficulties in speech perception by cochlear implant (CI) users relate to a deficit in the ability to discriminate among stochastic low-rate patterns of FM. RESEARCH DESIGN This is a correlational study assessing the association between the ability to discriminate stochastic patterns of low-rate FM and the intelligibility of speech in noise. STUDY SAMPLE Thirteen postlingually deafened adult CI users participated. DATA COLLECTION AND ANALYSIS Using modulators derived from 5-Hz lowpass noise applied to a 1-kHz carrier, discrimination thresholds were measured as a function of frequency excursion (both in quiet and with a speech-babble masker present), stimulus duration, and signal-to-noise ratio in the presence of the masker. Speech perception was assessed in the presence of the same speech-babble masker. Relationships were evaluated with Pearson product-moment correlation analysis corrected for family-wise error, and with commonality analysis to determine the unique and common contributions of the psychoacoustic variables to the association with speech ability. RESULTS Significant correlations were obtained between masked speech intelligibility and three metrics of FM discrimination involving either signal-to-noise ratio or stimulus duration, with shared variance among the three measures accounting for much of the effect. Compared with past results from young normal-hearing adults and from older adults with either normal hearing or a mild-to-moderate hearing loss, mean FM discrimination thresholds of CI users were higher in all conditions. CONCLUSIONS The ability to process the pattern of frequency excursions of stochastic FM may, in part, share a common basis with speech perception in noise. Discrimination of differences in the temporally distributed place coding of the stimulus could serve as this common basis for CI users.
Affiliation(s)
- Stanley Sheft
- Department of Communication Disorders and Sciences, Rush University Medical Center, Chicago, IL
- Min-Yu Cheng
- Department of Communication Disorders and Sciences, Rush University Medical Center, Chicago, IL
- Valeriy Shafiro
- Department of Communication Disorders and Sciences, Rush University Medical Center, Chicago, IL
9
Tan J, Dowell R, Vogel A. Mandarin Lexical Tone Acquisition in Cochlear Implant Users With Prelingual Deafness: A Review. Am J Audiol 2016; 25:246-56. [PMID: 27387047] [DOI: 10.1044/2016_aja-15-0069]
Abstract
PURPOSE The purpose of this review article is to synthesize evidence from the fields of developmental linguistics and cochlear implant technology relevant to the production and perception of Mandarin lexical tone in cochlear implant users with prelingual deafness. The aim was to identify potential factors that determine outcomes for tonal-language-speaking cochlear implant users and possible directions for further research. METHOD A computerized database search of MEDLINE, CINAHL, Academic Search Premier, Web of Science, and Google Scholar was undertaken in June and July 2014. Search terms, applied anywhere in the title or abstract, were lexical tone AND tonal language; speech development AND/OR speech production AND/OR speech perception AND cochlear implants; and pitch perception AND cochlear implants. CONCLUSION Despite the demonstrated limitations of pitch perception in cochlear implant users, there is some evidence that typical production and perception of lexical tone are possible for cochlear implant users with prelingual deafness. Further studies are required to determine the factors that contribute to better outcomes, to inform rehabilitation for cochlear implant users in tonal-language environments.
Affiliation(s)
- Johanna Tan
- The University of Melbourne, Victoria, Australia
- Adam Vogel
- Center for Neuroscience of Speech, The University of Melbourne, Victoria, Australia
- Hertie Institute for Clinical Brain Research, Eberhard Karls Universität Tübingen, Germany
- Murdoch Childrens Research Institute, The Bruce Lefroy Centre for Genetic Health Research, Melbourne, Victoria, Australia
10
Takanen M, Bruce IC, Seeber BU. Phenomenological modelling of electrically stimulated auditory nerve fibers: A review. Network 2016; 27:157-185. [PMID: 27573993] [DOI: 10.1080/0954898x.2016.1219412]
Abstract
Auditory nerve fibers (ANFs) play a crucial role in hearing by encoding the synaptic input from inner hair cells as afferent spiking information for higher stages of the auditory system. If the inner hair cells are degenerated, cochlear implants may restore hearing by directly stimulating the ANFs. The response of an ANF is affected by several characteristics of the electrical stimulus and of the ANF itself, and neurophysiological measurements are needed to determine how an ANF responds to a particular stimulus. However, recording from individual nerve fibers in humans is not feasible, and obtaining compound neural or psychophysical responses is often time-consuming. This motivates the design and use of models to estimate the ANF response to electrical stimulation. Phenomenological models reproduce the ANF response based on a simplified description of ANF functionality and a limited parameter space, without directly describing detailed biophysical mechanisms. Here, we give an overview of phenomenological models published to date and demonstrate how different modeling approaches can account for the diverse phenomena affecting the ANF response. To highlight the success achieved in designing such models, we also describe a number of applications of phenomenological models to predict percepts of cochlear implant listeners.
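A toy example of the phenomenological approach: a stochastic-threshold point model that fires when a pulse exceeds a jittered threshold outside the refractory period (all parameter values are illustrative assumptions, far simpler than the published models reviewed here):

```python
import numpy as np

def anf_response(pulse_amps, pulse_times, threshold=1.0, rel_spread=0.05,
                 refrac=0.0007, seed=0):
    """Toy phenomenological ANF model: a pulse evokes a spike when its
    amplitude exceeds a Gaussian-jittered threshold (relative spread)
    and the fiber is past its absolute refractory period."""
    rng = np.random.default_rng(seed)
    spike_times = []
    last = -np.inf
    for t, a in zip(pulse_times, pulse_amps):
        thr = threshold * (1.0 + rel_spread * rng.standard_normal())
        if a >= thr and (t - last) >= refrac:
            spike_times.append(t)
            last = t
    return np.array(spike_times)

# A 1000-pps, 10-ms pulse train: well above threshold every pulse
# fires; well below threshold none do.
pulse_times = np.arange(0, 0.01, 0.001)
spikes_supra = anf_response(np.full(10, 1.5), pulse_times)
spikes_sub = anf_response(np.full(10, 0.5), pulse_times)
```

Published models add further phenomena on top of this skeleton, such as relative refractoriness, facilitation, accommodation, and adaptation.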
Affiliation(s)
- Marko Takanen
- Audio Information Processing, Department of Electrical and Computer Engineering, Technical University of Munich, Munich, Germany
- Ian C Bruce
- Department of Electrical and Computer Engineering, McMaster University, Hamilton, ON, Canada
- Bernhard U Seeber
- Audio Information Processing, Department of Electrical and Computer Engineering, Technical University of Munich, Munich, Germany
11
Meng Q, Zheng N, Li X. Mandarin speech-in-noise and tone recognition using vocoder simulations of the temporal limits encoder for cochlear implants. J Acoust Soc Am 2016; 139:301-310. [PMID: 26827026] [DOI: 10.1121/1.4939707]
Abstract
Temporal envelope-based signal processing strategies are widely used in cochlear-implant (CI) systems. It is well recognized that the inability to convey temporal fine structure (TFS) in the stimuli limits CI users' performance, but it is still unclear how to deliver the TFS effectively. A strategy known as the temporal limits encoder (TLE), which derives the amplitude modulator used to generate stimuli coded in an interleaved-sampling strategy, has recently been proposed. The TLE modulator carries information related to the original temporal envelope together with a slowly varying TFS component from the band signal. In this paper, theoretical analyses are presented to demonstrate the superiority of TLE over two existing strategies, the clinically available continuous-interleaved-sampling (CIS) strategy and the experimental harmonic-single-sideband-encoder strategy. Perceptual experiments with vocoder simulations in normal-hearing listeners were conducted to compare the performance of TLE and CIS on two tasks (i.e., Mandarin speech reception in babble noise and tone recognition in quiet). The performance of the TLE modulator was mostly better than (for most tone-band vocoders) or comparable to (for noise-band vocoders) that of the CIS modulator on both tasks. This work implies that there is potential for improving the representation of TFS with CIs by using a TLE strategy.
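The envelope-extraction step shared by CIS-style strategies can be sketched as follows (a minimal single-band illustration; the moving-average smoother standing in for the clinical low-pass filter is an assumption made to keep the sketch dependency-free, and this is not the TLE algorithm itself):

```python
import numpy as np

def cis_envelope(band_signal, fs, cutoff=50.0):
    """Temporal envelope of one analysis band, CIS-style: half-wave
    rectification followed by low-pass smoothing (here a moving
    average whose length approximates the cutoff frequency)."""
    rectified = np.maximum(band_signal, 0.0)
    win = max(1, int(fs / cutoff))
    return np.convolve(rectified, np.ones(win) / win, mode="same")

fs = 16000
t = np.arange(fs) / fs
# A 1-kHz "band signal" whose amplitude is modulated at 4 Hz; the
# extracted envelope should track the 4-Hz modulator, not the carrier.
band = (1 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)
env = cis_envelope(band, fs)
```

The TFS discarded by this step is exactly what TLE attempts to re-introduce, by folding a slowly varying fine-structure component into the modulator itself.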
Affiliation(s)
- Qinglin Meng
- Shenzhen Key Laboratory of Modern Communication and Information Processing, College of Information Engineering, Shenzhen University, Shenzhen 518060, China
- Nengheng Zheng
- Shenzhen Key Laboratory of Modern Communication and Information Processing, College of Information Engineering, Shenzhen University, Shenzhen 518060, China
- Xia Li
- Shenzhen Key Laboratory of Modern Communication and Information Processing, College of Information Engineering, Shenzhen University, Shenzhen 518060, China
12
Kalathottukaren RT, Purdy SC, Ballard E. Prosody perception and musical pitch discrimination in adults using cochlear implants. Int J Audiol 2015; 54:444-52. [DOI: 10.3109/14992027.2014.997314]
13
Hsieh IH, Fillmore P, Rong F, Hickok G, Saberi K. FM-selective networks in human auditory cortex revealed using fMRI and multivariate pattern classification. J Cogn Neurosci 2012; 24:1896-907. [PMID: 22640390] [DOI: 10.1162/jocn_a_00254]
Abstract
Frequency modulation (FM) is an acoustic feature of nearly all complex sounds. Directional FM sweeps are especially pervasive in speech, music, animal vocalizations, and other natural sounds. Although the existence of FM-selective cells in the auditory cortex of animals has been documented, evidence in humans remains equivocal. Here we used multivariate pattern analysis to identify cortical selectivity for direction of a multitone FM sweep. This method distinguishes one pattern of neural activity from another within the same ROI, even when overall level of activity is similar, allowing for direct identification of FM-specialized networks. Standard contrast analysis showed that despite robust activity in auditory cortex, no clusters of activity were associated with up versus down sweeps. Multivariate pattern analysis classification, however, identified two brain regions as selective for FM direction, the right primary auditory cortex on the supratemporal plane and the left anterior region of the superior temporal gyrus. These findings are the first to directly demonstrate existence of FM direction selectivity in the human auditory cortex.
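The logic of multivariate pattern classification (distinguishing two activity patterns within one ROI even when overall activation is matched) can be illustrated on synthetic data. This is a leave-one-out, correlation-based nearest-centroid sketch on made-up "voxel" patterns; real MVPA pipelines use cross-validated classifiers on fMRI response estimates:

```python
import numpy as np

rng = np.random.default_rng(1)
n_voxels, n_trials = 50, 40

# Synthetic ROI patterns: two conditions ("up" / "down" sweeps) with
# zero-mean spatial patterns, so overall activation is matched and
# only the pattern across voxels distinguishes them.
pattern_up = rng.standard_normal(n_voxels)
pattern_down = rng.standard_normal(n_voxels)
X_up = pattern_up + 0.8 * rng.standard_normal((n_trials, n_voxels))
X_down = pattern_down + 0.8 * rng.standard_normal((n_trials, n_voxels))

def classify(trial, cent_a, cent_b):
    """Assign the trial to the centroid it correlates with more."""
    ra = np.corrcoef(trial, cent_a)[0, 1]
    rb = np.corrcoef(trial, cent_b)[0, 1]
    return 0 if ra > rb else 1

# Leave-one-out: the held-out trial never enters its own centroid.
correct = 0
for i in range(n_trials):
    cent_up = np.delete(X_up, i, axis=0).mean(axis=0)
    correct += classify(X_up[i], cent_up, X_down.mean(axis=0)) == 0
    cent_down = np.delete(X_down, i, axis=0).mean(axis=0)
    correct += classify(X_down[i], X_up.mean(axis=0), cent_down) == 1
accuracy = correct / (2 * n_trials)
```

Above-chance accuracy here, despite identical mean activation levels, mirrors how the study could detect FM-direction selectivity where a standard univariate contrast showed nothing.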
Affiliation(s)
- I-Hui Hsieh
- National Central University, Jhongli City, Taiwan.
14
Bouton S, Serniclaes W, Bertoncini J, Colé P. Perception of speech features by French-speaking children with cochlear implants. J Speech Lang Hear Res 2012; 55:139-153. [PMID: 22199195] [DOI: 10.1044/1092-4388(2011/10-0330)]
Abstract
PURPOSE The present study investigates the perception of phonological features in French-speaking children with cochlear implants (CIs) compared with normal-hearing (NH) children matched for listening age. METHOD Scores for discrimination and identification of minimal pairs were measured in each listener for all features defining consonants (e.g., place, voicing, manner, nasality) and vowels (e.g., frontness, nasality, aperture). RESULTS The results indicated no difference in "categorical perception," specified as a similar gap between discrimination and identification in CI children and controls. However, CI children demonstrated a lower level of "categorical precision," that is, lower accuracy in both feature identification and discrimination than NH children, with the magnitude of the deficit depending on the feature. CONCLUSIONS If sensitive periods of language development extended well beyond the moment of implantation, the consequences of hearing deprivation for the acquisition of categorical perception should be substantial in comparison to those for categorical precision, because categorical precision develops more slowly than categorical perception in NH children. These results do not support the idea that the sensitive period for the development of categorical perception is restricted to the first 1-2 years of life; it may be significantly longer. Differences in precision may instead reflect the acoustic limitations of the cochlear implant, such as coding of temporal fine structure and frequency resolution.
15
Banai K, Sabin AT, Wright BA. Separable developmental trajectories for the abilities to detect auditory amplitude and frequency modulation. Hear Res 2011; 280:219-27. [PMID: 21664958] [DOI: 10.1016/j.heares.2011.05.019]
Abstract
Amplitude modulation (AM) and frequency modulation (FM) are inherent components of most natural sounds. The ability to detect these modulations, considered critical for normal auditory and speech perception, improves over the course of development. However, the extent to which the development of AM and FM detection skills follow different trajectories, and therefore can be attributed to the maturation of separate processes, remains unclear. Here we explored the relationship between the developmental trajectories for the detection of sinusoidal AM and FM in a cross-sectional design employing children aged 8-10 and 11-12 years and adults. For FM of tonal carriers, both average performance (mean) and performance consistency (within-listener standard deviation) were adult-like in the 8-10 y/o. In contrast, in the same listeners, average performance for AM of wideband noise carriers was still not adult-like in the 11-12 y/o, though performance consistency was already mature in the 8-10 y/o. Among the children there were no significant correlations for either measure between the degrees of maturity for AM and FM detection. These differences in developmental trajectory between the two modulation cues and between average detection thresholds and performance consistency suggest that at least partially distinct processes may underlie the development of AM and FM detection as well as the abilities to detect modulation and to do so consistently.
Affiliation(s)
- Karen Banai, Department of Communication Sciences and Disorders, University of Haifa, Haifa 31905, Israel.
16. Drgas S, Blaszak MA. Perception of speech in reverberant conditions using AM-FM cochlear implant simulation. Hear Res 2010; 269:162-8. PMID: 20603206. DOI: 10.1016/j.heares.2010.06.016.
Abstract
This study assessed the effects of speech misidentification and cognitive processing errors in normal-hearing adults listening to degraded auditory input signals simulating cochlear implants under reverberation conditions. Three variables were controlled: number of vocoder channels (six and twelve), instantaneous frequency change rate (none, 50, 400 Hz), and enclosure (different reverberation conditions). The analyses were based on: (a) nonsense word recognition scores for eight young normal-hearing listeners, (b) 'ease of listening' based on response time, and (c) a subjective measure of difficulty. The maximum speech intelligibility score in cochlear implant simulation was 70% for non-reverberant conditions with a 12-channel vocoder and changes of instantaneous frequency limited to 400 Hz. In the presence of reflections, word misidentification was about 10-20 percentage points higher. There was little difference between the 50 and 400 Hz frequency modulation cut-offs for the 12-channel vocoder; with six channels, however, this difference was more pronounced. The results of the experiment suggest that information other than F0 carried by FM can be sufficient to improve speech intelligibility in real-world conditions.
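The kind of AM-FM vocoder simulation described above can be sketched as a per-channel analysis-resynthesis step: extract the Hilbert envelope and instantaneous frequency of one band, low-pass the frequency track to limit its change rate, and resynthesize. A minimal Python sketch follows; the filter orders, band edges, and cutoffs are illustrative assumptions, not the study's exact parameters.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def amfm_channel(x, fs, f_lo, f_hi, fm_cutoff=400.0):
    """Resynthesize one vocoder channel from its AM envelope and
    rate-limited FM track (illustrative sketch only)."""
    # Band-limit the input to this channel.
    sos = butter(4, [f_lo, f_hi], btype="bandpass", fs=fs, output="sos")
    band = sosfiltfilt(sos, x)
    analytic = hilbert(band)
    env = np.abs(analytic)                       # AM envelope
    fc = 0.5 * (f_lo + f_hi)                     # nominal center frequency
    # Instantaneous frequency in Hz from the unwrapped analytic phase.
    inst_f = np.gradient(np.unwrap(np.angle(analytic))) * fs / (2 * np.pi)
    fm = inst_f - fc                             # deviation around center
    if fm_cutoff is not None:                    # limit FM change rate
        lp = butter(2, fm_cutoff, btype="low", fs=fs, output="sos")
        fm = sosfiltfilt(lp, fm)
    phase = 2 * np.pi * np.cumsum(fc + fm) / fs
    return env * np.cos(phase)
```

Setting `fm_cutoff=None` reduces this to an envelope-only (AM) channel, mirroring the study's "none" condition.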
Affiliation(s)
- Szymon Drgas, Adam Mickiewicz University, Institute of Acoustics, Umultowska 85, Poznan, Poland.
17. Should spikes be treated with equal weightings in the generation of spectro-temporal receptive fields? J Physiol Paris 2009; 104:215-22. PMID: 19941954. DOI: 10.1016/j.jphysparis.2009.11.026.
Abstract
Knowledge of the trigger features of central auditory neurons is important for understanding speech processing. Spectro-temporal receptive fields (STRFs) obtained using random stimuli and spike-triggered averaging allow visualization of trigger features, which often appear blurry in the time-versus-frequency plot. For clearer visualization, we previously developed a dejittering algorithm to sharpen trigger features in the STRFs of FM-sensitive cells. Here we extended this algorithm to segregate spikes, based on their dejitter values, into two groups, normal and outlying, and to construct their STRFs separately. We found that while the STRF of the normal-jitter group resembled the full trigger feature in the original STRF, that of the outlying-jitter group resembled a different or partial trigger feature. The extended algorithm thus allowed the extraction of other, weaker trigger features. Given the presence of different trigger features in a given cell, we propose that evoked spikes should not be treated indiscriminately with equal weightings when generating STRFs.
18. Stohl JS, Throckmorton CS, Collins LM. Investigating the effects of stimulus duration and context on pitch perception by cochlear implant users. J Acoust Soc Am 2009; 126:318-26. PMID: 19603888. PMCID: PMC2723905. DOI: 10.1121/1.3133246.
Abstract
Cochlear implant sound processing strategies that use time-varying pulse rates to transmit fine structure information are one proposed method for improving the spectral representation of a sound, with the eventual goal of improving speech recognition in noisy conditions, speech recognition in tonal languages, and music identification and appreciation. However, many of the perceptual phenomena associated with time-varying rates are not well understood. In this study, the effects of stimulus duration on both the place-pitch and rate-pitch percepts were investigated via psychophysical experiments. Four Nucleus CI24 cochlear implant users participated in these experiments, which included a short-duration pitch ranking task and three adaptive pulse rate discrimination tasks. When duration was fixed from trial to trial and rate was varied adaptively, results suggested that both the place-pitch and rate-pitch percepts may be independent of duration for durations above 10 and 20 ms, respectively. When duration was varied and pulse rates were fixed, performance was highly variable within and across subjects. Implications for multi-rate sound processing strategies are discussed.
Affiliation(s)
- Joshua S Stohl, Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina 27708-0291, USA.
19. Chatterjee M, Peng SC. Processing F0 with cochlear implants: Modulation frequency discrimination and speech intonation recognition. Hear Res 2007; 235:143-56. PMID: 18093766. DOI: 10.1016/j.heares.2007.11.004.
Abstract
Fundamental frequency (F0) processing by cochlear implant (CI) listeners was measured using a psychophysical task and a speech intonation recognition task. Listeners' Weber fractions for modulation frequency discrimination were measured using an adaptive, three-interval, forced-choice paradigm; stimuli were presented through a custom research interface. In the speech intonation recognition task, listeners were asked to indicate whether resynthesized bisyllabic words, presented in the free field through the listeners' everyday speech processors, were question-like or statement-like. The resynthesized tokens were systematically manipulated to have different initial F0s, representing male vs. female voices, and different F0 contours (i.e., falling, flat, and rising). Although the CI listeners showed considerable variation in performance on both tasks, significant correlations were observed between their sensitivity to modulation frequency in the psychophysical task and their performance in intonation recognition. Consistent with their greater reliance on temporal cues, the CI listeners' performance in the intonation recognition task was significantly poorer with the higher initial-F0 stimuli than with the lower initial-F0 stimuli. Similar results were obtained with normal-hearing listeners attending to noiseband-vocoded CI simulations with reduced spectral resolution.
Affiliation(s)
- Monita Chatterjee, Department of Hearing and Speech Sciences, University of Maryland, College Park, MD 20742, USA.
20. Xu Y, Collins LM. Predictions of Psychophysical Measurements for Sinusoidal Amplitude Modulated (SAM) Pulse-Train Stimuli From a Stochastic Model. IEEE Trans Biomed Eng 2007; 54:1389-98. PMID: 17694859. DOI: 10.1109/tbme.2007.900800.
Abstract
Two approaches have been proposed to reduce the synchrony of the neural response to electrical stimuli in cochlear implants. One approach involves adding noise to the pulse-train stimulus, and the other is based on using a high-rate pulse-train carrier. Hypotheses regarding the efficacy of the two approaches can be tested using computational models of neural responsiveness prior to time-intensive psychophysical studies. In our previous work, we used such models to examine the effects of noise on several psychophysical measures important to speech recognition. However, to date there has been no parallel analytic solution investigating the neural response to high-rate pulse-train stimuli and their effect on psychophysical measures. This work investigates the properties of the neural response to high-rate pulse-train stimuli with amplitude modulated envelopes using a stochastic auditory nerve model. The statistics governing the neural response to each pulse are derived using a recursive method. The agreement between the theoretical predictions and model simulations is demonstrated for sinusoidal amplitude modulated (SAM) high-rate pulse-train stimuli. With our approach, predicting the neural response in modern implant devices becomes tractable. Psychophysical measurements are also predicted using the stochastic auditory nerve model for SAM high-rate pulse-train stimuli. Changes in dynamic range (DR) and intensity discrimination are compared with those observed for noise-modulated pulse-train stimuli. Modulation frequency discrimination is also studied as a function of stimulus level and pulse rate. Results suggest that high-rate carriers may positively impact such psychophysical measures.
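The stimulus class modeled above, a pulse train whose pulse amplitudes follow a sinusoidal envelope, is simple to generate. A minimal sketch is shown below; the single-sample pulse shape and the parameter values are illustrative assumptions (actual implant stimuli use charge-balanced biphasic pulses).

```python
import numpy as np

def sam_pulse_train(fs, dur, pulse_rate, mod_rate, depth, level=1.0):
    """Sinusoidally amplitude-modulated (SAM) pulse train: pulses at
    pulse_rate Hz whose amplitudes follow
    level * (1 + depth * sin(2*pi*mod_rate*t)).  Illustrative sketch."""
    n_pulses = int(round(dur * pulse_rate))
    t = np.arange(n_pulses) / pulse_rate            # pulse onset times (s)
    amps = level * (1.0 + depth * np.sin(2 * np.pi * mod_rate * t))
    x = np.zeros(int(round(fs * dur)))
    x[np.rint(t * fs).astype(int)] = amps           # one sample per pulse
    return x
```

A modulation depth below 1.0 keeps every pulse amplitude positive, so the carrier rate is preserved while the envelope conveys the modulation frequency.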
Affiliation(s)
- Yifang Xu, Department of Hearing, Speech, and Language Sciences, Gallaudet University, Washington, DC 20002, USA.
21. Luo X, Fu QJ. Frequency modulation detection with simultaneous amplitude modulation by cochlear implant users. J Acoust Soc Am 2007; 122:1046-54. PMID: 17672652. DOI: 10.1121/1.2751258.
Abstract
To better represent fine structure cues in cochlear implants (CIs), recent research has proposed varying the stimulation rate based on slowly varying frequency modulation (FM) information. The present study investigated the ability of CI users to detect FM in the presence of simultaneous amplitude modulation (AM). FM detection thresholds (FMDTs) for 10-Hz sinusoidal FM and upward frequency sweeps were measured as a function of standard frequency (75-1000 Hz). Three AM conditions were tested: (1) no AM, (2) 20-Hz sinusoidal AM (SAM) with modulation depths of 10%, 20%, or 30%, and (3) noise AM (NAM), in which the amplitude was randomly and uniformly varied over a range of 1, 2, or 3 dB relative to the reference amplitude. Results showed that FMDTs worsened with increasing standard frequency and were lower for sinusoidal FM than for upward frequency sweeps. Simultaneous AM significantly interfered with FM detection; FMDTs were significantly poorer with simultaneous NAM than with SAM. In addition, sinusoidal FMDTs significantly worsened when the starting phase of the simultaneous SAM was randomized. These results suggest that FM and AM in CIs partly share a common loudness-based coding mechanism, and that the feasibility of "FM+AM" strategies for CI speech processing may be limited.
Affiliation(s)
- Xin Luo, Department of Auditory Implants and Perception, House Ear Institute, Los Angeles, California 90057, USA.
22. Stickney GS, Assmann PF, Chang J, Zeng FG. Effects of cochlear implant processing and fundamental frequency on the intelligibility of competing sentences. J Acoust Soc Am 2007; 122:1069-78. PMID: 17672654. DOI: 10.1121/1.2750159.
Abstract
Speech perception in the presence of another competing voice is one of the most challenging tasks for cochlear implant users. Several studies have shown that (1) the fundamental frequency (F0) is a useful cue for segregating competing speech sounds and (2) the F0 is better represented by the temporal fine structure than by the temporal envelope. However, current cochlear implant speech processing algorithms emphasize temporal envelope information and discard the temporal fine structure. In this study, speech recognition was measured as a function of the F0 separation of the target and competing sentence in normal-hearing and cochlear implant listeners. For the normal-hearing listeners, the combined sentences were processed through either a standard implant simulation or a new algorithm which additionally extracts a slowed-down version of the temporal fine structure (called Frequency-Amplitude-Modulation-Encoding). The results showed no benefit of increasing F0 separation for the cochlear implant or simulation groups. In contrast, the new algorithm resulted in gradual improvements with increasing F0 separation, similar to that found with unprocessed sentences. These results emphasize the importance of temporal fine structure for speech perception and demonstrate a potential remedy for difficulty in the perceptual segregation of competing speech sounds.
23. Laback B, Majdak P, Baumgartner WD. Lateralization discrimination of interaural time delays in four-pulse sequences in electric and acoustic hearing. J Acoust Soc Am 2007; 121:2182-91. PMID: 17471732. DOI: 10.1121/1.2642280.
Abstract
This study examined the sensitivity of four cochlear implant (CI) listeners to interaural time difference (ITD) in different portions of four-pulse sequences in a lateralization discrimination task. ITD was present either in all the pulses (referred to as condition Wave), the two middle pulses (Ongoing), the first pulse (Onset), the last pulse (Offset), or both the first and last pulses (Gating). All ITD conditions were tested at different pulse rates (100, 200, 400, and 800 pulses per second, pps). Five normal-hearing (NH) subjects were also tested, listening to an acoustic simulation of CI stimulation. All CI and NH listeners were sensitive in condition Gating at all pulse rates for which they showed sensitivity in condition Wave. Sensitivity in condition Onset increased with pulse rate for three CI listeners as well as for all NH listeners. Performance in condition Ongoing varied across subjects: one CI listener showed sensitivity up to 800 pps, two up to 400 pps, and one at 100 pps only; the NH group showed sensitivity up to 200 pps. The finding that CI listeners detect ITD from the middle pulses of short trains indicates the relevance of the fine timing of stimulation pulses for lateralization, and therefore for CI stimulation strategies.
Affiliation(s)
- Bernhard Laback, Acoustics Research Institute, Austrian Academy of Sciences, Reichsratsstrasse 17, A-1010 Vienna, Austria.
24. Sit JJ, Simonson AM, Oxenham AJ, Faltys MA, Sarpeshkar R. A Low-Power Asynchronous Interleaved Sampling Algorithm for Cochlear Implants That Encodes Envelope and Phase Information. IEEE Trans Biomed Eng 2007; 54:138-49. PMID: 17260865. DOI: 10.1109/tbme.2006.883819.
Abstract
Cochlear implants currently fail to convey phase information, which is important for perceiving music, tonal languages, and for hearing in noisy environments. We propose a bio-inspired asynchronous interleaved sampling (AIS) algorithm that encodes both envelope and phase information, in a manner that may be suitable for delivery to cochlear implant users. Like standard continuous interleaved sampling (CIS) strategies, AIS naturally meets the interleaved-firing requirement, which is to stimulate only one electrode at a time, minimizing electrode interactions. The majority of interspike intervals are distributed over 1-4 ms, thus staying within the absolute refractory limit of neurons, and form a more natural, pseudostochastic pattern of firing due to complex channel interactions. Stronger channels are selected to fire more often but the strategy ensures that weaker channels are selected to fire in proportion to their signal strength as well. The resulting stimulation rates are considerably lower than those of most modern implants, saving power yet delivering higher potential performance. Correlations with original sounds were found to be significantly higher in AIS reconstructions than in signal reconstructions using only envelope information. Two perceptual tests on normal-hearing listeners verified that the reconstructed signals enabled better melody and speech recognition in noise than those processed using tone-excited envelope-vocoder simulations of cochlear implant processing. Thus, our strategy could potentially save power and improve hearing performance in cochlear implant users.
Affiliation(s)
- Ji-Jon Sit, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA.
25. Rogers CF, Healy EW, Montgomery AA. Sensitivity to isolated and concurrent intensity and fundamental frequency increments by cochlear implant users under natural listening conditions. J Acoust Soc Am 2006; 119:2276-87. PMID: 16642841. DOI: 10.1121/1.2167150.
Abstract
Sensitivity to acoustic cues in cochlear implant (CI) listening under natural conditions is a potentially complex interaction between a number of simultaneous factors, and may be difficult to predict. In the present study, sensitivity was measured under conditions that approximate those of natural listening. Synthesized words having increases in intensity or fundamental frequency (F0) in a middle stressed syllable were presented in soundfield to normal-hearing listeners and to CI listeners using their everyday speech processors and programming. In contrast to the extremely fine sensitivity to electrical current observed when direct stimulation of single electrodes is employed, difference limens (DLs) for intensity were larger for the CI listeners by a factor of 2.4. In accord with previous work, F0 DLs were larger by almost one order of magnitude. In a second experiment, it was found that the presence of concurrent intensity and F0 increments reduced the mean DL to half that of either cue alone for both groups of subjects, indicating that both groups combine concurrent cues with equal success. Although sensitivity to either cue in isolation was not related to word recognition in CI users, the listeners having lower combined-cue thresholds produced better word recognition scores.
Affiliation(s)
- Cheryl F Rogers, Department of Communication Sciences and Disorders, The Arnold School of Public Health, University of South Carolina, Columbia, South Carolina 28208, USA.
26. Stickney GS, Nie K, Zeng FG. Contribution of frequency modulation to speech recognition in noise. J Acoust Soc Am 2005; 118:2412-20. PMID: 16266163. DOI: 10.1121/1.2031967.
Abstract
Cochlear implants allow most patients with profound deafness to successfully communicate under optimal listening conditions. However, the amplitude modulation (AM) information provided by most implants is not sufficient for speech recognition in realistic settings where noise is typically present. This study added slowly varying frequency modulation (FM) to the existing algorithm of an implant simulation and used competing sentences to evaluate FM contributions to speech recognition in noise. Potential FM advantage was evaluated as a function of the number of spectral bands, FM depth, FM rate, and FM band distribution. Barring floor and ceiling effects, significant improvement was observed for all bands from 1 to 32 with the additional FM cue both in quiet and noise. Performance also improved with greater FM depth and rate, which might reflect resolved sidebands under the FM condition. Having FM present in low-frequency bands was more beneficial than in high-frequency bands, and only half of the bands required the presence of FM, regardless of position, to achieve performance similar to when all bands had the FM cue. These results provide insight into the relative contributions of AM and FM to speech communication and the potential advantage of incorporating FM for cochlear implant signal processing.
Affiliation(s)
- Ginger S Stickney, Department of Otolaryngology - Head and Neck Surgery, University of California, Irvine, 364 Medical Surgery II, Irvine, California 92697-1275, USA.
27. Chen H, Ishihara YC, Zeng FG. Pitch discrimination of patterned electric stimulation. J Acoust Soc Am 2005; 118:338-45. PMID: 16119354. DOI: 10.1121/1.1937228.
Abstract
One reason for the poor pitch performance of current cochlear-implant users may be the highly synchronized neural firing in electric hearing, which lacks the stochastic properties of neural firing in normal acoustic hearing. This study used three different electric stimulation patterns (jittered, probabilistic, and auditory-model-generated pulses) to mimic some aspects of the normal neural firing pattern in acoustic hearing. Pitch discrimination was measured at standard frequencies of 100, 250, 500, and 1000 Hz in three Nucleus-24 cochlear-implant users. To test the utility of the autocorrelation model of pitch perception in electric hearing, one, two, and four electrodes were stimulated independently with the same patterned electric stimulation. Results showed no improvement in performance with any experimental pattern compared to the fixed-rate control. Pitch discrimination was actually worse with the jittered pattern at low frequencies (125 and 250 Hz) than with the control, suggesting that externally introduced stochastic properties do not improve pitch perception in electric stimulation. Multiple-electrode stimulation neither improved nor degraded performance. The present results suggest that both "the right time and the right place" may be needed to restore normal pitch perception in cochlear-implant users.
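A jittered pattern of the kind tested above can be produced by perturbing each pulse time of a fixed-rate train. The sketch below uses uniform jitter expressed as a fraction of the nominal period; that distribution and all parameter values are assumptions for illustration, not the study's exact stimulus specification.

```python
import numpy as np

def jittered_pulse_train(fs, dur, rate, jitter_frac, seed=0):
    """Fixed-rate pulse train whose pulse times are perturbed by uniform
    jitter of +/- jitter_frac of the nominal period, one way to mimic
    stochastic neural firing.  Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    period = 1.0 / rate
    n_pulses = int(round(dur * rate))
    times = np.arange(n_pulses) * period            # nominal pulse times
    times = times + rng.uniform(-jitter_frac, jitter_frac, n_pulses) * period
    times = np.clip(times, 0.0, dur - 1.0 / fs)     # keep inside the buffer
    x = np.zeros(int(round(fs * dur)))
    x[np.rint(times * fs).astype(int)] = 1.0        # one sample per pulse
    return x
```

Keeping `jitter_frac` below 0.5 preserves the ordering of pulses, so the average rate stays fixed while the instantaneous intervals become stochastic.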
Affiliation(s)
- Hongbin Chen, Hearing and Speech Research Laboratory, Department of Anatomy and Neurobiology, University of California, Irvine, California 92697, USA.
28. Zeng FG, Nie K, Stickney GS, Kong YY, Vongphoe M, Bhargave A, Wei C, Cao K. Speech recognition with amplitude and frequency modulations. Proc Natl Acad Sci U S A 2005; 102:2293-8. PMID: 15677723. PMCID: PMC546014. DOI: 10.1073/pnas.0406460102.
Abstract
Amplitude modulation (AM) and frequency modulation (FM) are commonly used in communication, but their relative contributions to speech recognition have not been fully explored. To bridge this gap, we derived slowly varying AM and FM from speech sounds and conducted listening tests using stimuli with different modulations in normal-hearing and cochlear-implant subjects. We found that although AM from a limited number of spectral bands may be sufficient for speech recognition in quiet, FM significantly enhances speech recognition in noise, as well as speaker and tone recognition. Additional speech reception threshold measures revealed that FM is particularly critical for speech recognition with a competing voice and is independent of spectral resolution and similarity. These results suggest that AM and FM provide independent yet complementary contributions to support robust speech recognition under realistic listening situations. Encoding FM may improve auditory scene analysis, cochlear-implant, and audiocoding performance.
Affiliation(s)
- Fan-Gang Zeng, Department of Anatomy and Neurobiology, University of California, Irvine, CA 92697, USA.
29. Nie K, Stickney G, Zeng FG. Encoding Frequency Modulation to Improve Cochlear Implant Performance in Noise. IEEE Trans Biomed Eng 2005; 52:64-73. PMID: 15651565. DOI: 10.1109/tbme.2004.839799.
Abstract
In contrast to traditional Fourier analysis, a signal can be decomposed into amplitude and frequency modulation components. The speech processing strategy in most modern cochlear implants extracts and encodes only amplitude modulation in a limited number of frequency bands. While amplitude modulation encoding has allowed cochlear implant users to achieve good speech recognition in quiet, their performance in noise is severely compromised. Here, we propose a novel speech processing strategy that encodes both amplitude and frequency modulations in order to improve cochlear implant performance in noise. By removing the center frequency from the subband signals and additionally limiting the frequency modulation's range and rate, the present strategy transforms the fast-varying temporal fine structure into a slowly varying frequency modulation signal. As a first step, we evaluated the potential contribution of additional frequency modulation to speech recognition in noise via acoustic simulations of the cochlear implant. We found that while amplitude modulation from a limited number of spectral bands is sufficient to support speech recognition in quiet, frequency modulation is needed to support speech recognition in noise. In particular, improvement by as much as 71 percentage points was observed for sentence recognition in the presence of a competing voice. The present result strongly suggests that frequency modulation be extracted and encoded to improve cochlear implant performance in realistic listening situations. We have proposed several implementation methods to stimulate further investigation.
Index Terms: Amplitude modulation, cochlear implant, fine structure, frequency modulation, signal processing, speech recognition, temporal envelope.
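The core transformation described above (subtract the subband center frequency from the instantaneous frequency, then restrict the deviation's rate and range) can be sketched as follows; the filter order, rate cutoff, and deviation bound here are assumptions for illustration, not the strategy's published parameters.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def slow_fm(inst_freq, fs, center_freq, rate_cutoff=400.0, max_dev=500.0):
    """Convert a fast-varying instantaneous-frequency track into a slowly
    varying FM signal: remove the subband center frequency, low-pass the
    deviation to limit its rate of change, then clip its range."""
    dev = np.asarray(inst_freq, dtype=float) - center_freq  # remove center
    sos = butter(2, rate_cutoff, btype="low", fs=fs, output="sos")
    dev = sosfiltfilt(sos, dev)                             # limit FM rate
    return np.clip(dev, -max_dev, max_dev)                  # limit FM range
```

The resulting slow deviation could then modulate a channel's carrier (or, in an implant, its stimulation rate) around the band's center frequency.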
Affiliation(s)
- Kaibao Nie, Departments of Otolaryngology-Head and Neck Surgery and Biomedical Engineering, University of California, Irvine, CA 92697, USA.