1
Pandey PR, Herrmann B. The Influence of Semantic Context on the Intelligibility Benefit From Speech Glimpses in Younger and Older Adults. J Speech Lang Hear Res. 2025;68:2499-2516. PMID: 40233803. DOI: 10.1044/2025_jslhr-24-00588.
Abstract
PURPOSE: Speech is often masked by background sound that fluctuates over time. Fluctuations in masker intensity can reveal glimpses of speech that support speech intelligibility, but older adults have frequently been shown to benefit less from speech glimpses than younger adults when listening to sentences. Recent work, however, suggests that older adults may leverage speech glimpses as much as, or more than, younger adults when listening to naturalistic stories, potentially because of the availability of semantic context in stories. The current study directly investigated whether semantic context helps older adults benefit from speech glimpses released by a fluctuating (modulated) masker more than younger adults.
METHOD: In two experiments, we reduced and extended the semantic information of sentence stimuli in modulated and unmodulated speech maskers for younger and older adults. Speech intelligibility was assessed.
RESULTS: Semantic context improved speech intelligibility in both younger and older adults. Both age groups also showed better speech intelligibility for a modulated than an unmodulated (stationary) masker, but the benefit from the speech glimpses was reduced in older compared to younger adults. Semantic context amplified the benefit gained from the speech glimpses, but there was no indication that this amplification led to a greater benefit in older adults; if anything, younger adults benefited more.
CONCLUSIONS: The current results suggest that the deficit in the masking-release benefit in older adults generalizes to situations in which extended speech context is available. That previous research found a greater benefit in older than younger adults during story listening may indicate that other factors, such as thematic knowledge, motivation, or cognition, amplify the benefit from speech glimpses under naturalistic listening conditions.
Affiliation(s)
- Priya R Pandey
- Rotman Research Institute, Baycrest Academy for Research and Education, Toronto, Ontario, Canada
- Department of Psychology, University of Toronto, Ontario, Canada
- Björn Herrmann
- Rotman Research Institute, Baycrest Academy for Research and Education, Toronto, Ontario, Canada
- Department of Psychology, University of Toronto, Ontario, Canada
2
Gao Z, Oxenham AJ. Adaptation to sentences and melodies when making judgments along a voice-nonvoice continuum. Atten Percept Psychophys. 2025;87:1022-1032. PMID: 40000570. DOI: 10.3758/s13414-025-03030-9.
Abstract
Adaptation to constant or repetitive sensory signals serves to improve detection of novel events in the environment and to encode incoming information more efficiently. Within the auditory modality, contrastive adaptation effects have been observed within a number of categories, including voice and musical instrument type. A recent study found contrastive perceptual shifts between voice and instrument categories following repetitive presentation of adaptors consisting of either vowels or instrument tones. The current study tested the generalizability of adaptation along a voice-instrument continuum, using more ecologically valid adaptors. Participants were presented with an adaptor followed by an ambiguous voice-instrument target, created by generating a 10-step morphed continuum between pairs of vowel and instrument sounds. Listeners' categorization of the target sounds was shifted contrastively by a spoken sentence or instrumental melody adaptor, regardless of whether the adaptor and the target shared the same speaker gender or similar pitch range (Experiment 1). However, no significant contrastive adaptation was observed when nonspeech vocalizations or nonpitched percussion sounds were used as the adaptors (Experiment 2). The results suggest that adaptation between voice and nonvoice categories does not rely on exact repetition of simple stimuli, nor does it solely reflect the result of a sound being categorized as being human or nonhuman sourced. The outcomes suggest future directions for determining the precise spectro-temporal properties of sounds that induce these voice-instrument contrastive adaptation effects.
Affiliation(s)
- Zi Gao
- Department of Psychology, University of Minnesota-Twin Cities, 75 E River Rd, Minneapolis, MN, 55455, USA.
- Andrew J Oxenham
- Department of Psychology, University of Minnesota-Twin Cities, 75 E River Rd, Minneapolis, MN, 55455, USA
3
Borrie SA, Tetzloff KA, Barrett TS, Lansford KL. Increasing Motivation Increases Intelligibility Benefits of Perceptual Training in Dysarthria. Am J Speech Lang Pathol. 2025;34:85-96. PMID: 39504442. PMCID: PMC11745309. DOI: 10.1044/2024_ajslp-24-00196.
Abstract
PURPOSE: Perceptual training offers a promising, listener-targeted option for improving intelligibility of dysarthric speech. Cognitive resources are required for learning, and theoretical models of listening effort and engagement account for a role of listener motivation in allocation of such resources. Here, we manipulate training instructions to enhance motivation to test the hypothesis that increased motivation increases the intelligibility benefits of perceptual training.
METHOD: Across two data collection sites, which differed with respect to many elements of study design including age of speaker with dysarthria, dysarthria type and severity, type of testing and training stimuli, and participant compensation, 84 neurotypical adults were randomly assigned to one of two training instruction conditions: enhanced instructions or standard instructions. Intelligibility, quantified as percent words correct, was measured before and after training.
RESULTS: Listeners who received the enhanced instructions achieved greater intelligibility improvements from training relative to listeners who received the standard instructions. This result was robust across data collection sites and the many differences in methodology.
CONCLUSIONS: This study provides evidence for the role of motivation in improved understanding of dysarthric speech: increasing motivation increases allocation of cognitive resources to the learning process, resulting in improved mapping of the degraded speech signal. This provides empirical support for theoretical models of listening effort and engagement. Clinically, the results show that a simple addition to the training instructions can elevate learning outcomes.
Affiliation(s)
- Stephanie A. Borrie
- Department of Communicative Disorders and Deaf Education, Utah State University, Logan
- Katerina A. Tetzloff
- Department of Communicative Disorders and Deaf Education, Utah State University, Logan
- Tyson S. Barrett
- Department of Communicative Disorders and Deaf Education, Utah State University, Logan
- Kaitlin L. Lansford
- Department of Communication Science and Disorders, Florida State University, Tallahassee
4
Smith ML, Winn MB. Repairing Misperceptions of Words Early in a Sentence Is More Effortful Than Repairing Later Words, Especially for Listeners With Cochlear Implants. Trends Hear. 2025;29:23312165251320789. PMID: 39995109. PMCID: PMC11851752. DOI: 10.1177/23312165251320789.
Abstract
The process of repairing misperceptions has been identified as a contributor to effortful listening in people who use cochlear implants (CIs). The current study was designed to examine the relative cost of repairing misperceptions at earlier or later parts of a sentence that contained contextual information that could be used to infer words both predictively and retroactively. Misperceptions were enforced at specific times by replacing single words with noise. Changes in pupil dilation were analyzed to track differences in the timing and duration of effort, comparing listeners with typical hearing (TH) or with CIs. Increases in pupil dilation were time-locked to the moment of the missing word, with longer-lasting increases when the missing word was earlier in the sentence. Compared to listeners with TH, CI listeners showed elevated pupil dilation for longer periods of time after listening, suggesting a lingering effect of effort after sentence offset. When needing to mentally repair missing words, CI listeners also made more mistakes on words elsewhere in the sentence, even though these words were not masked. Changes in effort based on the position of the missing word were not evident in basic measures like peak pupil dilation and only emerged when the full time course was analyzed, suggesting the timing analysis adds new information to our understanding of listening effort. These results demonstrate that some mistakes are more costly than others and incur different levels of mental effort to resolve the mistake, underscoring the information lost when characterizing speech perception with simple measures like percent-correct scores.
Affiliation(s)
- Michael L. Smith
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, MN, USA
- Matthew B. Winn
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, MN, USA
5
Lee J, Oxenham AJ. Testing the role of temporal coherence on speech intelligibility with noise and single-talker maskers. J Acoust Soc Am. 2024;156:3285-3297. PMID: 39545746. PMCID: PMC11575144. DOI: 10.1121/10.0034420.
Abstract
Temporal coherence, where sounds with aligned timing patterns are perceived as a single source, is considered an essential cue in auditory scene analysis. However, its effects have been studied primarily with simple repeating tones, rather than speech. This study investigated the role of temporal coherence in speech by introducing across-frequency asynchronies. The effect of asynchrony on the intelligibility of target sentences was tested in the presence of background speech-shaped noise or a single-talker interferer. Our hypothesis was that disrupting temporal coherence should not only reduce intelligibility but also impair listeners' ability to segregate the target speech from an interfering talker, leading to greater degradation for speech-in-speech than speech-in-noise tasks. Stimuli were filtered into eight frequency bands, which were then desynchronized with delays of 0-120 ms. As expected, intelligibility declined as asynchrony increased. However, the decline was similar for both noise and single-talker maskers. Performance was affected primarily by asynchrony in the target, rather than in the masker, for both natural (forward) and reversed-speech maskers, and for target sentences with low and high semantic context. The results suggest that temporal coherence may not be as critical a cue for speech segregation as it is for the non-speech stimuli traditionally used in studies of auditory scene analysis.
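For readers who want to see the band-desynchronization manipulation concretely, the sketch below (not from the paper) splits a waveform into eight frequency bands and delays each band by a different amount before summing them back together. The Butterworth filter design, band edges, and delay assignments are illustrative assumptions rather than the authors' actual stimulus-processing parameters.

```python
# Hypothetical illustration of across-frequency desynchronization (not the authors' code).
import numpy as np
from scipy.signal import butter, sosfiltfilt

def desynchronize_bands(signal, fs, band_edges_hz, delays_ms):
    """Split `signal` into band-pass filtered bands, delay each band, and sum."""
    max_shift = int(round(max(delays_ms) / 1000.0 * fs))
    out = np.zeros(len(signal) + max_shift)
    for (lo, hi), delay_ms in zip(band_edges_hz, delays_ms):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")  # assumed filter design
        band = sosfiltfilt(sos, signal)             # zero-phase band-pass filtering
        shift = int(round(delay_ms / 1000.0 * fs))  # delay in samples for this band
        out[shift:shift + len(band)] += band
    return out

# Example: eight log-spaced bands between 100 Hz and 7 kHz, delays spanning 0-120 ms.
fs = 22050
edges = np.geomspace(100, 7000, 9)
bands = list(zip(edges[:-1], edges[1:]))
delays = np.linspace(0, 120, 8)
x = np.random.randn(fs)  # stand-in for a one-second speech waveform
y = desynchronize_bands(x, fs, bands, delays)
```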
Affiliation(s)
- Jaeeun Lee
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Andrew J Oxenham
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55455, USA
6
Borrie SA, Hepworth TJ, Wynn CJ, Hustad KC, Barrett TS, Lansford KL. Perceptual Learning of Dysarthria in Adolescence. J Speech Lang Hear Res. 2023;66:3791-3803. PMID: 37616225. PMCID: PMC10713018. DOI: 10.1044/2023_jslhr-23-00231.
Abstract
PURPOSE: As evidenced by perceptual learning studies involving adult listeners and speakers with dysarthria, adaptation to dysarthric speech is driven by signal predictability (speaker property) and a flexible speech perception system (listener property). Here, we extend adaptation investigations to adolescent populations and examine whether adult and adolescent listeners can learn to better understand an adolescent speaker with dysarthria.
METHOD: Classified by developmental stage, adult (n = 42) and adolescent (n = 40) listeners completed a three-phase perceptual learning protocol (pretest, familiarization, and posttest). During pretest and posttest, all listeners transcribed speech produced by a 13-year-old adolescent with spastic dysarthria associated with cerebral palsy. During familiarization, half of the adult and adolescent listeners engaged in structured familiarization (audio and lexical feedback) with the speech of the adolescent speaker with dysarthria; and the other half, with the speech of a neurotypical adolescent speaker (control).
RESULTS: Intelligibility scores increased from pretest to posttest for all listeners. However, listeners who received dysarthria familiarization achieved greater intelligibility improvements than those who received control familiarization. Furthermore, there was a significant effect of developmental stage, where the adults achieved greater intelligibility improvements relative to the adolescents.
CONCLUSIONS: This study provides the first tranche of evidence that adolescent dysarthric speech is learnable, a finding that holds even for adolescent listeners whose speech perception systems are not yet fully developed. Given the formative role that social interactions play during adolescence, these findings of improved intelligibility afford important clinical implications.
Affiliation(s)
- Stephanie A. Borrie
- Department of Communicative Disorders and Deaf Education, Utah State University, Logan
- Taylor J. Hepworth
- Department of Communicative Disorders and Deaf Education, Utah State University, Logan
- Camille J. Wynn
- Department of Communication Science and Disorders, University of Houston
- Katherine C. Hustad
- Waisman Center, University of Wisconsin–Madison
- Department of Communication Sciences and Disorders, University of Wisconsin–Madison
- Kaitlin L. Lansford
- Department of Communication Science and Disorders, Florida State University, Tallahassee
7
Wasiuk PA, Buss E, Oleson JJ, Calandruccio L. Predicting speech-in-speech recognition: Short-term audibility, talker sex, and listener factors. J Acoust Soc Am. 2022;152:3010. PMID: 36456289. DOI: 10.1121/10.0015228.
Abstract
Speech-in-speech recognition can be challenging, and listeners vary considerably in their ability to accomplish this complex auditory-cognitive task. Variability in performance can be related to intrinsic listener factors as well as stimulus factors associated with energetic and informational masking. The current experiments characterized the effects of short-term audibility of the target, differences in target and masker talker sex, and intrinsic listener variables on sentence recognition in two-talker speech and speech-shaped noise. Participants were young adults with normal hearing. Each condition included the adaptive measurement of speech reception thresholds, followed by testing at a fixed signal-to-noise ratio (SNR). Short-term audibility for each keyword was quantified using a computational glimpsing model for target+masker mixtures. Scores on a psychophysical task of auditory stream segregation predicted speech recognition, with stronger effects for speech-in-speech than speech-in-noise. Both speech-in-speech and speech-in-noise recognition depended on the proportion of audible glimpses available in the target+masker mixture, even across stimuli presented at the same global SNR. Short-term audibility requirements varied systematically across stimuli, providing an estimate of the greater informational masking for speech-in-speech than speech-in-noise recognition and quantifying informational masking for matched and mismatched talker sex.
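As a rough illustration of the glimpsing idea invoked above, the sketch below computes a simple glimpse proportion: the fraction of time-frequency cells in which the local target-to-masker ratio exceeds a criterion. The STFT front end and the 0 dB criterion are simplifying assumptions; the study's actual computational glimpsing model and its parameters may differ.

```python
# Hypothetical glimpse-proportion sketch (not the model used in the study).
import numpy as np
from scipy.signal import stft

def glimpse_proportion(target, masker, fs, criterion_db=0.0):
    """Fraction of time-frequency cells where target power exceeds masker power by `criterion_db`."""
    _, _, T = stft(target, fs=fs, nperseg=512)
    _, _, M = stft(masker, fs=fs, nperseg=512)
    eps = 1e-12  # guard against log of zero
    local_snr_db = 10.0 * np.log10((np.abs(T) ** 2 + eps) / (np.abs(M) ** 2 + eps))
    return float(np.mean(local_snr_db > criterion_db))

# Example with synthetic signals: a 500 Hz tone "target" and a white-noise "masker".
fs = 16000
t = np.arange(fs) / fs
target = 0.1 * np.sin(2 * np.pi * 500.0 * t)
masker = 0.05 * np.random.randn(fs)
print(glimpse_proportion(target, masker, fs))
```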
Affiliation(s)
- Peter A Wasiuk
- Department of Psychological Sciences, 11635 Euclid Avenue, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, 170 Manning Drive, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Jacob J Oleson
- Department of Biostatistics, 145 North Riverside Drive, University of Iowa, Iowa City, Iowa 52242, USA
- Lauren Calandruccio
- Department of Psychological Sciences, 11635 Euclid Avenue, Case Western Reserve University, Cleveland, Ohio 44106, USA
8
Cowan T, Paroby C, Leibold LJ, Buss E, Rodriguez B, Calandruccio L. Masked-Speech Recognition for Linguistically Diverse Populations: A Focused Review and Suggestions for the Future. J Speech Lang Hear Res. 2022;65:3195-3216. PMID: 35917458. PMCID: PMC9911100. DOI: 10.1044/2022_jslhr-22-00011.
Abstract
PURPOSE: Twenty years ago, von Hapsburg and Peña (2002) wrote a tutorial that reviewed the literature on speech audiometry and bilingualism and outlined valuable recommendations to increase the rigor of the evidence base. This review article returns to that seminal tutorial to reflect on how that advice was applied over the last 20 years and to provide updated recommendations for future inquiry.
METHOD: We conducted a focused review of the literature on masked-speech recognition for bilingual children and adults. First, we evaluated how studies published since 2002 described bilingual participants. Second, we reviewed the literature on native language masked-speech recognition. Third, we discussed theoretically motivated experimental work. Fourth, we outlined how recent research in bilingual speech recognition can be used to improve clinical practice.
RESULTS: Research conducted since 2002 commonly describes bilingual samples in terms of their language status, competency, and history. Bilingualism was not consistently associated with poor masked-speech recognition. For example, bilinguals who were exposed to English prior to age 7 years and who were dominant in English performed comparably to monolinguals for masked-sentence recognition tasks. To the best of our knowledge, there are no data to document the masked-speech recognition ability of these bilinguals in their other language compared to a second monolingual group, which is an important next step. Nonetheless, individual factors that commonly vary within bilingual populations were associated with masked-speech recognition and included language dominance, competency, and age of acquisition. We identified methodological issues in sampling strategies that could, in part, be responsible for inconsistent findings between studies. For instance, disparities in socioeconomic status (SES) between recruited bilingual and monolingual groups could cause confounding bias within the research design.
CONCLUSIONS: Dimensions of the bilingual linguistic profile should be considered in clinical practice to inform counseling and (re)habilitation strategies since susceptibility to masking is elevated in at least one language for most bilinguals. Future research should continue to report language status, competency, and history but should also report language stability and demand for use data. In addition, potential confounds (e.g., SES, educational attainment) when making group comparisons between monolinguals and bilinguals must be considered.
Affiliation(s)
- Tiana Cowan
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE
- Caroline Paroby
- Department of Psychological Sciences, Case Western Reserve University, Cleveland, OH
- Lori J. Leibold
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE
- Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, The University of North Carolina at Chapel Hill
- Barbara Rodriguez
- Department of Speech and Hearing Sciences, The University of New Mexico, Albuquerque
- Lauren Calandruccio
- Department of Psychological Sciences, Case Western Reserve University, Cleveland, OH
9
Pragt L, van Hengel P, Grob D, Wasmann JWA. Preliminary Evaluation of Automated Speech Recognition Apps for the Hearing Impaired and Deaf. Front Digit Health. 2022;4:806076. PMID: 35252959. PMCID: PMC8889114. DOI: 10.3389/fdgth.2022.806076.
Abstract
Objective: Automated speech recognition (ASR) systems have become increasingly sophisticated, accurate, and deployable on many digital devices, including smartphones. This pilot study aims to examine the speech recognition performance of ASR apps using audiological speech tests. In addition, we compare ASR speech recognition performance to that of normal-hearing and hearing-impaired listeners and evaluate whether standard clinical audiological tests are a meaningful and quick measure of the performance of ASR apps.
Methods: Four apps were tested on a smartphone: AVA, Earfy, Live Transcribe, and Speechy. The Dutch audiological speech tests performed were speech audiometry in quiet (Dutch CNC test), the Digits-in-Noise (DIN) test with steady-state speech-shaped noise, and sentences in quiet and in averaged long-term speech-shaped spectrum noise (Plomp test). For comparison, each app's ability to transcribe a spoken dialogue (Dutch and English) was tested.
Results: All apps scored at least 50% phonemes correct on the Dutch CNC test at a conversational speech intensity level (65 dB SPL) and achieved 90–100% phoneme recognition at higher intensity levels. On the DIN test, AVA and Live Transcribe had the lowest (best) signal-to-noise ratio of +8 dB. The lowest signal-to-noise ratio measured with the Plomp test was +8 to 9 dB, for Earfy (Android) and Live Transcribe (Android). Overall, the word error rate for the dialogue in English (19–34%) was lower (better) than for the Dutch dialogue (25–66%).
Conclusion: The performance of the apps was limited on audiological tests that provide little linguistic context or use low signal-to-noise ratios. For Dutch audiological speech tests in quiet, ASR apps performed similarly to a person with a moderate hearing loss. In noise, the ASR apps performed more poorly than most profoundly deaf people using a hearing aid or cochlear implant. Adding new performance metrics, including the semantic difference as a function of SNR and reverberation time, could help to monitor and further improve ASR performance.
Affiliation(s)
- Leontien Pragt
- Department of Otorhinolaryngology, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center Nijmegen, Nijmegen, Netherlands
- Correspondence: Leontien Pragt
- Peter van Hengel
- Department of Otorhinolaryngology, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center Nijmegen, Nijmegen, Netherlands
- Pento Audiological Center Twente, Hengelo, Netherlands
- Dagmar Grob
- Department of Medical Imaging, Radboud University Medical Center, Nijmegen, Netherlands
- Jan-Willem A. Wasmann
- Department of Otorhinolaryngology, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center Nijmegen, Nijmegen, Netherlands
10
Ratnanather JT, Wang LC, Bae SH, O'Neill ER, Sagi E, Tward DJ. Visualization of Speech Perception Analysis via Phoneme Alignment: A Pilot Study. Front Neurol. 2022;12:724800. PMID: 35087462. PMCID: PMC8787339. DOI: 10.3389/fneur.2021.724800.
Abstract
Objective: Speech tests assess the ability of people with hearing loss to comprehend speech with a hearing aid or cochlear implant. The tests are usually at the word or sentence level. However, few tests analyze errors at the phoneme level, so there is a need for an automated program to visualize, in real time, the accuracy of phonemes in these tests.
Method: The program reads in stimulus-response pairs and obtains their phonemic representations from an open-source digital pronouncing dictionary. The stimulus phonemes are aligned with the response phonemes via a modification of the Levenshtein minimum edit distance algorithm. Alignment is achieved via dynamic programming with costs for insertions, deletions, and substitutions modified on the basis of phonological features. The accuracy for each phoneme is based on the F1-score. Accuracy is visualized with respect to place and manner (consonants) or height (vowels). Confusion matrices for the phonemes are used in an information transfer analysis of ten phonological features. A histogram of the information transfer for the features over a frequency-like range is presented as a phonemegram.
Results: The program was applied to two datasets. One consisted of test data at the sentence and word levels. Stimulus-response sentence pairs from six volunteers with different degrees of hearing loss and modes of amplification were analyzed. Four volunteers listened to sentences from a mobile auditory training app, while two listened to sentences from a clinical speech test. Stimulus-response word pairs from three lists were also analyzed. The other dataset consisted of published stimulus-response pairs from experiments in which 31 participants with cochlear implants listened to 400 Basic English Lexicon sentences spoken by different talkers at four different SNR levels. In all cases, visualization was obtained in real time. Analysis of 12,400 actual and random pairs showed that the program was robust to the nature of the pairs.
Conclusion: It is possible to automate the alignment of phonemes extracted from stimulus-response pairs from speech tests in real time. The alignment then makes it possible to visualize the accuracy of responses via phonological features in two ways. Such visualization of phoneme alignment and accuracy could aid clinicians and scientists.
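To make the alignment step concrete, here is a minimal sketch of a Levenshtein-style dynamic program over phoneme sequences. The unit insertion/deletion costs and the all-or-nothing substitution cost are placeholders: the program described above weights these costs by phonological-feature similarity and obtains the phoneme strings from a pronouncing dictionary.

```python
# Hypothetical sketch of phoneme alignment by dynamic programming (placeholder costs).
def align_phonemes(stimulus, response, ins_cost=1.0, del_cost=1.0):
    """Align two phoneme sequences; return (total cost, aligned pairs)."""
    def sub_cost(a, b):
        # Placeholder: 0 for a match, 1 otherwise; the published program uses
        # phonological-feature-based costs instead.
        return 0.0 if a == b else 1.0

    n, m = len(stimulus), len(response)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]  # cumulative-cost table
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + del_cost
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + ins_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + del_cost,                                        # deletion
                d[i][j - 1] + ins_cost,                                        # insertion
                d[i - 1][j - 1] + sub_cost(stimulus[i - 1], response[j - 1]),  # match/substitution
            )

    # Trace back through the table to recover the alignment.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + sub_cost(stimulus[i - 1], response[j - 1]):
            pairs.append((stimulus[i - 1], response[j - 1])); i -= 1; j -= 1
        elif i > 0 and d[i][j] == d[i - 1][j] + del_cost:
            pairs.append((stimulus[i - 1], None)); i -= 1   # phoneme deleted in the response
        else:
            pairs.append((None, response[j - 1])); j -= 1   # phoneme inserted in the response
    return d[n][m], list(reversed(pairs))

# Example: stimulus "cat" /K AE T/ reported as "hat" /HH AE T/.
print(align_phonemes(["K", "AE", "T"], ["HH", "AE", "T"]))
```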
Affiliation(s)
- J Tilak Ratnanather
- Center for Imaging Science and Institute for Computational Medicine, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, United States
- Lydia C Wang
- Center for Imaging Science and Institute for Computational Medicine, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, United States
- Seung-Ho Bae
- Center for Imaging Science and Institute for Computational Medicine, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, United States
- Erin R O'Neill
- Center for Applied and Translational Sensory Sciences, University of Minnesota, Minneapolis, MN, United States
- Elad Sagi
- Department of Otolaryngology, New York University School of Medicine, New York, NY, United States
- Daniel J Tward
- Center for Imaging Science and Institute for Computational Medicine, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, United States
- Departments of Computational Medicine and Neurology, University of California, Los Angeles, Los Angeles, CA, United States
11
O'Neill ER, Parke MN, Kreft HA, Oxenham AJ. Role of semantic context and talker variability in speech perception of cochlear-implant users and normal-hearing listeners. J Acoust Soc Am. 2021;149:1224. PMID: 33639827. PMCID: PMC7895533. DOI: 10.1121/10.0003532.
Abstract
This study assessed the impact of semantic context and talker variability on speech perception by cochlear-implant (CI) users and compared their overall performance and between-subjects variance with that of normal-hearing (NH) listeners under vocoded conditions. Thirty post-lingually deafened adult CI users were tested, along with 30 age-matched and 30 younger NH listeners, on sentences with and without semantic context, presented in quiet and noise, spoken by four different talkers. Additional measures included working memory, non-verbal intelligence, and spectral-ripple detection and discrimination. Semantic context and between-talker differences influenced speech perception to similar degrees for both CI users and NH listeners. Between-subjects variance for speech perception was greatest in the CI group but remained substantial in both NH groups, despite the uniformly degraded stimuli in these two groups. Spectral-ripple detection and discrimination thresholds in CI users were significantly correlated with speech perception, but a single set of vocoder parameters for NH listeners was not able to capture average CI performance in both speech and spectral-ripple tasks. The lack of difference in the use of semantic context between CI users and NH listeners suggests no overall differences in listening strategy between the groups, when the stimuli are similarly degraded.
Affiliation(s)
- Erin R O'Neill
- Department of Psychology, University of Minnesota, Elliott Hall, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
- Morgan N Parke
- Department of Psychology, University of Minnesota, Elliott Hall, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
- Heather A Kreft
- Department of Psychology, University of Minnesota, Elliott Hall, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
- Andrew J Oxenham
- Department of Psychology, University of Minnesota, Elliott Hall, 75 East River Parkway, Minneapolis, Minnesota 55455, USA