1. Shen J, Wu J. Recognition of Speech With Dynamic Pitch Manipulation in Noise: Effects of Manipulation Methods. Journal of Speech, Language, and Hearing Research. 2024;67:269-281. PMID: 37983169; PMCID: PMC11000783; DOI: 10.1044/2023_jslhr-23-00142.
Abstract
PURPOSE Dynamic pitch, which is defined as the variation in fundamental frequency in speech, is one of the acoustic cues that affect speech recognition in noise. Building on the evidence that a symmetrical manipulation of dynamic pitch led to poorer speech recognition, the present study examined the effect of an asymmetrical manipulation method on speech recognition in noise by younger and older adults. METHOD Speech recognition accuracy in noise was measured in younger adults with normal hearing in Experiment 1, and speech reception thresholds (in dB SNR) were measured in older adults with normal hearing to mild-moderate hearing loss in Experiment 2. The dynamic pitch contours of the speech stimuli were manipulated using both symmetrical and asymmetrical methods. RESULTS Younger adults recognized speech better in noise with asymmetrical than symmetrical manipulation, and with weakened than strengthened dynamic pitch. A substantial amount of variability was observed in a group of older listeners. This variability was predominantly predicted by the listeners' age but not by their hearing thresholds or their ability to perceive dynamic pitch in fluctuating noise. CONCLUSIONS The asymmetrical manipulation of dynamic pitch had a less negative effect than the symmetrical manipulation. This effect also interacted with pitch-change direction. These findings suggest the influence of perceptual naturalness on speech recognition with signal modification. Directions for future research are also discussed.
Affiliation(s)
- Jing Shen
- Department of Communication Sciences and Disorders, College of Public Health, Temple University, Philadelphia, PA
- Jingwei Wu
- Department of Epidemiology and Biostatistics, College of Public Health, Temple University, Philadelphia, PA
2. Wasiuk PA, Calandruccio L, Oleson JJ, Buss E. Predicting speech-in-speech recognition: Short-term audibility and spatial separation. Journal of the Acoustical Society of America. 2023;154:1827-1837. PMID: 37728286; DOI: 10.1121/10.0021069.
Abstract
Quantifying the factors that predict variability in speech-in-speech recognition represents a fundamental challenge in auditory science. Stimulus factors associated with energetic and informational masking (IM) modulate variability in speech-in-speech recognition, but energetic effects can be difficult to estimate in spectro-temporally dynamic speech maskers. The current experiment characterized the effects of short-term audibility and differences in target and masker location (or perceived location) on the horizontal plane for sentence recognition in two-talker speech. Thirty young adults with normal hearing (NH) participated. Speech reception thresholds and keyword recognition at a fixed signal-to-noise ratio (SNR) were measured in each spatial condition. Short-term audibility for each keyword was quantified using a glimpsing model. Results revealed that speech-in-speech recognition depended on the proportion of audible glimpses available in the target + masker keyword stimulus in each spatial condition, even across stimuli presented at a fixed global SNR. Short-term audibility requirements were greater for colocated than spatially separated speech-in-speech recognition, and keyword recognition improved more rapidly as a function of increases in target audibility with spatial separation. Results indicate that spatial cues enhance glimpsing efficiency in competing speech for young adults with NH and provide a quantitative framework for estimating IM for speech-in-speech recognition in different spatial configurations.
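The "proportion of audible glimpses" metric central to this abstract can be illustrated with a toy broadband version: count the short-time frames in which the local target-to-masker ratio exceeds a criterion. The 20 ms window and 3 dB criterion below are illustrative assumptions, not the parameters of the study's glimpsing model, which operates on spectro-temporal regions rather than broadband frames.

```python
import numpy as np

def glimpse_proportion(target, masker, sr=16000, win_ms=20, criterion_db=3.0):
    """Proportion of short-time frames in which the target is 'glimpsed',
    i.e., the local target-to-masker ratio exceeds criterion_db.
    A toy, broadband approximation of a glimpsing metric."""
    n = int(sr * win_ms / 1000)
    m = min(len(target), len(masker)) // n
    t = target[: m * n].reshape(m, n)
    k = masker[: m * n].reshape(m, n)
    # per-frame power; small floor avoids division by zero / log(0)
    pt = np.mean(t ** 2, axis=1) + 1e-12
    pk = np.mean(k ** 2, axis=1) + 1e-12
    local_snr = 10 * np.log10(pt / pk)
    return float(np.mean(local_snr > criterion_db))
```

A target well above the masker yields a proportion near 1; swapping the roles drives it toward 0, which is the sense in which the metric tracks short-term audibility rather than global SNR.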
Affiliation(s)
- Peter A Wasiuk
- Department of Communication Disorders, 493 Fitch Street, Southern Connecticut State University, New Haven, Connecticut 06515, USA
- Lauren Calandruccio
- Department of Psychological Sciences, 11635 Euclid Avenue, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Jacob J Oleson
- Department of Biostatistics, 145 North Riverside Drive N300, College of Public Health, University of Iowa, Iowa City, Iowa 52242, USA
- Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, 170 Manning Drive, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
3. Mesiano PA, Zaar J, Bramsløw L, Relaño-Iborra H, Dau T. The Role of Average Fundamental Frequency Difference on the Intelligibility of Real-Life Competing Sentences. Journal of Speech, Language, and Hearing Research. 2023:1-14. PMID: 37390502; DOI: 10.1044/2023_jslhr-22-00219.
Abstract
PURPOSE The average fundamental frequency separation (∆fo) between two competing voices has been shown to provide an important cue for target-speech intelligibility. However, some of the previous investigations used speech materials with linguistic properties and fo characteristics that may not be typical of realistic acoustic scenarios. This study investigated to what extent the effect of ∆fo generalizes to more real-life speech. METHODS Real-life sentences and a well-controlled method for manipulating the acoustic stimuli were employed. Fifteen young normal-hearing native Danish listeners were tested in a two-competing-voices sentence recognition task at several target-to-masker ratios (TMRs) and ∆fos. RESULTS Compared to previous studies that addressed the same experimental scenario with less realistic speech materials, the present results showed only a moderate effect of ∆fo at negative TMRs and a negligible effect at positive TMRs. An analysis of the employed stimuli showed that a large ∆fo effect on the target speech intelligibility is only observed when the competing sentences have highly synchronous fo trajectories, which is typical of the artificial speech materials employed in previous studies. CONCLUSION Overall, the present results suggest a relatively small effect of ∆fo on the intelligibility of real-life speech, as compared to previously employed artificial speech, in two-competing-sentences conditions.
Affiliation(s)
- Paolo A Mesiano
- Department of Health Technology, Technical University of Denmark, Kongens Lyngby
- Johannes Zaar
- Department of Health Technology, Technical University of Denmark, Kongens Lyngby
- Eriksholm Research Centre, Helsingør, Denmark
- Helia Relaño-Iborra
- Department of Health Technology, Technical University of Denmark, Kongens Lyngby
- Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby
- Torsten Dau
- Department of Health Technology, Technical University of Denmark, Kongens Lyngby
4. Flaherty MM, Buss E, Libert K. Effects of Target and Masker Fundamental Frequency Contour Depth on School-Age Children's Speech Recognition in a Two-Talker Masker. Journal of Speech, Language, and Hearing Research. 2023;66:400-414. PMID: 36580582; DOI: 10.1044/2022_jslhr-22-00207.
Abstract
PURPOSE Maturation of the ability to recognize target speech in the presence of a two-talker speech masker extends into early adolescence. This study evaluated whether children benefit from differences in fundamental frequency (fo) contour depth between the target and masker speech, a cue that has been shown to improve recognition in adults. METHOD Speech stimuli were recorded from talkers using three speaking styles, with fo contour depths that were Flat, Normal, or Exaggerated. Targets were open-set, declarative sentences produced by a female talker, and maskers were two streams of concatenated sentences produced by a second female talker. Listeners were children (ages 5-17 years) and adults (ages 18-24 years) with normal hearing. Each listener was tested in one of the three masker styles paired with all three target styles. Speech recognition thresholds (SRTs) corresponding to 50% correct were estimated by fitting psychometric functions to adaptive track data. RESULTS For adults, performance did not differ significantly across conditions with matched speaking styles. A mismatch benefit was observed when combining Flat targets with the Exaggerated masker and Exaggerated targets with the Flat masker, and for both Flat and Exaggerated targets paired with the Normal masker. For children, there was a significant effect of age in all conditions. Flat targets in the Flat masker were associated with lower SRTs than the other two matched conditions, and a mismatch benefit was observed for young children only when the target fo contour was less variable than the masker fo contour. CONCLUSIONS Whereas child-directed speech often has exaggerated pitch contours, young children were better able to recognize speech with less variable fo. Age effects were observed in the benefit of mismatched speaking styles for some conditions, which could be related to differences in baseline SRTs rather than differences in segregation abilities.
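The SRT estimation step described in the METHOD section (fitting psychometric functions to adaptive track data and reading off the 50%-correct point) can be sketched with a generic maximum-likelihood logistic fit. The logistic form, grid-search optimizer, and parameter ranges below are illustrative assumptions, not the study's actual fitting procedure.

```python
import numpy as np

def fit_srt(snrs, n_correct, n_trials):
    """Maximum-likelihood fit of a logistic psychometric function
    p(snr) = 1 / (1 + exp(-slope * (snr - srt))),
    returning the SRT (the SNR giving 50% correct). A coarse grid
    search keeps the sketch free of external optimizers."""
    snrs = np.asarray(snrs, float)
    nc = np.asarray(n_correct, float)
    nt = np.asarray(n_trials, float)
    best_srt, best_ll = None, -np.inf
    for srt in np.arange(snrs.min() - 5, snrs.max() + 5, 0.05):
        for slope in np.arange(0.1, 3.0, 0.05):
            p = 1.0 / (1.0 + np.exp(-slope * (snrs - srt)))
            p = np.clip(p, 1e-9, 1 - 1e-9)  # guard the log-likelihood
            ll = np.sum(nc * np.log(p) + (nt - nc) * np.log(1 - p))
            if ll > best_ll:
                best_srt, best_ll = srt, ll
    return best_srt
```

Given trial counts and correct counts at each tested SNR, the fit recovers the SNR at which the function crosses 50%, which is how a single threshold number summarizes an adaptive track.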
Affiliation(s)
- Mary M Flaherty
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign
- Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, The University of North Carolina at Chapel Hill
- Kelsey Libert
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign
5. Wasiuk PA, Buss E, Oleson JJ, Calandruccio L. Predicting speech-in-speech recognition: Short-term audibility, talker sex, and listener factors. Journal of the Acoustical Society of America. 2022;152:3010. PMID: 36456289; DOI: 10.1121/10.0015228.
Abstract
Speech-in-speech recognition can be challenging, and listeners vary considerably in their ability to accomplish this complex auditory-cognitive task. Variability in performance can be related to intrinsic listener factors as well as stimulus factors associated with energetic and informational masking. The current experiments characterized the effects of short-term audibility of the target, differences in target and masker talker sex, and intrinsic listener variables on sentence recognition in two-talker speech and speech-shaped noise. Participants were young adults with normal hearing. Each condition included the adaptive measurement of speech reception thresholds, followed by testing at a fixed signal-to-noise ratio (SNR). Short-term audibility for each keyword was quantified using a computational glimpsing model for target+masker mixtures. Scores on a psychophysical task of auditory stream segregation predicted speech recognition, with stronger effects for speech-in-speech than speech-in-noise. Both speech-in-speech and speech-in-noise recognition depended on the proportion of audible glimpses available in the target+masker mixture, even across stimuli presented at the same global SNR. Short-term audibility requirements varied systematically across stimuli, providing an estimate of the greater informational masking for speech-in-speech than speech-in-noise recognition and quantifying informational masking for matched and mismatched talker sex.
Affiliation(s)
- Peter A Wasiuk
- Department of Psychological Sciences, 11635 Euclid Avenue, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, 170 Manning Drive, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Jacob J Oleson
- Department of Biostatistics, 145 North Riverside Drive, University of Iowa, Iowa City, Iowa 52242, USA
- Lauren Calandruccio
- Department of Psychological Sciences, 11635 Euclid Avenue, Case Western Reserve University, Cleveland, Ohio 44106, USA
6. Shen J, Fitzgerald LP, Kulick ER. Interactions between acoustic challenges and processing depth in speech perception as measured by task-evoked pupil response. Front Psychol. 2022;13:959638. PMID: 36389464; PMCID: PMC9641013; DOI: 10.3389/fpsyg.2022.959638.
Abstract
Speech perception under adverse conditions is a multistage process involving a dynamic interplay among acoustic, cognitive, and linguistic factors. Nevertheless, prior research has primarily focused on factors within this complex system in isolation. The primary goal of the present study was to examine the interaction between processing depth and the acoustic challenge of noise and its effect on processing effort during speech perception in noise. Two tasks were used to represent different depths of processing. The speech recognition task involved repeating back a sentence after auditory presentation (higher-level processing), while the tiredness judgment task entailed a subjective judgment of whether the speaker sounded tired (lower-level processing). The secondary goal of the study was to investigate whether pupil response to alteration of dynamic pitch cues stems from difficult linguistic processing of speech content in noise or a perceptual novelty effect due to the unnatural pitch contours. Task-evoked peak pupil response from two groups of younger adult participants with typical hearing was measured in two experiments. Both tasks (speech recognition and tiredness judgment) were implemented in both experiments, and stimuli were presented with background noise in Experiment 1 and without noise in Experiment 2. Increased peak pupil dilation was associated with deeper processing (i.e., the speech recognition task), particularly in the presence of background noise. Importantly, there was a non-additive interaction between noise and task, as demonstrated by the heightened peak pupil dilation to noise in the speech recognition task as compared to the tiredness judgment task. Additionally, peak pupil dilation data suggest that dynamic pitch alteration induced an increased perceptual novelty effect rather than reflecting effortful linguistic processing of the speech content in noise. These findings extend current theories of speech perception under adverse conditions by demonstrating that the level of processing effort expended by a listener is influenced by the interaction between acoustic challenges and depth of linguistic processing. The study also provides a foundation for future work to investigate the effects of this complex interaction in clinical populations who experience both hearing and cognitive challenges.
Affiliation(s)
- Jing Shen
- Department of Communication Sciences and Disorders, College of Public Health, Temple University, Philadelphia, PA, United States
- Laura P. Fitzgerald
- Department of Communication Sciences and Disorders, College of Public Health, Temple University, Philadelphia, PA, United States
- Erin R. Kulick
- Department of Epidemiology and Biostatistics, College of Public Health, Temple University, Philadelphia, PA, United States
7. Buss E, Miller MK, Leibold LJ. Maturation of Speech-in-Speech Recognition for Whispered and Voiced Speech. Journal of Speech, Language, and Hearing Research. 2022;65:3117-3128. PMID: 35868232; PMCID: PMC9911131; DOI: 10.1044/2022_jslhr-21-00620.
Abstract
PURPOSE Some speech recognition data suggest that children rely less on voice pitch and harmonicity to support auditory scene analysis than adults. Two experiments evaluated development of speech-in-speech recognition using voiced speech and whispered speech, which lacks the harmonic structure of voiced speech. METHOD Listeners were 5- to 7-year-olds and adults with normal hearing. Targets were monosyllabic words organized into three-word sets that differ in vowel content. Maskers were two-talker or one-talker streams of speech. Targets and maskers were recorded by different female talkers in both voiced and whispered speaking styles. For each masker, speech reception thresholds (SRTs) were measured in all four combinations of target and masker speech, including matched and mismatched speaking styles for the target and masker. RESULTS Children performed more poorly than adults overall. For the two-talker masker, this age effect was smaller for the whispered target and masker than for the other three conditions. Children's SRTs in this condition were predominantly positive, suggesting that they may have relied on a holistic listening strategy rather than segregating the target from the masker. For the one-talker masker, age effects were consistent across the four conditions. Reduced informational masking for the one-talker masker could be responsible for differences in age effects for the two maskers. A benefit of mismatching the target and masker speaking style was observed for both target styles in the two-talker masker and for the voiced targets in the one-talker masker. CONCLUSIONS These results provide no compelling evidence that young school-age children and adults are differentially sensitive to the cues present in voiced and whispered speech. Both groups benefit from mismatches in speaking style under some conditions. These benefits could be due to a combination of reduced perceptual similarity, harmonic cancellation, and differences in energetic masking.
Affiliation(s)
- Emily Buss
- Department of Otolaryngology-Head and Neck Surgery, University of North Carolina at Chapel Hill
- Margaret K. Miller
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE
- Lori J. Leibold
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE
8. Brown VA, Dillman-Hasso NH, Li Z, Ray L, Mamantov E, Van Engen KJ, Strand JF. Revisiting the target-masker linguistic similarity hypothesis. Atten Percept Psychophys. 2022;84:1772-1787. PMID: 35474415; PMCID: PMC10701341; DOI: 10.3758/s13414-022-02486-3.
Abstract
The linguistic similarity hypothesis states that it is more difficult to segregate target and masker speech when they are linguistically similar. For example, recognition of English target speech should be more impaired by the presence of Dutch masking speech than Mandarin masking speech because Dutch and English are more linguistically similar than Mandarin and English. Across four experiments, English target speech was consistently recognized more poorly when presented in English masking speech than in silence, speech-shaped noise, or an unintelligible masker (i.e., Dutch or Mandarin). However, we found no evidence for graded masking effects: Dutch did not impair performance more than Mandarin in any experiment, despite 650 participants being tested. This general pattern was consistent when using both a cross-modal paradigm (in which target speech was lipread and maskers were presented aurally; Experiments 1a and 1b) and an auditory-only paradigm (in which both the targets and maskers were presented aurally; Experiments 2a and 2b). These findings suggest that the linguistic similarity hypothesis should be refined to reflect the existing evidence: There is greater release from masking when the masker language differs from the target speech than when it is the same as the target speech. However, evidence that unintelligible maskers impair speech identification to a greater extent when they are more linguistically similar to the target language remains elusive.
Affiliation(s)
- Violet A Brown
- Department of Psychological and Brain Sciences, Washington University in St. Louis, One Brookings Drive, St. Louis, MO, 63130, USA.
- Naseem H Dillman-Hasso
- Carleton College, Department of Psychology, One North College St, Northfield, MN, 55057, USA
- ZhaoBin Li
- Carleton College, Department of Psychology, One North College St, Northfield, MN, 55057, USA
- Lucia Ray
- Carleton College, Department of Psychology, One North College St, Northfield, MN, 55057, USA
- Ellen Mamantov
- Carleton College, Department of Psychology, One North College St, Northfield, MN, 55057, USA
- Kristin J Van Engen
- Department of Psychological and Brain Sciences, Washington University in St. Louis, One Brookings Drive, St. Louis, MO, 63130, USA
- Julia F Strand
- Carleton College, Department of Psychology, One North College St, Northfield, MN, 55057, USA
- Carleton College, Department of Psychology, One North College St, Northfield, MN, 55057, USA
9. Analysis Model of Spoken English Evaluation Algorithm Based on Intelligent Algorithm of Internet of Things. Computational Intelligence and Neuroscience. 2022;2022:8469945. PMID: 35387241; PMCID: PMC8977287; DOI: 10.1155/2022/8469945.
Abstract
With the deepening integration of artificial intelligence into industry, speech recognition has become an important medium for human-computer interaction and has attracted extensive research attention in both industry and academia. However, existing high-accuracy speech recognition products rely on massive data platforms, which respond slowly and carry security risks; they therefore struggle to meet the needs of applications that require timely speech translation under unstable or insecure network conditions. Against this background, this paper develops an analysis model for evaluating spoken English based on an Internet of Things (IoT) intelligent algorithm applied to speech recognition. First, drawing on automated machine learning and lightweight learning strategies, a lightweight deep neural network for automatic speech recognition is proposed that is matched to the computing power available at the network edge. Second, the IoT intelligent classification algorithm and the big data analysis used in the system are evaluated quantitatively, using a set of oral English features to score spoken-English accuracy. Finally, experimental results show that the resulting oral English feature recognition system is reliable, highly intelligent, and robust to subjective factors, demonstrating the advantages of combining IoT intelligent classification algorithms with big data analysis for English feature recognition.
10.
Abstract
Identification of speech from a "target" talker was measured in a speech-on-speech masking task with two simultaneous "masker" talkers. The overall level of each talker was either fixed or randomized throughout each stimulus presentation to investigate the effectiveness of level as a cue for segregating competing talkers and attending to the target. Experimental manipulations included varying the level difference between talkers and imposing three types of target level uncertainty: 1) fixed target level across trials, 2) random target level across trials, or 3) random target levels on a word-by-word basis within a trial. When the target level was predictable, performance was better than in corresponding conditions in which the target level was uncertain. Masker confusions were consistent with a high degree of informational masking (IM). Furthermore, evidence was found for "tuning" in level and a level "release" from IM. These findings suggest that conforming to listener expectations about relative level, in addition to cues signaling talker identity, facilitates segregation of, and maintenance of attention on, a specific talker in multiple-talker communication situations.
Affiliation(s)
- Andrew J Byrne
- Department of Speech, Language, & Hearing Sciences, Boston University, MA, USA
- Christopher Conroy
- Department of Speech, Language, & Hearing Sciences, Boston University, MA, USA
- Gerald Kidd
- Department of Speech, Language, & Hearing Sciences, Boston University, MA, USA
- Department of Otolaryngology, Head-Neck Surgery, Medical University of South Carolina, Charleston, SC, USA
11. Shen J. Pupillary response to dynamic pitch alteration during speech perception in noise. JASA Express Letters. 2021;1:115202. PMID: 34778875; PMCID: PMC8574131; DOI: 10.1121/10.0007056.
Abstract
Dynamic pitch, also known as intonation, conveys both semantic and pragmatic meaning in speech communication. While alteration of this cue is detrimental to speech intelligibility in noise, the mechanism involved is poorly understood. Using the psychophysiological measure of task-evoked pupillary response, this study examined the perceptual effect of altered dynamic pitch cues on speech perception in noise. The data showed that pupil dilation increased with dynamic pitch strength in a sentence recognition in noise task. Taken together with recognition accuracy data, the results suggest the involvement of perceptual arousal in speech perception with dynamic pitch alteration.
Affiliation(s)
- Jing Shen
- Department of Communication Sciences and Disorders, Temple University, 1701 North 13th Street, Philadelphia, Pennsylvania 19122, USA
12. Buss E, Bosen A. Band importance for speech-in-speech recognition. JASA Express Letters. 2021;1:084402. PMID: 34661194; PMCID: PMC8499852; DOI: 10.1121/10.0005762.
Abstract
Predicting masked speech perception typically relies on estimates of the spectral distribution of cues supporting recognition. Current methods for estimating band importance for speech-in-noise use filtered stimuli. These methods are not appropriate for speech-in-speech because filtering can modify stimulus features affecting auditory stream segregation. Here, band importance is estimated by quantifying the relationship between speech recognition accuracy for full-spectrum speech and the target-to-masker ratio by channel at the output of an auditory filterbank. Preliminary results provide support for this approach and indicate that frequencies below 2 kHz may contribute more to speech recognition in two-talker speech than in speech-shaped noise.
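The band-level analysis described here, computing a target-to-masker ratio per channel at the output of an auditory filterbank, can be sketched with a simple FFT-based band split; the fixed band edges below stand in for an auditory filterbank and are an illustrative assumption, not the authors' implementation.

```python
import numpy as np

def band_tmr(target, masker, sr=16000, edges=(100, 500, 1000, 2000, 4000)):
    """Target-to-masker ratio (dB) in each frequency band, computed from
    FFT power spectra. Adjacent values in `edges` delimit the bands; an
    FFT band split stands in for an auditory filterbank here."""
    n = min(len(target), len(masker))
    freqs = np.fft.rfftfreq(n, 1.0 / sr)
    pt = np.abs(np.fft.rfft(target[:n])) ** 2
    pm = np.abs(np.fft.rfft(masker[:n])) ** 2
    tmr = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = (freqs >= lo) & (freqs < hi)
        # small floor avoids log(0) for bands with negligible energy
        tmr.append(10 * np.log10((pt[band].sum() + 1e-12) /
                                 (pm[band].sum() + 1e-12)))
    return np.array(tmr)
```

Relating such per-channel TMRs to recognition accuracy for full-spectrum speech is the core of the approach: the stimulus is never filtered, so stream-segregation cues stay intact while the analysis still resolves frequency.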
Affiliation(s)
- Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Adam Bosen
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska 68131, USA
13. Liu JS, Liu YW, Yu YF, Galvin JJ, Fu QJ, Tao DD. Segregation of competing speech in adults and children with normal hearing and in children with cochlear implants. Journal of the Acoustical Society of America. 2021;150:339. PMID: 34340485; DOI: 10.1121/10.0005597.
Abstract
Children with normal hearing (CNH) have greater difficulty segregating competing speech than do adults with normal hearing (ANH). Children with cochlear implants (CCI) have greater difficulty segregating competing speech than do CNH. In the present study, speech reception thresholds (SRTs) in competing speech were measured in Chinese Mandarin-speaking ANH, CNH, and CCIs. Target sentences were produced by a male Mandarin-speaking talker. Maskers were time-forward or -reversed sentences produced by a native Mandarin-speaking male (different from the target) or female or a non-native English-speaking male. The SRTs were lowest (best) for the ANH group, followed by the CNH and CCI groups. The masking release (MR) was comparable between the ANH and CNH group, but much poorer in the CCI group. The temporal properties differed between the native and non-native maskers and between forward and reversed speech. The temporal properties of the maskers were significantly associated with the SRTs for the CCI and CNH groups but not for the ANH group. Whereas the temporal properties of the maskers were significantly associated with the MR for all three groups, the association was stronger for the CCI and CNH groups than for the ANH group.
Affiliation(s)
- Ji-Sheng Liu
- Department of Ear, Nose, and Throat, The First Affiliated Hospital of Soochow University, Suzhou 215006, China
- Yang-Wenyi Liu
- Department of Otology and Skull Base Surgery, Eye Ear Nose and Throat Hospital, Fudan University, Shanghai 200031, China
- Ya-Feng Yu
- Department of Ear, Nose, and Throat, The First Affiliated Hospital of Soochow University, Suzhou 215006, China
- John J Galvin
- House Ear Institute, Los Angeles, California 90057, USA
- Qian-Jie Fu
- Department of Head and Neck Surgery, David Geffen School of Medicine, University of California Los Angeles (UCLA), Los Angeles, California 90095, USA
- Duo-Duo Tao
- Department of Ear, Nose, and Throat, The First Affiliated Hospital of Soochow University, Suzhou 215006, China
- Department of Ear, Nose, and Throat, The First Affiliated Hospital of Soochow University, Suzhou 215006, China
14. Jett B, Buss E, Best V, Oleson J, Calandruccio L. Does Sentence-Level Coarticulation Affect Speech Recognition in Noise or a Speech Masker? Journal of Speech, Language, and Hearing Research. 2021;64:1390-1403. PMID: 33784185; PMCID: PMC8608179; DOI: 10.1044/2021_jslhr-20-00450.
Abstract
Purpose Three experiments were conducted to better understand the role of between-word coarticulation in masked speech recognition. Specifically, we explored whether naturally coarticulated sentences supported better masked speech recognition as compared to sentences derived from individually spoken concatenated words. We hypothesized that sentence recognition thresholds (SRTs) would be similar for coarticulated and concatenated sentences in a noise masker but would be better for coarticulated sentences in a speech masker. Method Sixty young adults participated (n = 20 per experiment). An adaptive tracking procedure was used to estimate SRTs in the presence of noise or two-talker speech maskers. Targets in Experiments 1 and 2 were matrix-style sentences, while targets in Experiment 3 were semantically meaningful sentences. All experiments included coarticulated and concatenated targets; Experiments 2 and 3 included a third target type, concatenated keyword-intensity-matched (KIM) sentences, in which the words were concatenated but individually scaled to replicate the intensity contours of the coarticulated sentences. Results Regression analyses evaluated the main effects of target type, masker type, and their interaction. Across all three experiments, effects of target type were small (< 2 dB). In Experiment 1, SRTs were slightly poorer for coarticulated than concatenated sentences. In Experiment 2, coarticulation facilitated speech recognition compared to the concatenated KIM condition. When listeners had access to semantic context (Experiment 3), a coarticulation benefit was observed in noise but not in the speech masker. Conclusions Overall, differences between SRTs for sentences with and without between-word coarticulation were small. Beneficial effects of coarticulation were only observed relative to the concatenated KIM targets; for unscaled concatenated targets, it appeared that consistent audibility across the sentence offset any benefit of coarticulation. Contrary to our hypothesis, effects of coarticulation generally were not more pronounced in speech maskers than in noise maskers.
Affiliation(s)
- Brandi Jett
- Department of Psychological Sciences, Case Western Reserve University, Cleveland, OH
- Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill
- Virginia Best
- Department of Speech, Language and Hearing Sciences, Boston University, MA
- Jacob Oleson
- Department of Biostatistics, University of Iowa, Iowa City
- Lauren Calandruccio
- Department of Psychological Sciences, Case Western Reserve University, Cleveland, OH
15
Shen J. Older Listeners' Perception of Speech With Strengthened and Weakened Dynamic Pitch Cues in Background Noise. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:348-358. [PMID: 33439741 PMCID: PMC8632513 DOI: 10.1044/2020_jslhr-20-00116] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/15/2020] [Revised: 07/28/2020] [Accepted: 09/21/2020] [Indexed: 06/12/2023]
Abstract
Purpose Dynamic pitch, which is defined as the variation in fundamental frequency, is an acoustic cue that aids speech perception in noise. This study examined the effects of strengthened and weakened dynamic pitch cues on older listeners' speech perception in noise, as well as how these effects were modulated by individual factors including spectral perception ability. Method The experiment measured speech reception thresholds in noise in both younger listeners with normal hearing and older listeners whose hearing status ranged from near-normal hearing to mild-to-moderate sensorineural hearing loss. The pitch contours of the target speech were manipulated to create four levels of dynamic pitch strength: weakened, original, mildly strengthened, and strengthened. Listeners' spectral perception ability was measured using tests of spectral ripple and frequency modulation discrimination. Results Both younger and older listeners performed worse with manipulated dynamic pitch cues than with original dynamic pitch. The effects of dynamic pitch on older listeners' speech recognition were associated with their age but not with their perception of spectral information. Those older listeners who were relatively younger were more negatively affected by dynamic pitch manipulations. Conclusions The findings suggest that the current pitch manipulation strategy is detrimental to older listeners' perception of speech in noise, as compared to original dynamic pitch. While the influence of age on the effects of dynamic pitch is likely due to age-related declines in pitch perception, the spectral measures used in this study were not strong predictors of dynamic pitch effects. Taken together, these results indicate that next steps in this line of work should focus on how to manipulate acoustic cues in speech in order to improve speech perception in noise for older listeners.
Affiliation(s)
- Jing Shen
- Department of Speech, Language and Hearing Sciences, Western Michigan University, Kalamazoo
16
Flaherty MM, Buss E, Leibold LJ. Independent and Combined Effects of Fundamental Frequency and Vocal Tract Length Differences for School-Age Children's Sentence Recognition in a Two-Talker Masker. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:206-217. [PMID: 33375828 PMCID: PMC8610228 DOI: 10.1044/2020_jslhr-20-00327] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/08/2020] [Revised: 09/08/2020] [Accepted: 09/29/2020] [Indexed: 06/12/2023]
Abstract
Purpose The purpose of this study was to examine the independent and combined contributions of fundamental frequency (F0) and vocal tract length (VTL) differences on children's speech-in-speech recognition in the presence of a competing two-talker masker. Method Participants were 64 children (5-17 years old) and 25 adults (18-39 years old). Sentence recognition thresholds were measured in a two-talker masker. Target sentences had either the same mean F0 and VTL of the masker or were digitally altered so that the target and masker differed in F0 (Experiment 1), differed in VTL (Experiment 2), or differed in both F0 and VTL (Experiment 3). To determine the benefit, masking release was computed by subtracting thresholds in each shifted condition from the threshold in the unshifted condition. Results Results demonstrate that children's ability to benefit from either F0 or VTL differences (Experiments 1 and 2) depended on listener age, with younger children showing less improvement in speech reception thresholds compared to older children and adults. Age effects were also evident in the combined-cue conditions (Experiment 3), but children showed greater improvements compared to F0-only or VTL-only manipulations. Conclusions There was a prolonged pattern of development in children's ability to benefit from F0 or VTL differences between target and masker speech. Young children failed to capitalize on F0 and VTL differences to the same extent as older children and adults but did show a robust benefit when the cues were combined, supporting the hypothesis that younger children rely more heavily on redundant cues compared to older children and adults.
Affiliation(s)
- Mary M. Flaherty
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign
- Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, School of Medicine, University of North Carolina at Chapel Hill
- Lori J. Leibold
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE
17
Wasiuk PA, Lavandier M, Buss E, Oleson J, Calandruccio L. The effect of fundamental frequency contour similarity on multi-talker listening in older and younger adults. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 148:3527. [PMID: 33379934 PMCID: PMC7863686 DOI: 10.1121/10.0002661] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 05/04/2023]
Abstract
Older adults with hearing loss have greater difficulty recognizing target speech in multi-talker environments than young adults with normal hearing, especially when target and masker speech streams are perceptually similar. A difference in fundamental frequency (f0) contour depth is an effective stream segregation cue for young adults with normal hearing. This study examined whether older adults with varying degrees of sensorineural hearing loss are able to utilize differences in target/masker f0 contour depth to improve speech recognition in multi-talker listening. Speech recognition thresholds (SRTs) were measured for speech mixtures composed of target/masker streams with flat, normal, and exaggerated speaking styles, in which f0 contour depth systematically varied. Computational modeling estimated differences in energetic masking across listening conditions. Young adults had lower SRTs than older adults, a result that was partially explained by differences in audibility predicted by the model. However, audibility differences did not explain why young adults experienced a benefit from mismatched target/masker f0 contour depth, while in most conditions, older adults did not. A reduced ability to use segregation cues (differences in target/masker f0 contour depth) and deficits in grouping speech with variable f0 contours likely contribute to the difficulties experienced by older adults in challenging acoustic environments.
Affiliation(s)
- Peter A Wasiuk
- Department of Psychological Sciences, 11635 Euclid Avenue, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Mathieu Lavandier
- Univ. Lyon, ENTPE, Laboratoire Génie Civil et Bâtiment, Rue M. Audin, Vaulx-en-Velin Cedex, 69518, France
- Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina, CB#7070, Chapel Hill, North Carolina 27599, USA
- Jacob Oleson
- Department of Biostatistics, N300 CPHB, University of Iowa, 145 North Riverside Drive, Iowa City, Iowa 52242-2007, USA
- Lauren Calandruccio
- Department of Psychological Sciences, 11635 Euclid Avenue, Case Western Reserve University, Cleveland, Ohio 44106, USA
18
Bonino AY, Malley AR. Measuring open-set, word recognition in school-aged children: Corpus of monosyllabic target words and speech maskers. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 146:EL393. [PMID: 31671998 PMCID: PMC6910017 DOI: 10.1121/1.5130192] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/17/2019] [Revised: 09/30/2019] [Accepted: 09/30/2019] [Indexed: 06/10/2023]
Abstract
A corpus of stimuli has been collected to support the use of common materials across research laboratories to examine school-aged children's word recognition in speech maskers. The corpus includes (1) 773 monosyllabic words that are known to be in the lexicon of 5- and 6-year-olds and (2) seven masker passages that are based on a first-grade child's writing samples. Materials were recorded by a total of 13 talkers (8 women; 5 men). All talkers recorded two masker passages; 3 talkers (2 women; 1 man) also recorded the target words. The annotated corpus is freely available online for research purposes.
Affiliation(s)
- Angela Yarnell Bonino
- Department of Speech, Language, and Hearing Sciences, University of Colorado Boulder, Boulder, Colorado 80309
- Ashley R Malley
- Department of Speech, Language, and Hearing Sciences, University of Colorado Boulder, Boulder, Colorado 80309