1
|
Modified Multiple Stimulus With Hidden Reference and Anchors-Gabrielsson Total Impression Sound Quality Rating Comparisons for Speech in Quiet, Noise, and Reverberation. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2023; 66:3677-3688. [PMID: 37579731 DOI: 10.1044/2023_jslhr-22-00627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/16/2023]
Abstract
PURPOSE The purpose of the study was to obtain, analyze, and compare subjective sound quality data for the same test stimuli using modified multistimulus MUSHRA (Multiple Stimulus with Hidden Reference and Anchors) based procedures (viz., MUSHRA with custom anchors and MUSHRA without anchor) and the single-stimulus Gabrielsson's total impression rating procedure. METHOD Twenty normally hearing young adults were recruited for this study. Participants completed sound quality ratings on two different hearing aid recording data sets: Data Set A contained speech recordings from four different hearing aids under a variety of noisy and processing conditions, and Data Set B contained speech recordings from a single hearing aid under a combination of different noisy, reverberant, and signal processing conditions. Recordings in both data sets were rated for their quality using the total impression rating procedure. In addition, quality ratings of Data Set A recordings were obtained using a MUSHRA with custom anchors, while the ratings of Data Set B recordings were collected using a MUSHRA without anchor. RESULTS Statistical analyses revealed a high test-retest reliability of quality ratings for the same stimuli that were rated multiple times. In addition, high interrater reliability was observed with all three rating procedures. Further analyses indicated (a) a high correlation between the total impression rating and the two modified MUSHRA ratings and (b) a similar relationship between the average and standard deviation of the subjective rating data obtained by the total impression rating and MUSHRA with custom anchors on Data Set A, and the total impression rating and the MUSHRA without anchor on Data Set B. CONCLUSION Both sound quality procedures, namely, the MUSHRA-based procedures and the total impression rating scale, obtained similar quality ratings of varied hearing aid speech recordings with high reliability.
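The abstract reports a high correlation between total impression ratings and the modified MUSHRA ratings but does not say which statistic was used. As a minimal sketch, assuming a Pearson correlation over per-condition mean ratings, the comparison could look like the following. The rating values, the scale ranges in the comments, and the function name `pearson_r` are hypothetical illustrations, not data or code from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length rating lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-condition mean ratings (NOT taken from the study).
# Assumed scales: MUSHRA-style 0-100, total impression 0-10.
mushra_means = [82.0, 64.0, 47.0, 30.0, 15.0]
total_impression_means = [8.1, 6.6, 5.0, 3.2, 1.9]

r = pearson_r(mushra_means, total_impression_means)
print(f"Pearson r between procedures: {r:.3f}")
```

Because the correlation is invariant to linear rescaling, the two procedures can be compared directly even though their rating scales differ.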
|
2
|
Auditory-Perceptual and Pupillometric Evaluation of Vocal Roughness and Listening Effort in Tracheoesophageal Speech. J Voice 2023:S0892-1997(23)00149-2. [PMID: 37385902 DOI: 10.1016/j.jvoice.2023.04.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Revised: 04/24/2023] [Accepted: 04/25/2023] [Indexed: 07/01/2023]
Abstract
OBJECTIVES This study evaluated auditory-perceptual judgments of perceived vocal roughness (VR) and listening effort (LE) along with pupillometric responses to speech samples produced by tracheoesophageal (TE) talkers. METHODS Twenty normal-hearing, naive young adults (eight men and twelve women) served as listeners. Listeners were divided into two groups: (1) a with-anchor (WA) group (four men and six women) and (2) a no-anchor (NA) group (four men and six women). All were presented with speech samples produced by twenty TE talkers; listeners evaluated two auditory-perceptual dimensions, VR and LE, using visual analog scales. Anchors were provided to the WA group as an external referent for their ratings. In addition, during the auditory-perceptual task, each listener's pupil reactions were recorded, with peak pupil dilation (PPD) measures extracted as a physiologic indicator associated with the listening task. RESULTS High interrater reliability was obtained for both the WA and NA groups. High correlations also were observed between auditory-perceptual ratings of roughness and LE, and between PPD values and ratings of both dimensions for the WA group. The inclusion of an anchor during the auditory-perceptual task improved interrater reliability ratings, but it also imposed an increased demand on listeners. CONCLUSIONS Data obtained offer insights into the relationship between subjective indices of voice quality (i.e., auditory-perceptual evaluation) and physiologic responses (PPD) to the abnormal voice quality that characterizes TE talkers. Furthermore, these data provide information on the inclusion/exclusion of audio anchors and potential increases in listener demand in response to abnormal voice quality.
|
3
|
Verification of a Mobile Psychoacoustic Test System. Audiol Res 2021; 11:673-690. [PMID: 34940019 PMCID: PMC8698855 DOI: 10.3390/audiolres11040061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2021] [Revised: 11/25/2021] [Accepted: 12/08/2021] [Indexed: 11/20/2022] Open
Abstract
Many hearing difficulties can be explained as a loss of audibility, a problem easily detected and treated using standard audiological procedures. Yet, hearing can be much poorer (or more impaired) than audibility predicts because of deficits in the suprathreshold mechanisms that encode the rapidly changing, spectral, temporal, and binaural aspects of the sound. The ability to evaluate these mechanisms requires well-defined stimuli and strict adherence to rigorous psychometric principles. This project reports on the comparison between a laboratory-based and a mobile system's results for psychoacoustic assessment in adult listeners with normal hearing. A description of both systems employed is provided. Psychoacoustic tests include frequency discrimination, amplitude modulation detection, binaural encoding, and temporal gap detection. Results reported by the mobile system were not significantly different from those collected with the laboratory-based system for most of the tests and were consistent with those reported in the literature. The mobile system has the potential to be a feasible option for the assessment of suprathreshold auditory encoding abilities.
|
4
|
Evaluation of Acoustic Analyses of Voice in Nonoptimized Conditions. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:3991-3999. [PMID: 33186510 DOI: 10.1044/2020_jslhr-20-00212] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 06/11/2023]
Abstract
Objectives This study aimed to evaluate the fidelity and accuracy of a smartphone microphone and recording environment on acoustic measurements of voice. Method A prospective cohort proof-of-concept study. Two sets of prerecorded samples, (a) sustained vowels (/a/) and (b) Rainbow Passage sentence, were played for recording via the internal iPhone microphone and the Blue Yeti USB microphone in two recording environments: a sound-treated booth and a quiet office setting. Recordings were presented using a calibrated mannequin speaker with a fixed signal intensity (69 dBA) at a fixed distance (15 in.). Each set of recordings (iPhone-audio booth, Blue Yeti-audio booth, iPhone-office, and Blue Yeti-office) was time-windowed to ensure the same signal was evaluated for each condition. Acoustic measures of voice, including fundamental frequency (fo), jitter, shimmer, harmonic-to-noise ratio (HNR), and cepstral peak prominence (CPP), were generated using a widely used analysis program (Praat Version 6.0.50). The data gathered were compared using a repeated measures analysis of variance. Two separate data sets were used. The set of vowel samples included both pathologic (n = 10) and normal (n = 10), male (n = 5) and female (n = 15) speakers. The set of sentence stimuli ranged in perceived voice quality from normal to severely disordered with an equal number of male (n = 12) and female (n = 12) speakers evaluated. Results The vowel analyses indicated that jitter, shimmer, HNR, and CPP were significantly different based on microphone choice, and that shimmer, HNR, and CPP were significantly different based on the recording environment. Analysis of sentences revealed a statistically significant impact of recording environment and microphone type on HNR and CPP. While statistically significant, the differences across the experimental conditions for a subset of the acoustic measures (viz., jitter and CPP) fell within their respective normative ranges.
Conclusions Both microphone and recording setting resulted in significant differences across several acoustic measurements. However, a subset of the acoustic measures that were statistically significant across the recording conditions showed small overall differences that are unlikely to have clinical significance in interpretation. For these acoustic measures, the present data suggest that, although a sound-treated setting is ideal for voice sample collection, a smartphone microphone can capture acceptable recordings for acoustic signal analysis.
|
5
|
Fitting Frequency-Lowering Signal Processing Applying the American Academy of Audiology Pediatric Amplification Guideline: Updates and Protocols. J Am Acad Audiol 2020; 27:219-236. [DOI: 10.3766/jaaa.15059] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 11/17/2022]
Abstract
Background: Although guidelines for fitting hearing aids for children are well developed and have strong basis in evidence, specific protocols for fitting and verifying technologies can supplement such guidelines. One such technology is frequency-lowering signal processing. Children require access to a broad bandwidth of speech to detect and use all phonemes including female /s/. When access through conventional amplification is not possible, the use of frequency-lowering signal processing may be considered as a means to overcome limitations. Fitting and verification protocols are needed to better define candidacy determination and options for assessing and fine tuning frequency-lowering signal processing for individuals.
Purpose: This work aims to (1) describe a set of calibrated phonemes that can be used to characterize the variation in different brands of frequency-lowering processors in hearing aids and the verification with these signals and (2) determine whether verification with these signals is predictive of perceptual changes associated with changes in the strength of frequency-lowering signal processing. Finally, we aimed to develop a fitting protocol for use in pediatric clinical practice.
Study Sample: Study 1 used a sample of six hearing aids spanning four types of frequency lowering algorithms for an electroacoustic evaluation. Study 2 included 21 adults who had hearing loss (mean age 66 yr).
Data Collection and Analysis: Simulated fricatives were designed to mimic the level and frequency shape of female fricatives extracted from two sources of speech. These signals were used to verify the frequency-lowering effects of four distinct types of frequency-lowering signal processors available in commercial hearing aids, and verification measures were compared to extracted fricatives made in a reference system. In a second study, the simulated fricatives were used within a probe microphone measurement system to verify a wide range of frequency compression settings in a commercial hearing aid, and 27 adult listeners were tested at each setting. The relation between the hearing aid verification measures and the listener’s ability to detect and discriminate between fricatives was examined.
Results: Verification measures made with the simulated fricatives agreed to within 4 dB, on average, and tended to mimic the frequency response shape of fricatives presented in a running speech context. Some processors showed a greater aided response level for fricatives in running speech than fricatives presented in isolation. Results with listeners indicated that verified settings that provided a positive sensation level of /s/ and that maximized the frequency difference between /s/ and /ʃ/ tended to have the best performance.
Conclusions: Frequency-lowering signal processors have measurable effects on the high-frequency fricative content of speech, particularly female /s/. It is possible to measure these effects either with a simple strategy that presents an isolated simulated fricative and measures the aided frequency response or with a more complex system that extracts fricatives from running speech. For some processors, a more accurate result may be achieved with a running speech system. In listeners, the aided frequency location and sensation level of fricatives may be helpful in predicting whether a specific hearing aid fitting, with or without frequency-lowering, will support access to the fricatives of speech.
|
6
|
Fitting Noise Management Signal Processing Applying the American Academy of Audiology Pediatric Amplification Guideline: Verification Protocols. J Am Acad Audiol 2020; 27:237-251. [DOI: 10.3766/jaaa.15060] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/17/2022]
Abstract
Background: Although guidelines for fitting hearing aids for children are well developed and have strong basis in evidence, specific protocols for fitting and verifying some technologies are not always available. One such technology is noise management in children’s hearing aids. Children are frequently in high-level and/or noisy environments, and many options for noise management exist in modern hearing aids. Verification protocols are needed to define specific test signals and levels for use in clinical practice.
Purpose: This work aims to (1) describe the variation in different brands of noise reduction processors in hearing aids and the verification of these processors and (2) determine whether these differences are perceived by 13 children who have hearing loss. Finally, we aimed to develop a verification protocol for use in pediatric clinical practice.
Study Sample: A set of hearing aids was tested using both clinically available test systems and a reference system, so that the impacts of noise reduction signal processing in hearing aids could be characterized for speech in a variety of background noises. A second set of hearing aids was tested across a range of audiograms and across two clinical verification systems to characterize the variance in clinical verification measurements. Finally, a set of hearing aid recordings that varied by type of noise reduction was rated for sound quality by children with hearing loss.
Results: Significant variation across makes and models of hearing aids was observed in both the speed of noise reduction activation and the magnitude of noise reduction. Reference measures indicate that noise-only testing may overestimate noise reduction magnitude compared to speech-in-noise testing. Variation across clinical test signals was also observed, indicating that some test signals may be more successful than others for characterization of hearing aid noise reduction. Children provided different sound quality ratings across hearing aids, and for one hearing aid rated the sound quality as higher with the noise reduction system activated.
Conclusions: Implications for clinical verification systems may be that greater standardization and the use of speech-in-noise test signals may improve the quality and consistency of noise reduction verification across clinics. A clinical protocol for verification of noise management in children’s hearing aids is suggested.
|
7
|
Sound Quality Effects of an Adaptive Nonlinear Frequency Compression Processor with Normal-Hearing and Hearing-Impaired Listeners. J Am Acad Audiol 2020; 30:552-563. [DOI: 10.3766/jaaa.16179] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]
Abstract
Frequency lowering (FL) technology offers a means of improving audibility of high-frequency sounds. For some listeners, the benefit of such technology can be accompanied by a perceived degradation in sound quality, depending on the strength of the FL setting. The studies presented in this article investigate the effect of a new type of FL signal processing for hearing aids, adaptive nonlinear frequency compression (ANFC), on subjective speech quality. Listener ratings of sound quality were collected for speech stimuli processed with systematically varied fitting parameters. Study 1 included 40 normal-hearing (NH) adult and child listeners. Study 2 included 11 hearing-impaired (HI) adult and child listeners. HI listeners were fitted with laboratory-worn hearing aids for use during listening tasks. Speech quality ratings were assessed across test conditions consisting of various strengths of static nonlinear frequency compression (NFC) and ANFC speech. Test conditions included those that were fine-tuned on an individual basis per hearing aid fitting and conditions that were modified to intentionally alter the sound quality of the signal. Listeners rated speech quality using the MUlti Stimulus test with Hidden Reference and Anchor (MUSHRA) test paradigm. Ratings were analyzed for reliability and to compare results across conditions. Results show that interrater reliability is high for both studies, indicating that NH and HI listeners from both adult and child age groups can reliably complete the MUSHRA task. Results comparing sound quality ratings across experimental conditions suggest that both the NH and HI listener groups rate the stimuli intended to have poor sound quality (e.g., anchors and the strongest available parameter settings) as having below-average sound quality ratings. A different trend in the results is reported when considering the other experimental conditions across the listener groups in the studies.
Speech quality ratings measured with NH listeners improve as the strength of ANFC decreases, with a range of bad to good ratings reported, on average. Speech quality ratings measured with HI listeners are similar and above-average for many of the experimental stimuli, including those with fine-tuned NFC and ANFC parameters. Overall, HI listeners provide similar sound quality ratings when comparing static and adaptive forms of frequency compression, especially when considering the individualized parameter settings. These findings suggest that a range in settings may result in above-average sound quality for adults and children with hearing impairment. Furthermore, the fitter should fine-tune FL parameters for each individual listener, regardless of type of FL technology.
|
8
|
Using visual feedback to enhance intonation control with a variable pitch electrolarynx. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:1802. [PMID: 32237840 DOI: 10.1121/10.0000936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2019] [Accepted: 03/03/2020] [Indexed: 06/11/2023]
Abstract
This study evaluated the effectiveness of using visual feedback to facilitate pitch control by a speaker using a pressure sensitive onset controlled electrolarynx (EL). This proof-of-concept study was conducted with one healthy adult. The participant-speaker was provided with computer generated visual feedback over five sessions within a consecutive period of three weeks. Changes in force control accuracy were gathered and analyzed. An improvement in finger (thumb) force control accuracy from the first to the last training session was documented. The results of this study provide data toward the development of a clinical training protocol for the use of a pressure sensitive onset controlled EL by laryngectomized speakers. Further, these results highlight the importance of developing a relevant multimodality training protocol for the improvement of postlaryngectomy EL speech production.
|
9
|
Audiological outcome measures with the BONEBRIDGE transcutaneous bone conduction hearing implant: impact of noise, reverberation and signal processing features. Int J Audiol 2020; 59:556-565. [PMID: 32069128 DOI: 10.1080/14992027.2020.1728400] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 10/25/2022]
Abstract
Objective: To assess the performance of an active transcutaneous implantable-bone conduction device (TI-BCD), and to evaluate the benefit of device digital signal processing (DSP) features in challenging listening environments. Design: Participants were tested at 1- and 3-month post-activation of the TI-BCD. At each session, aided and unaided phoneme perception was assessed using the Ling-6 test. Speech reception thresholds (SRTs) and quality ratings of speech and music samples were collected in noisy and reverberant environments, with and without the DSP features. Self-assessment of the device performance was obtained using the Abbreviated Profile of Hearing Aid Benefit (APHAB) questionnaire. Study sample: Six adults with conductive or mixed hearing loss. Results: Average SRTs were 2.9 and 12.3 dB in low and high reverberation environments, respectively, which improved to -1.7 and 8.7 dB, respectively, with the DSP features. In addition, speech quality ratings improved by 23 points with the DSP features when averaged across all environmental conditions. Improvement scores on APHAB scales revealed a statistically significant aided benefit. Conclusions: Noise and reverberation significantly impacted speech recognition performance and perceived sound quality. DSP features (directional microphone processing and adaptive noise reduction) significantly enhanced subjects' performance in these challenging listening environments.
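The SRT figures in the Results can be turned into a per-environment DSP benefit with simple subtraction. The sketch below uses only the four average SRT values quoted in the abstract; the function name `srt_benefit` and the dictionary layout are illustrative assumptions. A lower SRT is better, so the benefit is the SRT without DSP minus the SRT with DSP.

```python
# Average SRTs (dB) as quoted in the abstract.
srt_without_dsp = {"low reverberation": 2.9, "high reverberation": 12.3}
srt_with_dsp = {"low reverberation": -1.7, "high reverberation": 8.7}


def srt_benefit(without_db, with_db):
    """Benefit in dB; positive means the DSP features lowered (improved) the SRT."""
    return round(without_db - with_db, 1)


benefits = {
    env: srt_benefit(srt_without_dsp[env], srt_with_dsp[env])
    for env in srt_without_dsp
}

for env, gain in benefits.items():
    print(f"{env}: {gain} dB improvement with DSP features")
```

Under this reading, the DSP features correspond to an average improvement of 4.6 dB in low reverberation and 3.6 dB in high reverberation.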
|
10
|
Effects of Bimodal and Bilateral Cochlear Implant Use on a Nonauditory Working Memory Task: Reading Span Tests Over 2 Years Following Cochlear Implantation. Am J Audiol 2019; 28:947-963. [PMID: 31829722 DOI: 10.1044/2019_aja-19-0030] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/09/2022] Open
Abstract
Purpose A growing body of evidence indicates that treatment of hearing loss by provision of hearing aids leads to improvements in auditory and visual working memory. The purpose of this study was to assess whether similar working memory benefits are observed following provision of cochlear implants (CIs). Method Fifteen adults with postlingually acquired severe bilateral sensorineural hearing loss completed the prospective longitudinal study. Participants were candidates for bilateral cochlear implantation with some aidable hearing in each ear. Implantation surgeries were carried out sequentially, approximately 1 year apart. Working memory was measured with the visual Reading Span Test (Daneman & Carpenter, 1980) at 5 time points: pre-operatively following a 6-month bilateral hearing aid trial, after 6 and 12 months of bimodal (CI plus contralateral hearing aid) listening experience following the 1st CI surgery and activation, and again after 6 and 12 months of bilateral CI listening experience following the 2nd CI surgery and activation. Results Compared to the preoperative baseline, CI listening experience yielded significant improvements in participants' ability to recall test words in the correct serial order after 12 months in the bimodal condition. Individual performance outcomes were variable, but almost all participants showed increases in task performance over the course of the study. Conclusions These results suggest that, similar to appropriate interventions with hearing aids, treatment of hearing loss with CIs can yield working memory benefits. A likely mechanism is the freeing of cognitive resources previously devoted to effortful listening.
|
11
|
Perceptual and Objective Assessment of Envelope Enhancement for Children With Auditory Processing Disorder. IEEE Trans Neural Syst Rehabil Eng 2019; 28:143-151. [PMID: 31804940 DOI: 10.1109/tnsre.2019.2957230] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
Abstract
This paper evaluated the performance of an envelope enhancement (EE) algorithm subjectively by children with auditory processing disorder (APD), and objectively through computational models. Speech intelligibility data were collected from children with APD, for unprocessed and envelope-enhanced speech in the presence of stationary and non-stationary background noise at different signal to noise ratios (SNRs), both with and without noise reduction (NR) algorithms as a front-end to the EE algorithm. Furthermore, intrusive and non-intrusive objective speech intelligibility metrics were derived to predict the perceptual impact of this EE algorithm. Subjective data for stationary noise conditions revealed that the combination of NR and EE algorithms significantly improved the speech intelligibility scores at poor SNRs. In contrast, the same combination was ineffective in improving speech intelligibility in non-stationary noise conditions. Taken together, subjective results suggest that exaggerating the envelope cues improves speech identification scores for children with APD. However, the benefit obtained varies depending upon the type and level of the background noise. Both intrusive and non-intrusive objective speech intelligibility estimators exhibited good correlation with the subjective data, with the intrusive metric demonstrating better generalization capabilities. Implications of these results for hearing aid applications for children with APD are discussed.
|
12
|
Objective and Subjective Speech Quality Assessment of Amplification Devices for Patients With Parkinson’s Disease. IEEE Trans Neural Syst Rehabil Eng 2019; 27:1226-1235. [DOI: 10.1109/tnsre.2019.2915172] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/08/2022]
|
13
|
An evaluation of the Sennheiser HDA 280-CL circumaural headphone for use in audiometric testing. Int J Audiol 2019; 58:427-433. [PMID: 30957582 DOI: 10.1080/14992027.2019.1594415] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/27/2022]
Abstract
Objective: Evaluation of the Sennheiser HDA 280-CL circumaural headphone for the determination of (1) equivalent threshold sound pressure levels (ETSPL) for 125-18,000 Hz; (2) real ear attenuation (250-8000 Hz); (3) insertion loss (63-18,000 Hz); (4) frequency response (125-18,000 Hz); (5) total harmonic distortion (THD) (125-10,000 Hz); and (6) linearity (11,200-18,000 Hz). Study Sample: Twenty-five normal hearing adults aged 18-25 participated in (1) and (2). Design: (1) Hearing thresholds were measured using the Sennheiser HDA 280-CL. Frequency specific ETSPL values were calculated in an artificial ear. (2) Sound field thresholds were measured with the ears open and covered with the headphone to obtain the real ear attenuation thresholds (REAT). These values were used to determine the maximum permissible ambient noise levels (MPANL). (3) A B&K HATS mannequin recorded the output levels of a broadband pink noise with the ears open and covered with the headphones. (4, 5) The frequency response, THD and linearity were measured in an artificial ear. Results: Values for ETSPL, REAT, MPANL, insertion loss, as well as measures of frequency response, THD and linearity are presented. Conclusions: The Sennheiser HDA 280-CL meets the requirements for audiometric testing and the values presented can be used for calibration.
|
14
|
Objective and Subjective Assessment of Amplified Parkinsonian Speech Quality. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2018; 2018:2084-2087. [PMID: 30440813 DOI: 10.1109/embc.2018.8512618] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/09/2023]
Abstract
Hypophonia is a common speech impairment associated with Parkinson's disease (PD). Voice amplifiers are typically used to increase voice loudness, but little is known about their impact on perceived speech quality. In this paper, speech recordings were obtained from 11 PD subjects with and without the use of seven different amplification devices, and in the absence or presence of background noise. The recorded speech samples were rated for their sound quality by 10 naive listeners. The same speech recordings were analyzed objectively, wherein linear prediction, mel-frequency cepstral coefficients (MFCCs), and gammatone cepstral coefficients (GFCCs) were extracted and mapped to predicted quality scores using linear regression and Support Vector Regression (SVR). Results showed that amplification devices differentially affect the perceived quality of PD speech, that objective and subjective quality scores correlated well, and that a reduced set of GFCC features mapped with SVR produced the best correlation with the subjective scores.
|
15
|
Electroacoustic assessment of wireless remote microphone systems. Audiol Res 2018; 8:204. [PMID: 29732045 PMCID: PMC5913655 DOI: 10.4081/audiores.2018.204] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 11/20/2017] [Accepted: 03/11/2018] [Indexed: 11/23/2022] Open
Abstract
Wireless remote microphones (RMs) transmit the desired acoustic signal to the hearing aid (HA) and facilitate enhanced listening in challenging environments. Fitting and verification of RMs, and benchmarking the relative performance of different RM devices in varied acoustic environments, are of significant interest to audiologists and RM developers. This paper investigates the application of instrumental speech intelligibility and quality metrics for characterizing the RM performance in two acoustic environments with varying amounts of background noise and reverberation. In both environments, two head and torso simulators (HATS) were placed 2 m apart, where one HATS served as the talker and the other served as the listener. Four RM systems were interfaced separately with a HA programmed to match the prescriptive targets for the N4 standard audiogram and placed on the listener HATS. The HA output in varied acoustic conditions was recorded and analyzed offline through computational models predicting speech intelligibility and quality. Results showed performance differences among the four RMs in the presence of noise and/or reverberation, with one RM exhibiting significantly better performance. Clinical implications and applications of these results are discussed.
|
16
|
Predicting the quality of enhanced wideband speech with a cochlear model. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 142:EL319. [PMID: 28964067 DOI: 10.1121/1.5003785] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/07/2023]
Abstract
Objective measures are commonly used in the development of speech coding algorithms as an adjunct to human subjective evaluation. Predictors of speech quality based on models of physiological or perceptual processing tend to perform better than measures based on simple acoustical properties. Here, a modeling method based on a detailed physiological model and a neurogram similarity measure is developed and optimized to predict the quality of an enhanced wideband speech dataset. A model capturing temporal modulations in neural activity up to 267 Hz was found to perform as well as or better than several existing objective quality measures.
|
17
|
Objective Quality and Intelligibility Prediction for Users of Assistive Listening Devices. IEEE SIGNAL PROCESSING MAGAZINE 2015; 32:114-124. [PMID: 26052190 PMCID: PMC4452133 DOI: 10.1109/msp.2014.2358871] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Grants] [Indexed: 05/31/2023]
Abstract
This article presents an overview of twelve existing objective speech quality and intelligibility prediction tools. Two classes of algorithms are presented, namely intrusive and non-intrusive, with the former requiring the use of a reference signal, while the latter does not. Investigated metrics include both those developed for normal hearing listeners, as well as those tailored particularly for hearing impaired (HI) listeners who are users of assistive listening devices (i.e., hearing aids, HAs, and cochlear implants, CIs). Representative examples of those optimized for HI listeners include the speech-to-reverberation modulation energy ratio, tailored to hearing aids (SRMR-HA) and to cochlear implants (SRMR-CI); the modulation spectrum area (ModA); the hearing aid speech quality (HASQI) and perception indices (HASPI); and the PErception MOdel - hearing impairment quality (PEMO-Q-HI). The objective metrics are tested on three subjectively-rated speech datasets covering reverberation-alone, noise-alone, and reverberation-plus-noise degradation conditions, as well as degradations resultant from nonlinear frequency compression and different speech enhancement strategies. The advantages and limitations of each measure are highlighted and recommendations are given for suggested uses of the different tools under specific environmental and processing conditions.
|
18
|
Predicting the perceived sound quality of frequency-compressed speech. PLoS One 2014; 9:e110260. [PMID: 25402456 PMCID: PMC4234248 DOI: 10.1371/journal.pone.0110260] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 03/05/2014] [Accepted: 09/18/2014] [Indexed: 11/18/2022]
Abstract
The performance of objective speech and audio quality measures for the prediction of the perceived quality of frequency-compressed speech in hearing aids is investigated in this paper. A number of existing quality measures have been applied to speech signals processed by a hearing aid, which compresses speech spectra along frequency in order to make information contained in higher frequencies audible for listeners with severe high-frequency hearing loss. Quality measures were compared with subjective ratings obtained from normal hearing and hearing impaired children and adults in an earlier study. High correlations were achieved with quality measures computed by quality models that are based on the auditory model of Dau et al., namely, the measure PSM, computed by the quality model PEMO-Q; the measure qc, computed by the quality model proposed by Hansen and Kollmeier; and the linear subcomponent of the HASQI. For the prediction of quality ratings by hearing impaired listeners, extensions of some models incorporating hearing loss were implemented and shown to achieve improved prediction accuracy. Results indicate that these objective quality measures can potentially serve as tools for assisting in initial setting of frequency compression parameters.
|
19
|
Abstract
Frequency lowering technologies offer an alternative amplification solution for severe to profound high frequency hearing losses. While frequency lowering technologies may improve audibility of high frequency sounds, the very nature of this processing can affect the perceived sound quality. This article reports the results from two studies that investigated the impact of a nonlinear frequency compression (NFC) algorithm on perceived sound quality. In the first study, the cutoff frequency and compression ratio parameters of the NFC algorithm were varied, and their effect on the speech quality was measured subjectively with 12 normal hearing adults, 12 normal hearing children, 13 hearing impaired adults, and 9 hearing impaired children. In the second study, 12 normal hearing and 8 hearing impaired adult listeners rated the quality of speech in quiet, speech in noise, and music after processing with a different set of NFC parameters. Results showed that the cutoff frequency parameter had more impact on sound quality ratings than the compression ratio, and that the hearing impaired adults were more tolerant to increased frequency compression than normal hearing adults. No statistically significant differences were found in the sound quality ratings of speech-in-noise and music stimuli processed through various NFC settings by hearing impaired listeners. These findings suggest that there may be an acceptable range of NFC settings for hearing impaired individuals where sound quality is not adversely affected. These results may assist an Audiologist in clinical NFC hearing aid fittings for achieving a balance between high frequency audibility and sound quality.
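The two NFC parameters named in this abstract, the cutoff frequency and the compression ratio, can be illustrated with a small sketch. The mapping below is an assumption: it uses a commonly described log-domain rule in which frequencies at or below the cutoff pass unchanged and frequencies above it are compressed toward the cutoff. The abstract does not state the rule used by the hearing aid in these studies, and the function name and parameter values are hypothetical.

```python
def nfc_map(f_in_hz: float, cutoff_hz: float, compression_ratio: float) -> float:
    """Map an input frequency to an output frequency under an assumed
    log-domain nonlinear frequency compression rule (illustrative only)."""
    if cutoff_hz <= 0 or f_in_hz < 0:
        raise ValueError("frequencies must be non-negative and cutoff positive")
    if compression_ratio < 1:
        raise ValueError("compression ratio must be >= 1")
    if f_in_hz <= cutoff_hz:
        # Below the cutoff the signal is left untouched.
        return f_in_hz
    # Above the cutoff, distances on a log-frequency axis shrink by 1/CR.
    p = 1.0 / compression_ratio
    return cutoff_hz ** (1.0 - p) * f_in_hz ** p


if __name__ == "__main__":
    # Hypothetical settings: 2 kHz cutoff, 2:1 compression ratio.
    for f in (1000.0, 2000.0, 4000.0, 8000.0):
        print(f, round(nfc_map(f, 2000.0, 2.0), 1))
```

Under this assumed rule, lowering the cutoff moves more of the speech spectrum into the compressed region, which is one plausible reading of why the cutoff parameter affected quality ratings more than the ratio did.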
|
20
|
On a reference-free speech quality estimator for hearing aids. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 133:EL412-8. [PMID: 23656102 DOI: 10.1121/1.4802186] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 05/25/2023]
Abstract
A reference-free speech quality measure is proposed and assessed for hearing aid applications. The proposed speech quality metric is validated with subjective ratings obtained from hearing impaired listeners under a number of noisy and reverberant conditions. In addition, a comparison is drawn between the proposed measure and a state-of-the-art electroacoustic measure that relies on a clean reference signal. The results showed that the reference-free measure had a lower correlation with the subjective ratings of hearing aid speech quality in comparison to the correlations achieved by the measure utilizing a reference signal. Nevertheless, advantages of the reference-free approach are discussed.
|
21
|
Evaluation of Speech Intelligibility and Sound Localization Abilities with Hearing Aids Using Binaural Wireless Technology. Audiol Res 2012; 3:e1. [PMID: 26557339 PMCID: PMC4627128 DOI: 10.4081/audiores.2013.e1] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 02/27/2012] [Revised: 10/15/2012] [Accepted: 11/19/2012] [Indexed: 11/23/2022]
Abstract
Wireless synchronization of the digital signal processing (DSP) features between two hearing aids in a bilateral hearing aid fitting is a fairly new technology. This technology is expected to preserve the differences in time and intensity between the two ears by co-ordinating the bilateral DSP features such as multichannel compression, noise reduction, and adaptive directionality. The purpose of this study was to evaluate the benefits of wireless communication as implemented in two commercially available hearing aids. More specifically, this study measured speech intelligibility and sound localization abilities of normal hearing and hearing impaired listeners using bilateral hearing aids with wireless synchronization of multichannel Wide Dynamic Range Compression (WDRC). Twenty subjects participated; 8 had normal hearing and 12 had bilaterally symmetrical sensorineural hearing loss. Each individual completed the Hearing in Noise Test (HINT) and a sound localization test with two types of stimuli. No specific benefit from wireless WDRC synchronization was observed for the HINT; however, hearing impaired listeners had better localization with the wireless synchronization. Binaural wireless technology in hearing aids may improve localization abilities although the possible effect appears to be small at the initial fitting. With adaptation, the hearing aids with synchronized signal processing may lead to an improvement in localization and speech intelligibility. Further research is required to demonstrate the effect of adaptation to the hearing aids with synchronized signal processing on different aspects of auditory performance.
|
22
|
Clinical approach to monitoring variability associated with adductor spasmodic dysphonia. J Otolaryngol Head Neck Surg 2011; 40:343-349. [PMID: 21777554] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
OBJECTIVES Adductor spasmodic dysphonia (ADSD) is a voice disorder characterized by considerable intra- and intersubject variability. Although objective, acoustic measures of voice may provide a metric for ADSD, such measures can be inefficient in documenting such characteristics. This project integrated a simple auditory-perceptual measure termed "laryngeal overpressure" (LO) with measures of acoustic variability. METHODS Ten adults diagnosed with ADSD were sequentially followed over a period of 3 to 6 months. Standard voice recordings were obtained at each point, and acoustic measures were gathered. Additionally, three experienced listeners then rated LO using a visual analogue scale, and acoustic variability was assessed relative to the measure of LO. RESULTS Listener ratings of LO did not differ across the three sentence stimuli and were highly correlated (r = .828 and .909 for naive and experienced listeners, respectively). A strong correlation was identified between the acoustic measure of harmonics to noise ratio and the all-voiced sentence stimuli (r = .710). CONCLUSION LO appears to provide an easy clinical method of documenting voice change over time in those with ADSD. Although additional methods of voice monitoring may be used, the use of LO may provide the opportunity for a standard and reliable approach to the clinical monitoring of voice variability in those presenting with ADSD.
|
23
|
Objective estimation of tracheoesophageal speech ratings using an auditory model. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2010; 127:1032-1041. [PMID: 20136224 DOI: 10.1121/1.3270396] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 05/28/2023]
Abstract
Total laryngectomy is often the treatment of choice for many individuals diagnosed with advanced laryngeal cancer. This procedure alters the normal voice production mechanism, and tracheoesophageal (TE) speech is one alternative method of voicing postlaryngectomy. TE speech is created when pulmonary air is passed through the upper esophagus to create a vibratory source that is then articulated into speech. TE speech is often characterized by abnormal voice quality. Acoustic analysis of TE speech has the potential of quantifying the voice quality and assisting the speech language pathologist in facilitating rehabilitation. Motivated in part by the recent advances in the telecommunication industry for speech quality estimation, this paper investigated the application of an auditory model in predicting the ratings of TE speech by normal hearing listeners. The Moore-Glasberg auditory model was employed to extract perceptually relevant features from the acoustic waveform, and these features were later combined to estimate the subjective ratings of TE speech. This approach was validated with a database of subjective ratings of speech samples recorded from 35 TE speakers. Results showed moderate correlations between the objective metrics and the subjective ratings, and these correlations were significantly better than those obtained with traditional methods used in telecommunication applications.
|
24
|
Abstract
This study evaluated prototype multichannel nonlinear frequency compression (NFC) signal processing on listeners with high-frequency hearing loss. This signal processor applies NFC above a cut-off frequency. The participants were hearing-impaired adults (13) and children (11) with sloping, high-frequency hearing loss. Multiple outcome measures were repeated using a modified withdrawal design. These included speech sound detection, speech recognition, and self-reported preference measures. Group level results provide evidence of significant improvement of consonant and plural recognition when NFC was enabled. Vowel recognition did not change significantly. Analysis of individual results allowed for exploration of individual factors contributing to benefit received from NFC processing. Findings suggest that NFC processing can improve high frequency speech detection and speech recognition ability for adult and child listeners. Variability in individual outcomes was related to factors such as degree and configuration of hearing loss, age of participant, and type of outcome measure.
|
25
|
Safety and efficacy analysis of sunitinib (S), bevacizumab (B), and M-Tor inhibitors in metastatic renal cell cancer (mRCC) patients (pts) with renal insufficiency (RI). J Clin Oncol 2009. [DOI: 10.1200/jco.2009.27.15_suppl.5108] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/20/2022]
Abstract
Background: S, T (temsirolimus) and E (everolimus) are primarily metabolized in the liver, while the metabolism of B is unclear. There are limited data on the clinical toxicity profile and efficacy of these agents in pts with RI. Methods: The primary objective was to assess the safety and efficacy of S, B, T and E in pts with RI. Medical records of pts with mRCC at Wayne State University, treated on S, B, T or E were reviewed. Pts with a calculated creatinine clearance (CrCl) of ≤ 60 mL/min [chronic kidney disease stage 3 or higher per K/DOQI guidelines by the National Kidney Foundation] were deemed to have RI. Data on safety and efficacy of the therapy were collected and analyzed with respect to renal function. Results: 19 of 51 (37%) pts had RI. Pts with RI had a higher median rise in blood pressure (BP) with S and B than pts with normal renal function. Patients with RI had an increased incidence of rash and higher dose interruption rates with m-TOR inhibitors. No major differences in toxicities including cardiac, thyroid, renal, lipid profile abnormalities or hyperglycemia were observed. Similar efficacy was seen in all groups. Conclusions: More than a third of pts with mRCC receiving targeted therapy have RI, hence highlighting the importance of evaluating tolerability of therapies in pts with RI. Therapy with S, B and T/E is well tolerated and efficacy appears to be maintained. Closer monitoring for hypertension is needed in pts receiving S and B. No significant financial relationships to disclose.
|
26
|
Reference-free automatic quality assessment of tracheoesophageal speech. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2009; 2009:6210-6213. [PMID: 19964897 DOI: 10.1109/iembs.2009.5334545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/28/2023]
Abstract
Evaluation of the quality of tracheoesophageal (TE) speech using machines instead of human experts can enhance the voice rehabilitation process for patients who have undergone total laryngectomy and voice restoration. Towards the goal of devising a reference-free TE speech quality estimation algorithm, we investigate the efficacy of speech signal features that are used in standard telephone-speech quality assessment algorithms, in conjunction with a recently introduced speech modulation spectrum measure. Tests performed on two TE speech databases demonstrate that the modulation spectral measure and a subset of features in the standard ITU-T P.563 algorithm estimate TE speech quality with better correlation (up to 0.9) than previously proposed features.
|
27
|
Prediction of the quality ratings of tracheoesophageal speech using adaptive time-frequency representations. ACTA ACUST UNITED AC 2008. [DOI: 10.1109/ccece.2008.4564836] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/06/2022]
|
28
|
Active Noise Reduction Audiometry: A Prospective Analysis of a New Approach to Noise Management in Audiometric Testing. Laryngoscope 2008; 118:104-9. [DOI: 10.1097/mlg.0b013e31815743ac] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Indexed: 11/25/2022]
|
29
|
Loudness pattern-based speech quality evaluation using Bayesian modeling and Markov chain Monte Carlo methods. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2007; 121:EL77-83. [PMID: 17348550 DOI: 10.1121/1.2430765] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/14/2023]
Abstract
This work presents a speech quality evaluation method which is based on Moore and Glasberg's loudness model and Bayesian modeling. In the proposed method, the differences between the loudness patterns of the original and processed speech signals are employed as the observed features for representing speech quality, a Bayesian learning model is exploited as the cognitive model which maps the features into quality scores, and Markov chain Monte Carlo methods are used for the Bayesian computation. The performance of the proposed method was demonstrated through comparisons with the state-of-the-art speech quality evaluation standard, ITU-T P.862, using seven ITU subjective quality databases.
|
30
|
Abstract
Acoustical measures of vocal function are routinely used in the assessments of disordered voice, and for monitoring the patient's progress over the course of voice therapy. Typically, acoustic measures are extracted from sustained vowel stimuli where short-term and long-term perturbations in fundamental frequency and intensity, and the level of "glottal noise" are used to characterize the vocal function. However, acoustic measures extracted from continuous speech samples may well be required for accurate prediction of abnormal voice quality that is relevant to the client's "real world" experience. In contrast with sustained vowel research, there is relatively sparse literature on the effectiveness of acoustic measures extracted from continuous speech samples. This is partially due to the challenge of segmenting the speech signal into voiced, unvoiced, and silence periods before features can be extracted for vocal function characterization. In this paper we propose a joint time-frequency approach for classifying pathological voices using continuous speech signals that obviates the need for such segmentation. The speech signals were decomposed using an adaptive time-frequency transform algorithm, and several features such as the octave max, octave mean, energy ratio, length ratio, and frequency ratio were extracted from the decomposition parameters and analyzed using statistical pattern classification techniques. Experiments with a database consisting of continuous speech samples from 51 normal and 161 pathological talkers yielded a classification accuracy of 93.4%.
|
31
|
Interaction of speech coders and atypical speech II: effects on speech quality. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2002; 45:689-699. [PMID: 12199399 DOI: 10.1044/1092-4388(2002/055)] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/23/2023]
Abstract
We investigated how standard speech coders, currently used in modern communication systems, affect the quality of the speech of persons who have common speech and voice disorders. Three standardized speech coders (GSM 6.10 RPE-LTP, FS1016 CELP, and FS1015 LPC) and two speech coders based on subband processing were evaluated for their performance. Coder effects were assessed by measuring the quality of speech samples both before and after processing by the speech coders. Speech quality was rated by 10 listeners with normal hearing on 28 different scales representing pitch and loudness changes, speech rate, laryngeal and resonatory dysfunction, and coder-induced distortions. Results showed that (a) nine scale items were consistently and reliably rated by the listeners; (b) all coders degraded speech quality on these nine scales, with the GSM and CELP coders providing the better quality speech; and (c) interactions between coders and individual voices did occur on several voice quality scales.
|
32
|
Interaction of speech coders and atypical speech I: effects on speech intelligibility. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2002; 45:482-493. [PMID: 12069001 DOI: 10.1044/1092-4388(2002/038)] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 05/23/2023]
Abstract
We investigated how standard speech coders, currently used in modern communication systems, affect the intelligibility of the speech of persons who have common speech and voice disorders. Three standardized speech coders (viz., GSM 6.10 [RPE-LTP], FS1016 [CELP], FS1015 [LPC]) and two speech coders based on subband processing were evaluated for their performance. Coder effects were assessed by measuring the intelligibility of vowels and consonants both before and after processing by the speech coders. Native English talkers who had normal hearing identified these speech sounds. Results confirmed that (a) all coders reduce the intelligibility of spoken language; (b) these effects occur in a consistent manner, with the GSM and CELP coders providing the least degradation relative to the original unprocessed speech; and (c) coders interact with individual voices so that speech is degraded differentially for different talkers.
|
33
|
Abstract
We evaluated acoustic voice characteristics of 18 male patients undergoing radiotherapy. The subjects were seen for voice assessment preradiotherapy and at 1 month, 3 months, 6 months, and 1 year following radiotherapy. A multidimensional voice analysis computer program (IVANS, Avaaz Innovations, 1998) was employed to evaluate measures of traditional frequency and amplitude perturbation as well as time-based and linear prediction (LP) modeled "noise" parameters of the acoustic output in conjunction with perceptual judgments of overall vocal quality. The results indicate deterioration of vocal function immediately following radiotherapy, with gradual and significant improvement in acoustic and perceptual features over 9 to 12 months following the radiation treatment. Measures of glottal noise demonstrated higher sensitivity than frequency-based measures of voice perturbation and showed more consistent, less variable changes in acoustical voice output from the preradiation to the 12 month postradiation periods. Future research evaluating vowel type and acoustic perturbation measures with a larger sample of subjects over a longer time period seems warranted.
|
34
|
Abstract
Acoustic measures provide an objective means to describe pathological voices and are a routine component of the clinical voice examination. Because the voice sample is obtained using a microphone, microphone characteristics have the potential to influence the values of parameters obtained from a voice sample. This project examined how the choice of microphone affects key voice parameters and investigated how one might compensate for such microphone effects through filtering or by including additional parameters in the decision process. A database of 53 normal voice samples and 100 pathological voice samples was used in four experiments conducted in an anechoic chamber using four different microphones. One omnidirectional microphone and three cardioid microphones were used in these experiments. The original voice samples were presented to each microphone through a speaker located in an anechoic chamber, and the output of each microphone sampled to computer disk. Each microphone modified the frequency spectrum of the voice signal; this, in turn, affected the values of the voice parameters obtained. These microphone effects reduced the accuracy with which acoustic measures of voice could be used to discriminate pathological from normal voices. Discrimination performance improved when the microphone output was filtered to compensate for microphone frequency response. Performance also improved when spectral moment coefficient parameters were added to the vocal function parameters already in use.
|
35
|
Acoustic discrimination of pathological voice: sustained vowels versus continuous speech. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2001; 44:327-339. [PMID: 11324655 DOI: 10.1044/1092-4388(2001/027)] [Citation(s) in RCA: 154] [Impact Index Per Article: 6.7] [Indexed: 05/23/2023]
Abstract
We investigated the ability of acoustic measures to discriminate between normal and pathological talkers. Two groups of measures were compared: (a) those extracted from sustained vowels and (b) those based on continuous speech samples. Nine acoustic measures, which include fundamental frequency and amplitude perturbation measures, long term average spectral measures, and glottal noise measures were extracted from both sustained vowel and continuous speech samples. Our experiments were performed on a published database of 53 normal talkers and 175 talkers with a pathological voice. The classification performance of the nine acoustic measures was quantified using linear discriminant analysis and receiver operating characteristic (ROC) curve analysis. When individual measures were considered in isolation, classification was more accurate for measures extracted from sustained vowels than for those based on continuous speech samples. Classification accuracy improved when combinations of acoustic parameters were considered. For such combinations of measures, classification results were comparable for measures extracted from continuous speech samples and for those based on sustained vowels.
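The ROC analysis mentioned in this abstract can be sketched for a single acoustic measure. The example below computes the area under the ROC curve as the probability that a randomly chosen pathological sample scores higher than a randomly chosen normal sample, with ties counted as one half. The score values are hypothetical and are not taken from the study, and the linear discriminant analysis step is not shown.

```python
from typing import Sequence


def roc_auc(scores_pathological: Sequence[float],
            scores_normal: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison
    (equivalent to the normalized Mann-Whitney U statistic)."""
    if not scores_pathological or not scores_normal:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in scores_pathological:
        for n in scores_normal:
            if p > n:
                wins += 1.0   # pathological sample ranked above normal
            elif p == n:
                wins += 0.5   # ties contribute half a win
    return wins / (len(scores_pathological) * len(scores_normal))


if __name__ == "__main__":
    # Hypothetical perturbation scores for illustration only.
    pathological = [1.9, 2.4, 0.8, 3.1]
    normal = [0.4, 0.7, 1.0]
    print(roc_auc(pathological, normal))
```

An AUC near 0.5 would indicate that the measure does not separate the groups, and an AUC near 1.0 would indicate near-perfect separation.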
|
36
|
Identification of pathological voices using glottal noise measures. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2000; 43:469-485. [PMID: 10757697 DOI: 10.1044/jslhr.4302.469] [Citation(s) in RCA: 62] [Impact Index Per Article: 2.6] [Indexed: 05/23/2023]
Abstract
We investigated the abilities of four fundamental frequency (F0)-dependent and two F0-independent measures to quantify vocal noise. Two of the F0-dependent measures were computed in the time domain, and two were computed using spectral information from the vowel. The F0-independent measures were based on the linear prediction (LP) modeling of vowel samples. Tests using a database of sustained vowel samples, collected from 53 normal and 175 pathological talkers, showed that measures based on the LP model were much superior to the other measures. A classification rate of 96.5% was achieved by a parameter that quantifies the spectral flatness of the unmodeled component of the vowel sample.
|
37
|
A comparison of high precision F0 extraction algorithms for sustained vowels. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 1999; 42:112-126. [PMID: 10025548 DOI: 10.1044/jslhr.4201.112] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.6] [Indexed: 05/23/2023]
Abstract
Perturbation analysis of sustained vowel waveforms is used routinely in the clinical evaluation of pathological voices and in monitoring patient progress during treatment. Accurate estimation of voice fundamental frequency (F0) is essential for accurate perturbation analysis. Several algorithms have been proposed for fundamental frequency extraction. To be appropriate for clinical use, a key consideration is that an F0 extraction algorithm be robust to such extraneous factors as the presence of noise and modulations in voice frequency and amplitude that are commonly associated with the voice pathologies under study. This work examines the performance of seven F0 algorithms, based on the average magnitude difference function (AMDF), the input autocorrelation function (AC), the autocorrelation function of the center-clipped signal (ACC), the autocorrelation function of the inverse filtered signal (IFAC), the signal cepstrum (CEP), the Harmonic Product Spectrum (HPS) of the signal, and the waveform matching function (WM), respectively. These algorithms were evaluated using sustained vowel samples collected from normal and pathological subjects. The effect of background noise and of frequency and amplitude modulations on these algorithms was also investigated, using synthetic vowel waveforms.
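Two of the seven approaches listed in this abstract, the input autocorrelation function (AC) and the average magnitude difference function (AMDF), can be sketched in a few lines. This is a simplified whole-signal illustration and not the high-precision implementation evaluated in the paper. The synthetic three-harmonic signal, the search range, and the 10% threshold used in the AMDF search are all assumptions made for this example.

```python
import math
from typing import List


def synth_vowel(f0_hz: float, fs_hz: float, n_samples: int) -> List[float]:
    """Crude periodic test signal: three harmonics of f0 (hypothetical vowel stand-in)."""
    return [sum(math.sin(2 * math.pi * k * f0_hz * n / fs_hz) / k for k in (1, 2, 3))
            for n in range(n_samples)]


def f0_autocorr(x: List[float], fs_hz: float,
                fmin_hz: float = 60.0, fmax_hz: float = 400.0) -> float:
    """Pick the lag with the largest (biased) autocorrelation in the search range."""
    lag_min, lag_max = int(fs_hz / fmax_hz), int(fs_hz / fmin_hz)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        r = sum(x[n] * x[n + lag] for n in range(len(x) - lag))
        if r > best_val:
            best_lag, best_val = lag, r
    return fs_hz / best_lag


def f0_amdf(x: List[float], fs_hz: float,
            fmin_hz: float = 60.0, fmax_hz: float = 400.0) -> float:
    """Pick the first deep valley of the average magnitude difference function."""
    lag_min, lag_max = int(fs_hz / fmax_hz), int(fs_hz / fmin_hz)
    lags = range(lag_min, lag_max + 1)
    d = [sum(abs(x[n] - x[n + lag]) for n in range(len(x) - lag)) / (len(x) - lag)
         for lag in lags]
    # Assumed heuristic: accept the first lag close to the global minimum,
    # then walk down to the bottom of that valley to avoid subharmonic picks.
    threshold = min(d) + 0.1 * (max(d) - min(d))
    i = next(i for i, val in enumerate(d) if val <= threshold)
    while i + 1 < len(d) and d[i + 1] < d[i]:
        i += 1
    return fs_hz / lags[i]


if __name__ == "__main__":
    fs = 8000.0
    signal = synth_vowel(125.0, fs, 1024)
    print(f0_autocorr(signal, fs), f0_amdf(signal, fs))
```

Both estimators quantize F0 to integer lags, so a clinical-grade version would need interpolation around the selected lag; that refinement is left out here.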
|
38
|
Convergence characteristics of two algorithms in non-linear stimulus artefact cancellation for electrically evoked potential enhancement. Med Biol Eng Comput 1998; 36:202-14. [PMID: 9684461 DOI: 10.1007/bf02510744] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.5] [Indexed: 02/08/2023]
Abstract
Somatosensory evoked potentials (SEPs) are a sub-class of evoked potentials (EPs) that are very useful in diagnosing various neuromuscular disorders and in spinal cord and peripheral-nerve monitoring. Most often, the measurements of these signals are contaminated by stimulus-evoked artefact. Conventional stimulus-artefact (SA) reduction schemes are primarily hardware-based and rely on some form of input blanking during the SA phase. This procedure can result in partial SEP loss if the tail of the SA interferes with the SEP. Adaptive filters offer an attractive solution to this problem by iteratively reducing the SA waveform while leaving the SEP intact. Owing to the inherent non-linearities in the SA generation system, non-linear adaptive filters (NAFs) are most suitable. SA reduction using NAFs based on truncated second-order Volterra expansion series is investigated. The focus is on the performance of two main adaptation algorithms, the least mean square (LMS) and recursive least squares (RLS) algorithms, in the context of non-linear adaptive filtering. A comparison between the convergence and performance characteristics of these two algorithms is made by processing both simulated and experimental SA data. It is found that, in high artefact-to-noise ratio (ANR) SA cancellation, owing to the large eigenvalue spreads, the RLS-based NAF is more efficient than the LMS-based NAF. However, in low-ANR scenarios, the RLS- and LMS-based NAFs exhibit similar convergence properties, and the computational simplicity of the LMS-based NAFs makes them the preferred option.
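The LMS-adapted, truncated second-order Volterra filter described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the artefact path, memory length, step size, and random reference input are hypothetical, no evoked potential or measurement noise is simulated, and the RLS variant compared in the paper is not shown.

```python
import random
from typing import List, Tuple


def volterra_features(u: List[float], n: int, memory: int) -> List[float]:
    """Linear taps plus all second-order products of the last `memory` inputs."""
    taps = [u[n - k] if n - k >= 0 else 0.0 for k in range(memory)]
    quad = [taps[i] * taps[j] for i in range(memory) for j in range(i, memory)]
    return taps + quad


def lms_volterra(u: List[float], d: List[float],
                 memory: int = 3, mu: float = 0.05) -> Tuple[List[float], List[float]]:
    """Adapt Volterra weights with LMS; the residual is the artefact-reduced output."""
    n_weights = memory + memory * (memory + 1) // 2
    w = [0.0] * n_weights
    residual = []
    for n in range(len(u)):
        phi = volterra_features(u, n, memory)
        y = sum(wi * pi for wi, pi in zip(w, phi))        # artefact estimate
        e = d[n] - y                                      # what remains after cancellation
        w = [wi + mu * e * pi for wi, pi in zip(w, phi)]  # LMS weight update
        residual.append(e)
    return w, residual


def demo(seed: int = 0, n_samples: int = 4000) -> Tuple[List[float], float, float]:
    rng = random.Random(seed)
    u = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]  # reference (stimulus) input
    # Hypothetical nonlinear artefact path: two linear taps and one squared term.
    d = [0.8 * u[n] - 0.3 * (u[n - 1] if n > 0 else 0.0) + 0.5 * u[n] ** 2
         for n in range(n_samples)]
    w, residual = lms_volterra(u, d)
    early = sum(e * e for e in residual[:200]) / 200
    late = sum(e * e for e in residual[-200:]) / 200
    return w, early, late


if __name__ == "__main__":
    weights, early_mse, late_mse = demo()
    print(early_mse, late_mse)
```

In this noise-free toy setting the residual power falls as the weights approach the assumed path coefficients; with a real recording the residual would instead contain the evoked potential plus whatever artefact the model cannot capture.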
|
39
|
Adaptive stimulus artifact reduction in noncortical somatosensory evoked potential studies. IEEE Trans Biomed Eng 1998; 45:165-79. [PMID: 9473840 DOI: 10.1109/10.661265] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.0] [Indexed: 02/06/2023]
Abstract
Somatosensory evoked potentials (SEP's) are an important class of bioelectric signals which contain clinically valuable information. The surface measurements of these potentials are often contaminated by a stimulus-evoked artifact. The stimulus artifact (SA), depending upon the stimulator and measurement system characteristics, may obscure some of the information carried by the SEP's. Conventional methods for SA reduction employ hardware-based circuits which attempt to eliminate the SA by blanking the input during the SA period. However, there is a danger of losing some of the important SEP information, especially if the stimulating and recording electrodes are close together. In this paper, we apply both linear and nonlinear adaptive filtering techniques to the problem of SA reduction. Nonlinear adaptive filters (NAF's) based on truncated second-order Volterra series expansion are discussed and their applicability to SA cancellation is explored through processing both simulated and in vivo SEP data. The performances of the NAF and the finite impulse response (FIR) linear adaptive filter (LAF) are compared by processing experimental SEP data collected from different recording sites. Due to the inherent nonlinearities in the generation of the SA, the NAF is shown to achieve significantly better SA cancellation compared to the LAF.
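The NAF-versus-LAF comparison described above can be illustrated with a small toy simulation. This is only a sketch under stated assumptions, not the authors' implementation: the function names (`volterra_features`, `lms_cancel`, `demo_sa_cancellation`), the synthetic artifact model with a quadratic term, the memory length, and the step size are all hypothetical choices made for the example.

```python
import random


def volterra_features(window, order=2):
    """Build the regressor: linear taps, plus unique products x[i]*x[j] (i <= j) if order >= 2."""
    feats = list(window)
    if order >= 2:
        for i in range(len(window)):
            for j in range(i, len(window)):
                feats.append(window[i] * window[j])
    return feats


def lms_cancel(reference, primary, memory=3, mu=0.05, order=2):
    """LMS adaptive canceller; returns the residual e[n] = primary[n] - estimated artifact."""
    weights = None
    residual = []
    for n in range(len(primary)):
        # Tap-delay line over the reference input (zeros before the record starts).
        window = [reference[n - k] if n - k >= 0 else 0.0 for k in range(memory)]
        u = volterra_features(window, order)
        if weights is None:
            weights = [0.0] * len(u)
        estimate = sum(w * x for w, x in zip(weights, u))
        e = primary[n] - estimate
        weights = [w + mu * e * x for w, x in zip(weights, u)]
        residual.append(e)
    return residual


def demo_sa_cancellation(n_samples=6000, seed=0):
    """Compare linear (order=1) and Volterra (order=2) cancellers on a synthetic artifact."""
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    # Assumed artifact path: two linear taps plus a quadratic distortion term.
    artifact = [
        0.8 * x[n] - 0.3 * (x[n - 1] if n else 0.0) + 0.5 * x[n] ** 2
        for n in range(n_samples)
    ]
    # Small Gaussian noise stands in for everything that is not artifact.
    primary = [a + rng.gauss(0.0, 0.01) for a in artifact]

    def tail_mse(res):
        tail = res[-1000:]
        return sum(e * e for e in tail) / len(tail)

    linear_mse = tail_mse(lms_cancel(x, primary, order=1))
    volterra_mse = tail_mse(lms_cancel(x, primary, order=2))
    return linear_mse, volterra_mse
```

Calling `demo_sa_cancellation()` returns the steady-state residual power for the linear and the second-order filter. In this toy setup the linear filter cannot model the quadratic term, so its residual stays well above that of the Volterra filter.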
|
40
|
Multireference adaptive noise cancellation applied to somatosensory evoked potentials. IEEE Trans Biomed Eng 1994; 41:792-800. [PMID: 7927401 DOI: 10.1109/10.310094] [Citation(s) in RCA: 23] [Impact Index Per Article: 0.8] [Indexed: 01/27/2023]
Abstract
Somatosensory evoked potentials (SEP's) contain information that is useful in diagnosing various physiological disorders. However, surface measurements of these potentials suffer from a very poor signal-to-noise ratio (SNR), resulting in imperceptible SEP waveforms. This factor motivates the employment of dedicated signal processing techniques to improve the quality of the waveform. The objective of this research work is to improve the SNR of the SEP by eliminating the predominant myoelectric interference. The strategy followed to achieve this goal is to process the SEP signal by MultiReference Adaptive Noise Cancellation (MRANC). A theoretical model for the MRANC is presented and its performance under the influence of various factors is investigated and compared with other signal processing techniques. The performance of the MRANC is then evaluated by processing simulated and in vivo SEP data. It is found that the MRANC gives a significant improvement in the SNR of the SEP.
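A generic multi-reference noise canceller of the kind named in this abstract can be sketched as follows. This is an illustrative LMS-based version under assumed conditions, not the model presented in the paper: the function names (`mranc_lms`, `demo_mranc`), the two synthetic reference channels, the tap count, and the step size are hypothetical.

```python
import math
import random


def mranc_lms(primary, references, taps=2, mu=0.01):
    """Adapt one FIR filter per reference channel jointly; return the cleaned output."""
    weights = [[0.0] * taps for _ in references]
    output = []
    for n in range(len(primary)):
        # Tap-delay window for each reference (zeros before the record starts).
        windows = [
            [ref[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
            for ref in references
        ]
        # Interference estimate is the sum of all per-channel filter outputs.
        estimate = sum(
            w * x for wv, xv in zip(weights, windows) for w, x in zip(wv, xv)
        )
        e = primary[n] - estimate  # canceller output, ideally the signal alone
        for wv, xv in zip(weights, windows):
            for k in range(taps):
                wv[k] += mu * e * xv[k]
        output.append(e)
    return output


def demo_mranc(n_samples=4000, seed=1):
    """Return interference power before and residual noise power after cancellation."""
    rng = random.Random(seed)
    r1 = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    r2 = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    # Weak sinusoid stands in for the evoked potential (assumption for the demo).
    signal = [0.1 * math.sin(2 * math.pi * n / 50) for n in range(n_samples)]
    interference = [
        0.7 * r1[n] + 0.2 * (r1[n - 1] if n else 0.0) - 0.5 * r2[n]
        for n in range(n_samples)
    ]
    primary = [s + v for s, v in zip(signal, interference)]
    cleaned = mranc_lms(primary, [r1, r2])

    tail = slice(n_samples - 1000, n_samples)
    noise_before = sum(v * v for v in interference[tail]) / 1000
    noise_after = sum(
        (c - s) ** 2 for c, s in zip(cleaned[tail], signal[tail])
    ) / 1000
    return noise_before, noise_after
```

In this sketch the references are assumed to be correlated with the interference but not with the signal of interest, which is the usual condition for adaptive noise cancellation to leave the signal largely intact.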
|