1
Evaluating Camera Mouse as a computer access system for augmentative and alternative communication in cerebral palsy: A case study. Assist Technol 2024; 36:217-223. [PMID: 37699111 PMCID: PMC10927611 DOI: 10.1080/10400435.2023.2242893] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/26/2023] [Indexed: 09/14/2023]
Abstract
Camera Mouse is a freely available software program that visually tracks the movement of facial features to allow individuals with motor impairments to control a computer mouse. The goal of this case study was to evaluate Camera Mouse as a computer access method within a multiple-modality communication system for an individual with cerebral palsy. The participant was asked to reproduce sentences and respond to ethical dilemmas for language sampling. Tasks were completed using natural speech and an AAC solution consisting of Camera Mouse paired with an orthographic selection interface and speech synthesis. The participant also completed a questionnaire on her satisfaction with the introduced assistive technology. Camera Mouse resulted in higher intelligibility than natural speech, while natural speech had a higher rate. The participant used more complex language with her natural speech. She rated Camera Mouse as at least 3/5 on all measures, including 5/5 on weight and safety. The results of this case study suggest that Camera Mouse is a promising computer access system for communication, as supported by the participant's satisfaction rating, expressive language, and synthesized speech production capabilities.
2
Relative Fundamental Frequency in Individuals with Globus Syndrome and Muscle Tension Dysphagia. J Voice 2024; 38:612-618. [PMID: 34823980 PMCID: PMC9124719 DOI: 10.1016/j.jvoice.2021.10.013] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/09/2021] [Revised: 10/01/2021] [Accepted: 10/05/2021] [Indexed: 12/19/2022]
Abstract
OBJECTIVE Relative fundamental frequency (RFF) has been investigated as an acoustic measure to assess for changes in laryngeal tension. This study aimed to assess RFF in individuals with globus syndrome, individuals with muscle tension dysphagia (MTDg), and individuals with typical voices. METHODS RFF values were calculated from the speech acoustics of individuals with globus syndrome (n = 12), individuals with MTDg (n = 12), and age- and sex-matched controls with typical voices (n = 24). An analysis of variance was performed on RFF values to assess the effect of group. RESULTS There was no statistically significant effect of group on RFF values, with similar values for individuals with globus syndrome, individuals with MTDg, and control participants. CONCLUSIONS These results suggest that individuals with these disorders do not appear to possess paralaryngeal muscle tension in a locus and/or manner that directly impacts voice production.
3
The Impact of Foreign Language Accent on Expert Listeners' Auditory-Perceptual Evaluations of Dysphonia. Laryngoscope 2024; 134:2272-2276. [PMID: 37942827 PMCID: PMC11006577 DOI: 10.1002/lary.31160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 09/06/2023] [Accepted: 10/17/2023] [Indexed: 11/10/2023]
Abstract
INTRODUCTION Auditory-perceptual evaluations of dysphonia, though essential for comprehensive voice evaluation, are subject to listener bias. Knowledge of an underlying voice disorder can influence auditory-perceptual ratings. Accented speech results in increased listener effort and delays in word identification. Yet, little is known about the impact of foreign language accents on auditory-perceptual ratings for dysphonic speakers. The purpose of this work was to determine the impact of a foreign language accent on experts' auditory-perceptual ratings of dysphonic speakers. METHODS Twelve voice-specializing SLPs who spoke with a General American English (GAE) accent rated vocal percepts of 28 speakers with a foreign language accent and 28 with a GAE accent, all of whom had been diagnosed with a voice disorder. Speaker groups were matched based on sex, age, and mean smoothed cepstral peak prominence. Four linear mixed-effects models assessed the impact of a foreign language accent on expert auditory-perceptual ratings of the overall severity of dysphonia, roughness, breathiness, and strain. RESULTS The twelve raters demonstrated good inter- and intra-rater reliability (ICC[3, k] = .89; mean ICC = .89). The linear mixed-effects models revealed no significant impact of foreign language accent on ratings of overall severity of dysphonia, roughness, breathiness, or strain. CONCLUSION Despite the possibility of increased listener effort and bias, foreign language accent incongruence had no effect on expert listeners' auditory-perceptual evaluations for dysphonic speakers. Findings support the use of auditory-perceptual evaluations for voice disorders across sociolinguistically diverse populations. LEVEL OF EVIDENCE 3 Laryngoscope, 134:2272-2276, 2024.
4
Controlling Pitch for Prosody: Sensorimotor Adaptation in Linguistically Meaningful Contexts. Journal of Speech, Language, and Hearing Research: JSLHR 2024; 67:440-454. [PMID: 38241671 PMCID: PMC11000799 DOI: 10.1044/2023_jslhr-23-00460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Revised: 10/09/2023] [Accepted: 11/02/2023] [Indexed: 01/21/2024]
Abstract
PURPOSE This study examined how speakers adapt to fundamental frequency (fo) errors that affect the use of prosody to convey linguistic meaning, whether fo adaptation in that context relates to adaptation in linguistically neutral sustained vowels, and whether cue trading is reflected in responses in the prosodic cues of fo and amplitude. METHOD Twenty-four speakers said vowels and sentences while fo was digitally altered to induce predictable errors. Shifts in fo (±200 cents) were applied to the entire sustained vowel and to one word (emphasized or unemphasized) in sentences. Two prosodic cues, fo and amplitude, were extracted. The effects of fo shifts, shift direction, and emphasis on fo response magnitude were evaluated with repeated-measures analyses of variance. Relationships between adaptive fo responses in sentences and vowels and between adaptive fo and amplitude responses were evaluated with Spearman correlations. RESULTS Speakers adapted to fo errors in both linguistically meaningful sentences and linguistically neutral vowels. Adaptive fo responses of unemphasized words were smaller than those of emphasized words when fo was shifted upward. There was no relationship between adaptive fo responses in vowels and emphasized words, but adaptive fo and amplitude responses were strongly, positively correlated. CONCLUSIONS Sensorimotor adaptation occurs in response to fo errors regardless of how disruptive the error is to linguistic meaning. Adaptation to fo errors during sustained vowels may not involve exactly the same mechanisms as sensorimotor adaptation in meaningful speech. The relationship between adaptive responses in fo and amplitude supports an integrated model of prosody. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.25008908.
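As a rough illustration of what a ±200-cent shift means in hertz, the sketch below applies the standard definition of the cent (1,200 cents per octave). It is not the perturbation software used in the study; the function names and the 200 Hz baseline are assumptions chosen for the example.

```python
import math


def shift_fo(fo_hz: float, cents: float) -> float:
    """Return fo after a pitch shift of `cents` (1200 cents = one octave)."""
    return fo_hz * 2.0 ** (cents / 1200.0)


def cents_between(f_ref_hz: float, f_hz: float) -> float:
    """Return the signed distance in cents from f_ref_hz to f_hz."""
    return 1200.0 * math.log2(f_hz / f_ref_hz)


if __name__ == "__main__":
    baseline_hz = 200.0  # hypothetical speaker fo, not a value from the study
    for cents in (200.0, -200.0):
        shifted = shift_fo(baseline_hz, cents)
        print(f"{cents:+.0f} cents: {baseline_hz:.1f} Hz -> {shifted:.1f} Hz")
```

Because the cent is a logarithmic unit, an upward and a downward shift of the same size are not symmetric in hertz, which is one reason responses in such paradigms are usually reported in cents rather than hertz.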
5
Test-Retest Reliability of Behavioral Assays of Feedforward and Feedback Auditory-Motor Control of Voice and Articulation. Journal of Speech, Language, and Hearing Research: JSLHR 2024; 67:34-48. [PMID: 37992404 PMCID: PMC11000789 DOI: 10.1044/2023_jslhr-23-00038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Revised: 07/24/2023] [Accepted: 09/25/2023] [Indexed: 11/24/2023]
Abstract
PURPOSE Behavioral assays of feedforward and feedback auditory-motor control of voice and articulation frequently are used to make inferences about underlying neural mechanisms and to study speech development and disorders. However, no studies have examined the test-retest reliability of such measures, which is critical for rigorous study of auditory-motor control. Thus, the purpose of the present study was to assess the reliability of assays of feedforward and feedback control in voice versus articulation domains. METHOD Twenty-eight participants (14 cisgender women, 12 cisgender men, one transgender man, one transmasculine/nonbinary) who denied any history of speech, hearing, or neurological impairment were measured for responses to predictable versus unexpected auditory feedback perturbations of vocal (fundamental frequency, fo) and articulatory (first formant, F1) acoustic parameters twice, with 3-6 weeks between sessions. Reliability was measured with intraclass correlations. RESULTS Opposite patterns of reliability were observed for fo and F1; fo reflexive responses showed good reliability and fo adaptive responses showed poor reliability, whereas F1 reflexive responses showed poor reliability and F1 adaptive responses showed moderate reliability. However, a criterion-referenced categorical measurement of fo adaptive responses as typical versus atypical showed substantial test-retest agreement. CONCLUSIONS Individual responses to some behavioral assays of auditory-motor control of speech should be interpreted with caution, which has implications for several fields of research. Additional research is needed to establish reliable criterion-referenced measures of F1 adaptive responses as well as fo and F1 reflexive responses. Furthermore, the opposite patterns of test-retest reliability observed for voice versus articulation add to growing evidence for differences in underlying neural control mechanisms.
6
Sex Differences in the Speech of Persons With and Without Parkinson's Disease. American Journal of Speech-Language Pathology 2024; 33:96-116. [PMID: 37889201 PMCID: PMC11000784 DOI: 10.1044/2023_ajslp-22-00350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Revised: 02/24/2023] [Accepted: 08/30/2023] [Indexed: 10/28/2023]
Abstract
BACKGROUND Sex differences are apparent in the prevalence and the clinical presentation of Parkinson's disease (PD), but their effects on speech have been less studied. METHOD Speech acoustics of persons with (34 females and 34 males) and without (age- and sex-matched) PD were examined, assessing the effects of PD diagnosis and sex on ratings of dysarthria severity and acoustic measures of phonation (fundamental frequency standard deviation, smoothed cepstral peak prominence), speech rate (net syllables per second, percent pause ratio), and articulation (articulatory-acoustic vowel space, release burst precision). RESULTS Most measures were affected by PD (dysarthria severity, fundamental frequency standard deviation) and sex (smoothed cepstral peak prominence, net syllables per second, percent pause ratio, articulatory-acoustic vowel space), but without interactions between them. Release burst precision was differentially affected by sex in PD. Relative to those without PD, persons with PD produced fewer plosives with a single burst: females more frequently produced multiple bursts, whereas males more frequently produced no burst at all. CONCLUSIONS Most metrics did not indicate that speech production is differentially affected by sex in PD. Sex was, however, associated with disparate effects on release burst precision in PD, which deserves further study. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.24388666.
7
Effects of a Concurrent Working Memory Task on Speech Acoustics in Parkinson's Disease. American Journal of Speech-Language Pathology 2024; 33:418-434. [PMID: 38081054 PMCID: PMC11001185 DOI: 10.1044/2023_ajslp-23-00214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 08/30/2023] [Accepted: 10/26/2023] [Indexed: 01/05/2024]
Abstract
PURPOSE The purpose of this study was to determine the effect of a concurrent working memory task on acoustic measures of speech in individuals with Parkinson's disease (PD). METHOD Individuals with PD and age- and sex-matched controls performed a speaking task with and without a Stroop-like concurrent working memory task. Cepstral peak prominence, low-to-high spectral energy ratio, fundamental frequency (fo) standard deviation, articulation rate, pause duration, articulatory-acoustic vowel space, relative fo, mean voice onset time (VOT), and VOT variability were calculated for each condition. Mixed-model analyses of variance were performed to determine the effects of group, condition (presence of the concurrent working memory task), and their interaction on the acoustic measures. RESULTS All measures except for VOT variability, mean pause duration, and relative fo offset differed between people with and without PD. Cepstral peak prominence, articulation rate, and relative fo offset differed as a function of condition. However, no measures indicated disparate effects of condition as a function of group. CONCLUSION Although differentially impactful on limb motor function in PD, here a concurrent working memory task was not found to be differentially disruptive to speech acoustics in PD. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.24759648.
8
Does Implicit Racial Bias Affect Auditory-Perceptual Evaluations of Dysphonic Voices? J Voice 2023:S0892-1997(23)00383-1. [PMID: 38065808 DOI: 10.1016/j.jvoice.2023.11.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2023] [Revised: 11/22/2023] [Accepted: 11/27/2023] [Indexed: 01/07/2024]
Abstract
PURPOSE The purpose of this study was to understand the role of implicit racial bias in auditory-perceptual evaluations of dysphonic voices by determining if a biasing effect exists for novice listeners in their auditory-perceptual ratings of Black and White speakers. METHOD Thirty speech-language pathology graduate students at Boston University listened to audio files of 20 Black speakers and 20 White speakers of General American English with voice disorders. Listeners rated the overall severity of dysphonia of each voice heard using a 100-unit visual analog scale and completed the Harvard Implicit Association Test (IAT) to measure their implicit racial bias. RESULTS Both Black and White speakers were rated as less severely dysphonic when their race was labeled as Black. No significant relationship was found between Harvard IAT scores and differences in severity ratings by race labeling condition. CONCLUSIONS These findings suggest a minimizing bias in the evaluation of dysphonia for Black patients with voice disorders. These results contribute to the understanding of how a patient's race may impact their visit with a clinician. Further research is needed to determine the most effective interventions for implicit bias retraining and the additional ways that implicit racial bias impacts comprehensive voice evaluations.
9
Do Not Cut Off Your Tail: A Mega-Analysis of Responses to Auditory Perturbation Experiments. Journal of Speech, Language, and Hearing Research: JSLHR 2023; 66:4315-4331. [PMID: 37850867 PMCID: PMC10715843 DOI: 10.1044/2023_jslhr-23-00315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Revised: 08/04/2023] [Accepted: 08/06/2023] [Indexed: 10/19/2023]
Abstract
PURPOSE The practice of removing "following" responses from speech perturbation analyses is increasingly common, despite no clear evidence as to whether these responses represent a unique response type. This study aimed to determine if the distribution of responses to auditory perturbation paradigms represents a bimodal distribution, consisting of two distinct response types, or a unimodal distribution. METHOD This mega-analysis pooled data from 22 previous studies to examine the distribution and magnitude of responses to auditory perturbations across four tasks: adaptive pitch, adaptive formant, reflexive pitch, and reflexive formant. Data included at least 150 unique participants for each task, with studies comprising younger adult, older adult, and Parkinson's disease populations. A Silverman's unimodality test followed by a smoothed bootstrap resampling technique was performed for each task to evaluate the number of modes in each distribution. Wilcoxon signed-ranks tests were also performed for each distribution to confirm significant compensation in response to the perturbation. RESULTS Modality analyses were not significant (p > .05) for any group or task, indicating unimodal distributions. Our analyses also confirmed compensatory reflexive responses to pitch and formant perturbations across all groups, as well as adaptive responses to sustained formant perturbations. However, analyses of sustained pitch perturbations only revealed evidence of adaptation in studies with younger adults. CONCLUSION The demonstration of a clear unimodal distribution across all tasks suggests that following responses do not represent a distinct response pattern, but rather the tail of a unimodal distribution. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.24282676.
10
Automated Creak Differentiates Adductor Laryngeal Dystonia and Muscle Tension Dysphonia. Laryngoscope 2023; 133:2687-2694. [PMID: 36715109 PMCID: PMC10387123 DOI: 10.1002/lary.30588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Revised: 12/21/2022] [Accepted: 01/11/2023] [Indexed: 01/31/2023]
Abstract
OBJECTIVE The purpose of this study was to determine whether automated estimates of vocal creak would differentiate speakers with adductor laryngeal dystonia (AdLD) from speakers with muscle tension dysphonia (MTD) and speakers without voice disorders. METHODS Sixteen speakers with AdLD, sixteen speakers with MTD, and sixteen speakers without voice disorders were recorded in a quiet environment reading aloud a standard paragraph. An open-source creak detector was used to calculate the percentage of creak (% creak) in each of the six sentences recorded for each speaker. RESULTS A Kruskal-Wallis one-way analysis of variance revealed a statistically significant effect of group on % creak with a large effect size. Pairwise Wilcoxon tests revealed a statistically significant difference in % creak between speakers with AdLD and controls as well as between speakers with AdLD and speakers with MTD. Receiver operating characteristic curve analyses indicated that % creak differentiated AdLD from both controls and speakers with MTD with high sensitivity and specificity (area under the curve statistics of 0.94 and 0.86, respectively). CONCLUSION Percentage of creak as calculated by an automated creak detector may be useful as a quantitative indicator of AdLD, demonstrating the potential for use as a screening tool or to aid in a differential diagnosis. LEVEL OF EVIDENCE 3 Laryngoscope, 133:2687-2694, 2023.
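For readers unfamiliar with the area-under-the-curve statistic reported above, the following minimal sketch computes it from two groups of scores using its rank-based interpretation: the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The function name and the % creak values are hypothetical and are not data from the study.

```python
from typing import Sequence


def roc_auc(cases: Sequence[float], controls: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    if not cases or not controls:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    # Hypothetical % creak values, invented purely for illustration.
    adld_percent_creak = [12.0, 8.5, 15.2, 3.1]
    control_percent_creak = [1.0, 2.2, 0.5, 4.0]
    print(f"AUC = {roc_auc(adld_percent_creak, control_percent_creak):.3f}")
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which gives a scale for reading the 0.94 and 0.86 values in the abstract.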
11
Auditory-Motor Function Pre- and Post-Therapy in Hyperfunctional Voice Disorders: A Case Series. J Voice 2023:S0892-1997(23)00264-3. [PMID: 37716889 DOI: 10.1016/j.jvoice.2023.08.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 08/16/2023] [Accepted: 08/16/2023] [Indexed: 09/18/2023]
Abstract
OBJECTIVE/HYPOTHESIS Behavioral voice therapy is the most common treatment for hyperfunctional voice disorders (HVDs) but has limited long-term effectiveness since the comprehensive mechanisms underlying HVDs remain unclear. Recent work has implicated disordered sensorimotor integration during speech in some speakers with HVDs and suggests that auditory processing is a key factor to consider in HVD assessment and therapy. The purpose of this case-series study was to assess whether current voice therapy approaches for HVDs resulted in improvements to auditory-motor function. STUDY DESIGN Longitudinal (pre-post) study. METHOD Before and after voice therapy for HVDs, 11 speakers underwent an assessment of auditory-motor function via auditory discrimination of vocal pitch, responses to unanticipated auditory perturbations, and responses to predictable auditory perturbations of vocal pitch. RESULTS At the post-therapy session, 10 of the 11 participants demonstrated voice therapy success (via self-reported voice problems and/or auditory-perceptual judgments of voice by a clinician) and eight of the 11 participants demonstrated improvements in at least one measure of auditory discrimination and/or auditory-motor control. Specifically, three speakers demonstrated improvements in auditory discrimination, five speakers demonstrated improved (within typical cutoffs) responses to predictable perturbations, and two speakers demonstrated improvements in both auditory discrimination and auditory-motor measures. CONCLUSIONS Together, these findings suggest that voice therapy in individuals with HVDs may affect auditory-motor control and highlight the potential benefit of systematically addressing auditory function in voice therapy and assessment for HVDs.
12
Voice Acoustic Instability During Spontaneous Speech in Parkinson's Disease. J Voice 2023:S0892-1997(23)00176-5. [PMID: 37500359 PMCID: PMC10808279 DOI: 10.1016/j.jvoice.2023.06.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Revised: 06/01/2023] [Accepted: 06/06/2023] [Indexed: 07/29/2023]
Abstract
BACKGROUND In people with Parkinson's disease (PwPD), both motor and cognitive deficits influence voice and other aspects of communication. PwPD demonstrate vocal instability, but acoustic declines over the course of speaking are not well characterized and the role of cognition on these declines is unknown. We examined voice acoustics related to speech motor instability by comparing the first and the last utterances within a speech task. Our objective was to determine if mild cognitive impairment (MCI) status was associated with different patterns of acoustic change during these tasks. METHODS Participants with PD (n = 44) were enrolled at University of Massachusetts Chan Medical School and classified by gold-standard criteria as normal cognition (PD-NC) or mild cognitive impairment (PD-MCI). The speech was recorded during the Rainbow Passage and a picture description task (Cookie Theft). We calculated the difference between first and last utterances in fo mean and standardized semitones (STSD), cepstral peak prominence-smoothed (CPPS), and low to high ratio (LH). We used t-tests to compare the declines in acoustic parameters between the task types and between participants with PD-NC versus PD-MCI. RESULTS Mean fo, fo variability (STSD) and CPPS declined from the first to the last utterance in both tasks, but there was no significant difference in these declines between the PD-NC and PD-MCI groups. Those with PD-MCI demonstrated lower fo variability on the whole in both tasks and lower CPPS in the picture description task, compared to those with PD-NC. CONCLUSIONS Mean and STSD fo as well as CPPS may be sensitive to PD-MCI status in reading and spontaneous speech tasks. Speech motor instability can be observed in these voice acoustic parameters over brief speech tasks, but the degree of decline does not depend on cognitive status. These findings will inform the ongoing development of algorithms to monitor speech and cognitive function in PD.
13
Normative Values of Cepstral Peak Prominence Measures in Typical Speakers by Sex, Speech Stimuli, and Software Type Across the Life Span. American Journal of Speech-Language Pathology 2023; 32:1565-1577. [PMID: 37257202 PMCID: PMC10473385 DOI: 10.1044/2023_ajslp-22-00264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Revised: 12/15/2022] [Accepted: 03/16/2023] [Indexed: 06/02/2023]
Abstract
PURPOSE The purpose of this study was to determine normative values for cepstral peak prominence measures across the life span as a function of sex using clinically relevant stimuli (/ɑ/, /i/, and two sentences of The Rainbow Passage) and two commonly used software types: Praat (Version 6.0.50) and Analysis of Dysphonia in Speech and Voice (ADSV). METHOD One hundred fifty speakers (75 males, 75 females; evenly distributed into three age groups) without voice disorders aged 18-91 years were recorded via headset microphone in a sound-treated booth. Cepstral measures were analyzed using common analysis methods in Praat and ADSV by sex, stimuli, and software type. Kruskal-Wallis tests and post hoc Mood's median tests for significant factors were performed on cepstral measures to assess the effects of age group, sex, stimuli, and software type. RESULTS The results revealed statistically significant effects of sex, stimuli, and software type on cepstral measures, but no statistically significant effect of age group on cepstral values. Females had lower average cepstral values compared to males. Across stimuli, the highest average cepstral measure was found for sustained /ɑ/, followed by sustained /i/, and then the two sentences of The Rainbow Passage. Average cepstral measures in Praat were higher than those from ADSV. CONCLUSIONS The current work did not find a statistically significant effect of age group on cepstral values; thus, normative cepstral values were reported by sex, stimuli, and software type. Future work should examine the applicability of these normative values for discriminating speakers with and without voice disorders.
14
Acoustic Measures of Voice and Physiologic Measures of Autonomic Arousal During Speech as a Function of Cognitive Load in Older Adults. J Voice 2023; 37:194-202. [PMID: 33509665 PMCID: PMC8310524 DOI: 10.1016/j.jvoice.2020.12.027] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/26/2020] [Revised: 12/01/2020] [Accepted: 12/17/2020] [Indexed: 10/22/2022]
Abstract
OBJECTIVES/HYPOTHESIS The purpose of this study was to determine the relationships among cognitive loading, autonomic arousal, and acoustic measures of voice in healthy older adults. STUDY DESIGN Prospective and observational. METHODS Twelve healthy older adults (six females) produced a sentence containing an embedded Stroop task in each of two cognitive load conditions: congruent and incongruent. Three physiologic measures of autonomic arousal (pulse volume amplitude, pulse period, and skin conductance response amplitude) and four acoustic measures of voice (cepstral peak prominence, low-to-high spectral energy ratio, fundamental frequency, and sound pressure level) were analyzed in each cognitive load condition. RESULTS A logistic regression model was used to predict the cognitive load condition using participant as a categorical predictor and the four acoustic measures and three autonomic measures as continuous predictors. Skin conductance response amplitude and pulse volume amplitude were both predictive of cognitive load; however, no acoustic measures of voice were statistically significant predictors of cognitive load for older adults. CONCLUSIONS These findings support the idea that increased cognitive load is associated with increased autonomic nervous system activity in older adults. The lack of changes in acoustic measures of voice with increased cognitive load may result from age-related changes in vocal quality and speech subsystems.
15
The Relationship Between Pitch Discrimination and Fundamental Frequency Variation: Effects of Singing Status and Vocal Hyperfunction. J Voice 2023:S0892-1997(23)00010-3. [PMID: 36754684 PMCID: PMC10405643 DOI: 10.1016/j.jvoice.2023.01.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2022] [Revised: 01/09/2023] [Accepted: 01/09/2023] [Indexed: 02/08/2023]
Abstract
PURPOSE The purpose of this study was to investigate the relationship between pitch discrimination and fundamental frequency (fo) variation in running speech, with consideration of factors such as singing status and vocal hyperfunction (VH). METHOD Female speakers (18-69 years) with typical voices (26 non-singers; 27 singers) and speakers with VH (22 non-singers; 30 singers) completed a pitch discrimination task and read the Rainbow Passage. The pitch discrimination task was a two-alternative forced choice procedure, in which participants determined whether tokens were the same or different. Tokens were a prerecorded sustained /ɑ/ of the participant's own voice and a pitch-shifted version of their sustained /ɑ/, such that the difference in fo was adaptively modified. Pitch discrimination and Rainbow Passage fo variation were calculated for each participant and compared via Pearson's correlations for each group. RESULTS A significant strong correlation was found between pitch discrimination and fo variation for non-singers with typical voices. No significant correlations were found for the other three groups, with notable restrictions in the ranges of discrimination for both singer-groups and in the range of fo variation values for non-singers with VH. CONCLUSIONS Speakers with worse pitch discrimination may increase their fo variation to produce self-salient intonational changes, which is in contrast to previous findings from articulatory investigations. The erosion of this relationship in groups with singing training and/or with VH may be explained by the known influence of musical training on pitch discrimination or the biomechanical changes associated with VH restricting speakers' abilities to change their fo.
16
Exploring the mechanics of fundamental frequency variation during phonation onset. Biomech Model Mechanobiol 2023; 22:339-356. [PMID: 36370231 PMCID: PMC10369356 DOI: 10.1007/s10237-022-01652-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2022] [Accepted: 10/20/2022] [Indexed: 11/15/2022]
Abstract
Fundamental frequency patterns during phonation onset have received renewed interest due to their promising application in objective classification of normal and pathological voices. However, the associated underlying mechanisms producing the wide array of patterns observed in different phonetic contexts are not yet fully understood. Herein, we employ theoretical and numerical analyses in an effort to elucidate the potential mechanisms driving opposing frequency patterns for initial/isolated vowels versus vowels preceded by voiceless consonants. Utilizing deterministic lumped-mass oscillator models of the vocal folds, we systematically explore the roles of collision and muscle activation in the dynamics of phonation onset. We find that an increasing trend in fundamental frequency, as observed for initial/isolated vowels, arises naturally through a progressive increase in system stiffness as collision intensifies as onset progresses, without the need for time-varying vocal fold tension or changes in aerodynamic loading. In contrast, reduction in cricothyroid muscle activation during onset is required to generate the decrease in fundamental frequency observed for vowels preceded by voiceless consonants. For such phonetic contexts, our analysis shows that the magnitude of reduction in the cricothyroid muscle activation and the activation level of the thyroarytenoid muscle are potential factors underlying observed differences in (relative) fundamental frequency between speakers with healthy and hyperfunctional voices. This work highlights the roles of sometimes competing laryngeal factors in producing the complex array of observed fundamental frequency patterns during phonation onset.
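The link between rising stiffness and rising fundamental frequency can be illustrated with the textbook single-mass spring oscillator, whose natural frequency is sqrt(k/m) / (2π). This is a deliberately simplified sketch and far cruder than the lumped-mass vocal fold models used in the study; the function name, the mass, and the stiffness values are all assumptions chosen for illustration.

```python
import math


def natural_frequency_hz(stiffness_n_per_m: float, mass_kg: float) -> float:
    """Natural frequency of an undamped single mass-spring oscillator."""
    if stiffness_n_per_m <= 0 or mass_kg <= 0:
        raise ValueError("stiffness and mass must be positive")
    return math.sqrt(stiffness_n_per_m / mass_kg) / (2.0 * math.pi)


if __name__ == "__main__":
    mass_kg = 1e-4  # hypothetical effective vibrating mass (0.1 g)
    # Hypothetical effective stiffness values, increasing as collision intensifies.
    for stiffness in (40.0, 60.0, 80.0):
        f_hz = natural_frequency_hz(stiffness, mass_kg)
        print(f"k = {stiffness:5.1f} N/m -> f = {f_hz:6.1f} Hz")
```

In this toy model, frequency grows with the square root of stiffness, so a progressive increase in effective stiffness alone is enough to produce a rising frequency trend.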
|
17
|
Effects of Cognitive Stress on Voice Acoustics in Individuals With Hyperfunctional Voice Disorders. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2023; 32:264-274. [PMID: 36516470 PMCID: PMC10023146 DOI: 10.1044/2022_ajslp-22-00204] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/12/2023]
Abstract
PURPOSE Autonomic nervous system dysfunction has been implicated in the development and persistence of hyperfunctional voice disorders (HVDs). The purpose of this study was to determine the effects of cognitive stress, which is known to arouse the autonomic nervous system, on voice acoustics in female speakers with and without HVDs. METHOD Adult female speakers (66 with HVDs, 66 without) were recorded while speaking with and without a cognitive stressor. Root-mean-square (RMS) of amplitude, fundamental frequency (fo), low-to-high spectral energy ratio (L/H ratio), cepstral peak prominence (CPP), and relative fo (RFF) were measured for each speaker and cognitive stress condition. Mixed-model analyses of variance and post hoc t tests were conducted to determine if cognitive stress affected voice acoustics and whether voice changes were greater for those with HVDs. RESULTS All measures differed significantly under cognitive stress for speakers with and without HVDs. RMS and CPP increased whereas fo, CPP, and RFF decreased under cognitive stress. Changes in these measures were not greater in those with HVDs. CONCLUSION Cognitive stress and presumed autonomic arousal affect voice similarly in female speakers with and without HVDs.
|
18
|
Symptom Expression Across Voiced Speech Sounds in Adductor Laryngeal Dystonia. J Voice 2022:S0892-1997(22)00308-3. [PMID: 36424240 PMCID: PMC10199961 DOI: 10.1016/j.jvoice.2022.10.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2022] [Revised: 10/04/2022] [Accepted: 10/05/2022] [Indexed: 11/22/2022]
Abstract
OBJECTIVES Differential diagnosis for adductor laryngeal dystonia (AdLD) is often carried out by comparing symptom expression during sentences with either all voiced or voiced and voiceless consonants. However, empirical research examining the effects of phonetic context on symptoms is sparse. The purpose of this study was to examine whether symptom probabilities varied across voiced speech segments in an all-voiced sentence, and whether this variability was systematic with respect to phonetic features. METHODS Eighteen speakers with AdLD read aloud a sentence comprised entirely of voiced speech sounds. Speech segment boundaries and AdLD symptoms (phonatory breaks, frequency shifts, and creak) were labeled separately, and speech segments were coded as symptomatic or asymptomatic based on their temporal overlap. Generalized linear mixed effects models with a binomial outcome variable were used to compare the probability of symptom expression across: 1) all speech segments in the sentence, and 2) four speech sound classes (vowels, approximants, nasals, and obstruents). RESULTS Significant symptom variability was found across voiced speech segments in the sentence. Furthermore, the estimated probability of a symptom occurring on vowels and approximants was significantly greater than that of nasals and obstruents. CONCLUSION These results indicate that AdLD symptoms are not uniformly distributed across voiced speech segments, with systematic variation across speech sound classes. To explain these findings, future work should investigate how the complex interactions between the vocal tract articulators and glottal configurations may influence symptom expression in this population.
|
19
|
Spectral Aggregate of the High-Passed Fundamental Frequency and Its Relationship to the Primary Acoustic Features of Adductor Laryngeal Dystonia. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2022; 65:4085-4095. [PMID: 36198059 PMCID: PMC9940896 DOI: 10.1044/2022_jslhr-22-00157] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 05/03/2023]
Abstract
OBJECTIVE Currently, no clinically feasible objective measures exist that are specific to the signs of adductor laryngeal dystonia (LD), deterring effective diagnosis and treatment. This project sought to establish concurrent validity of a new automated acoustic outcome measure, designed to be specific to adductor laryngeal dystonia (AdLD): the spectral aggregate of the high-passed fundamental frequency contour (SAHfo). METHOD Twenty speakers with AdLD read voiced phoneme-loaded (more symptomatic) and voiceless phoneme-loaded (less symptomatic) sentences. LD discontinuities (defined as phonatory breaks, frequency shifts, and creak), the acoustic ramifications of laryngeal spasms, were manually identified. The frequency content of the fo contour was examined as a function of time, and content above 1000 Hz was summed to automatically calculate SAHfo. Multiple linear regression analysis was applied to SAHfo based on LD discontinuities and sentence type (voiced or voiceless phoneme-loaded). RESULTS The regression model accounted for 41.1% of the variance in SAHfo. Both the LD discontinuities and sentence type were statistically related to SAHfo. CONCLUSION Results of this study provide evidence of concurrent validity. SAHfo is an automatic outcome measure specific to acoustic signs of AdLD that may be useful to track treatment progress.
|
20
|
Empirical Evaluation of the Role of Vocal Fold Collision on Relative Fundamental Frequency in Voicing Offset. J Voice 2022:S0892-1997(22)00291-0. [PMID: 36336485 PMCID: PMC10154433 DOI: 10.1016/j.jvoice.2022.09.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2022] [Revised: 09/16/2022] [Accepted: 09/19/2022] [Indexed: 11/06/2022]
Abstract
OBJECTIVES Relative fundamental frequency (RFF) is an acoustic measure of changes in fundamental frequency during voicing transitions. The physiological mechanisms underlying RFF remain unclear. Recent modeling suggests that changes in RFF during voicing offset are due to decreases in overall system stiffness as a direct result of the cessation of vocal fold collision. To evaluate this finding empirically, here we examined whether variable timing between the end of vocal fold collision and the final voicing cycle used to calculate RFF explained the variability in RFF across individual voicing offset utterances. METHODS RFF during voicing offset was calculated from /ifi/ utterances produced by 35 participants under endoscopy, with and without vocal effort. RFF was calculated via two methods, in which utterances were aligned by (1) the end of vocal fold collision, or (2) the end of voicing. Analyses of variance were used to determine the effects of vocal effort and RFF method on the mean and standard deviation of RFF. RESULTS Aligning by vocal fold collision resulted in statistically significantly lower standard deviations. RFF means were statistically higher using the collision method; however, the degree of vocal effort was statistically significant regardless of the method. CONCLUSIONS These results provide empirical evidence to support that decreases in RFF during voicing offset are a result of decreases in system stiffness due to termination of vocal fold collision.
|
21
|
Lombard Effect in Individuals With Nonphonotraumatic Vocal Hyperfunction: Impact on Acoustic, Aerodynamic, and Vocal Fold Vibratory Parameters. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2022; 65:2881-2895. [PMID: 35930680 PMCID: PMC9913286 DOI: 10.1044/2022_jslhr-21-00508] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/22/2021] [Revised: 03/17/2022] [Accepted: 05/11/2022] [Indexed: 06/15/2023]
Abstract
PURPOSE This exploratory study aims to investigate variations in voice production in the presence of background noise (Lombard effect) in individuals with nonphonotraumatic vocal hyperfunction (NPVH) and individuals with typical voices using acoustic, aerodynamic, and vocal fold vibratory measures of phonatory function. METHOD Nineteen participants with NPVH and 19 participants with typical voices produced simple vocal tasks in three sequential background conditions: baseline (in quiet), Lombard (in noise), and recovery (5 min after removing the noise). The Lombard condition consisted of speech-shaped noise at 80 dB SPL through audiometric headphones. Acoustic measures from a microphone, glottal aerodynamic parameters estimated from the oral airflow measured with a circumferentially vented pneumotachograph mask, and vocal fold vibratory parameters from high-speed videoendoscopy were analyzed. RESULTS During the Lombard condition, both groups exhibited a decrease in open quotient and increases in sound pressure level, peak-to-peak glottal airflow, maximum flow declination rate, and subglottal pressure. During the recovery condition, the acoustic and aerodynamic measures of individuals with typical voices returned to those of the baseline condition; however, recovery measures for individuals with NPVH did not return to baseline values. CONCLUSIONS As expected, individuals with NPVH and participants with typical voices exhibited a Lombard effect in the presence of elevated background noise levels. During the recovery condition, individuals with NPVH did not return to their baseline state, pointing to a persistence of the Lombard effect after noise removal. This behavior could be related to disruptions in laryngeal motor control and may play a role in the etiology of NPVH. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.20415600.
|
22
|
Resynthesis of Transmasculine Voices to Assess Gender Perception as a Function of Testosterone Therapy. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2022; 65:2474-2489. [PMID: 35749662 PMCID: PMC9584127 DOI: 10.1044/2022_jslhr-21-00482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2021] [Revised: 12/22/2021] [Accepted: 03/31/2022] [Indexed: 06/15/2023]
Abstract
PURPOSE The goal of this study was to use speech resynthesis to investigate the effects of changes to individual acoustic features on speech-based gender perception of transmasculine voice samples following the onset of hormone replacement therapy (HRT) with exogenous testosterone. We hypothesized that mean fundamental frequency (fo) would have the largest effect on gender perception of any single acoustic feature. METHOD Mean fo, fo contour, and formant frequencies were calculated for three pairs of transmasculine speech samples before and after HRT onset. Sixteen speech samples with unique combinations of these acoustic features from each pair of speech samples were resynthesized. Twenty young adult listeners evaluated each synthesized speech sample for gender perception and synthetic quality. Two analyses of variance were used to investigate the effects of acoustic features on gender perception and synthetic quality. RESULTS Of the three acoustic features, mean fo was the only single feature that had a statistically significant effect on gender perception. Differences between the speech samples before and after HRT onset that were not captured by changes in fo and formant frequencies also had a statistically significant effect on gender perception. CONCLUSION In these transmasculine voice samples, mean fo was the most important acoustic feature for voice masculinization as a result of HRT; future investigations in a larger number of transmasculine speakers and on the effects of behavioral therapy-based changes in concert with HRT are warranted.
|
23
|
LaDIVA: A neurocomputational model providing laryngeal motor control for speech acquisition and production. PLoS Comput Biol 2022; 18:e1010159. [PMID: 35737706 PMCID: PMC9258861 DOI: 10.1371/journal.pcbi.1010159] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/22/2021] [Revised: 07/06/2022] [Accepted: 05/02/2022] [Indexed: 11/18/2022] Open
Abstract
Many voice disorders are the result of intricate neural and/or biomechanical impairments that are poorly understood. The limited knowledge of their etiological and pathophysiological mechanisms hampers effective clinical management. Behavioral studies have been used concurrently with computational models to better understand typical and pathological laryngeal motor control. Thus far, however, a unified computational framework that quantitatively integrates physiologically relevant models of phonation with the neural control of speech has not been developed. Here, we introduce LaDIVA, a novel neurocomputational model with physiologically based laryngeal motor control. We combined the DIVA model (an established neural network model of speech motor control) with the extended body-cover model (a physics-based vocal fold model). The resulting integrated model, LaDIVA, was validated by comparing its model simulations with behavioral responses to perturbations of auditory vocal fundamental frequency (fo) feedback in adults with typical speech. LaDIVA demonstrated capability to simulate different modes of laryngeal motor control, ranging from short-term (i.e., reflexive) and long-term (i.e., adaptive) auditory feedback paradigms, to generating prosodic contours in speech. Simulations showed that LaDIVA’s laryngeal motor control displays properties of motor equivalence, i.e., LaDIVA could robustly generate compensatory responses to reflexive vocal fo perturbations with varying initial laryngeal muscle activation levels leading to the same output. The model can also generate prosodic contours for studying laryngeal motor control in running speech. LaDIVA can expand the understanding of the physiology of human phonation to enable, for the first time, the investigation of causal effects of neural motor control in the fine structure of the vocal signal.
|
24
|
What Can Altered Auditory Feedback Paradigms Tell Us About Vocal Motor Control in Individuals With Voice Disorders? PERSPECTIVES OF THE ASHA SPECIAL INTEREST GROUPS 2022; 7:959-976. [PMID: 37397620 PMCID: PMC10312128 DOI: 10.1044/2022_persp-21-00195] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 07/04/2023]
Abstract
Purpose The goal of this review article is to provide a summary of the progression of altered auditory feedback (AAF) as a method to understand the pathophysiology of voice disorders. This review article focuses on populations with voice disorders that have thus far been studied using AAF, including individuals with Parkinson's disease, cerebellar degeneration, hyperfunctional voice disorders, vocal fold paralysis, and laryngeal dystonia. Studies using AAF have found that individuals with Parkinson's disease, cerebellar degeneration, and laryngeal dystonia have hyperactive auditory feedback responses due to differing underlying causes. In persons with PD, the hyperactivity may be a compensatory mechanism for atypically weak feedforward motor control. In individuals with cerebellar degeneration and laryngeal dystonia, the reasons for hyperactivity remain unknown. Individuals with hyperfunctional voice disorders may have auditory-motor integration deficits, suggesting atypical updating of feedforward motor control. Conclusions These findings have the potential to provide critical insights to clinicians in selecting the most effective therapy techniques for individuals with voice disorders. Future collaboration between clinicians and researchers with the shared objective of improving AAF as an ecologically feasible and valid tool for clinical assessment may provide more personalized therapy targets for individuals with voice disorders.
|
25
|
Voice and Speech Changes in Transmasculine Individuals Following Circumlaryngeal Massage and Laryngeal Reposturing. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2022; 31:1368-1382. [PMID: 35394801 PMCID: PMC9567379 DOI: 10.1044/2022_ajslp-21-00245] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/09/2021] [Revised: 01/03/2022] [Accepted: 01/24/2022] [Indexed: 05/26/2023]
Abstract
PURPOSE The purpose of this study was to measure the short-term effects of circumlaryngeal massage and laryngeal reposturing on acoustic and perceptual characteristics of voice in transmasculine individuals. METHOD Fifteen transmasculine individuals underwent one session of sequential circumlaryngeal massage and laryngeal reposturing with a speech-language pathologist. Voice recordings were collected at three time points: baseline, postmassage, and postreposturing. Fundamental frequency (fo), formant frequencies, and relative fundamental frequency (RFF; an acoustic correlate of laryngeal tension) were measured. Estimates of vocal tract length (VTL) were derived from formant frequencies. Twelve listeners rated the perceived masculinity of participants' voices at each time point. Repeated-measures analyses of variance measured the effect of time point on fo, estimated VTL, RFF, and perceived voice masculinity. Significant effects were evaluated with post hoc Tukey's tests. RESULTS Between baseline and end of the session, fo decreased, VTL increased, and participant voices were perceived as more masculine, all with statistically significant differences. RFF did not differ significantly at any time point. Outcomes were highly variable at the individual level. CONCLUSION Circumlaryngeal massage and laryngeal reposturing have short-term effects on select acoustic (fo, estimated VTL) and perceptual characteristics (listener-assigned voice masculinity) of voice in transmasculine individuals. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.19529299.
|
26
|
Effects of Age and Parkinson's Disease on the Relationship between Vocal Fold Abductory Kinematics and Relative Fundamental Frequency. J Voice 2022:S0892-1997(22)00070-4. [PMID: 35393167 PMCID: PMC9532464 DOI: 10.1016/j.jvoice.2022.03.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/18/2022] [Revised: 03/07/2022] [Accepted: 03/08/2022] [Indexed: 11/16/2022]
Abstract
PURPOSE This study reports on two experiments to examine vocal fold abduction and its relationship with relative fundamental frequency (RFF), considering two attributes that have been shown to elicit group differences in RFF: age (Experiment 1) and Parkinson's disease (PD; Experiment 2). METHODS For both experiments, simultaneous acoustic and nasendoscopic recordings were collected as participants produced the utterance, /ifi/. RFF values were computed from the acoustic signal, whereas abduction duration and glottic angle at voicing offset were identified from the laryngoscopic images. In Experiment 1, 50 speakers with typical voices (18-83 years) were analyzed to examine (1A) the effects of speaker age on individual outcome measures (RFF, abduction duration, glottic angle) via Pearson's correlation coefficients, and (1B) the effects of abductory measures and age on RFF via an analysis of covariance. In Experiment 2, 20 speakers with PD and 20 matched controls were analyzed to examine (2A) the effects of group (with/without PD) on outcome measures via an analysis of variance, and (2B) the relationship of RFF with abduction duration, glottic angle, and age when considering group via an analysis of covariance. RESULTS Age demonstrated a significant, negative relationship with glottic angle (1A) but was not a significant factor when examining the relationship of vocal fold abduction and RFF (1B). Speaker group (with/without PD) demonstrated a significant effect on measures of RFF and abduction duration (2A) but was not a significant factor when examining the relationship of vocal fold abduction and RFF (2B). CONCLUSIONS RFF is sensitive to changes in vocal fold abductory patterns during devoicing, irrespective of speaker age or PD status.
|
27
|
Clinical Cutoff Scores for Acoustic Indices of Vocal Hyperfunction That Combine Relative Fundamental Frequency and Cepstral Peak Prominence. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2022; 65:1349-1369. [PMID: 35263546 PMCID: PMC9499364 DOI: 10.1044/2021_jslhr-21-00466] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 05/25/2023]
Abstract
PURPOSE This study examined the discriminative ability of acoustic indices of vocal hyperfunction combining smoothed cepstral peak prominence (CPPS) and relative fundamental frequency (RFF). METHOD Demographic, CPPS, and RFF parameters were entered into logistic regression models trained on two 1:1 case-control groups: individuals with and without nonphonotraumatic vocal hyperfunction (NPVH; n = 360) and phonotraumatic vocal hyperfunction (PVH; n = 240). Equations from the final models were used to predict group membership in two independent test sets (n = 100 each). RESULTS Both CPPS and RFF parameters significantly improved model fits for NPVH and PVH after accounting for demographics. CPPS explained unique variance beyond RFF in both models. RFF explained unique variance beyond CPPS in the PVH model. Final models included CPPS and RFF offset parameters for both NPVH and PVH; RFF onset parameters were significant only in the PVH model. Area under the receiver operating characteristic curve analysis for the independent test sets revealed acceptable classification for NPVH (72%) and good classification for PVH (86%). CONCLUSIONS A combination of CPPS and RFF parameters showed better discriminative ability than either measure alone for PVH. Clinical cutoff scores for acoustic indices of vocal hyperfunction are proposed for assessment and screening purposes.
|
28
|
Automated Relative Fundamental Frequency Algorithms for Use With Neck-Surface Accelerometer Signals. J Voice 2022; 36:156-169. [PMID: 32653267 PMCID: PMC7790853 DOI: 10.1016/j.jvoice.2020.06.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/24/2020] [Accepted: 06/04/2020] [Indexed: 10/23/2022]
Abstract
OBJECTIVE Relative fundamental frequency (RFF) has been suggested as a potential acoustic measure of vocal effort. However, current clinical standards for RFF measures require time-consuming manual markings. Previous semi-automated algorithms have been developed to calculate RFF from microphone signals. The current study aimed to develop fully automated algorithms to calculate RFF from neck-surface accelerometer signals for ecological momentary assessment and ambulatory monitoring of voice. METHODS A training set of 2646 /vowel-fricative-vowel/ utterances from 317 unique speakers, with and without voice disorders, was used to develop automated algorithms to calculate RFF values from neck-surface accelerometer signals. The algorithms first rejected utterances with poor vowel-to-noise ratios, then identified fricative locations, then used signal features to determine voicing boundary cycles, and finally calculated corresponding RFF values. These automated RFF values were compared to the clinical gold standard of manual RFF calculated from simultaneously collected microphone signals in a novel test set of 639 utterances from 77 unique speakers. RESULTS Automated accelerometer-based RFF values resulted in an average mean bias error (MBE) across all cycles of 0.027 ST, with an MBE of 0.152 ST and -0.252 ST in the offset and onset cycles closest to the fricative, respectively. CONCLUSION All MBE values were smaller than the expected changes in RFF values following successful voice therapy, suggesting that the current algorithms could be used for ecological momentary assessment and ambulatory monitoring via neck-surface accelerometer signals.
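As an illustration of the quantities reported in this abstract, the sketch below expresses per-cycle fundamental frequencies in semitones (ST) relative to a reference cycle and computes a mean bias error between automated and manual values. The abstract does not give formulas, so the details here are assumptions: the function names are hypothetical, the semitone conversion is the standard 12·log2 ratio, the choice of reference cycle is left to the caller, and the bias is taken as automated minus manual.

```python
import math


def rff_semitones(cycle_fo_hz, reference_index):
    """Express each cycle's fo in semitones relative to a reference cycle.

    Assumption: the reference cycle is chosen by the caller; the abstract
    does not state which cycle the authors normalize to.
    """
    reference = cycle_fo_hz[reference_index]
    return [12 * math.log2(fo / reference) for fo in cycle_fo_hz]


def mean_bias_error(automated, manual):
    """Mean of (automated - manual) differences; the sign convention is assumed."""
    if len(automated) != len(manual):
        raise ValueError("automated and manual must have the same length")
    return sum(a - m for a, m in zip(automated, manual)) / len(automated)


# Hypothetical values, for illustration only.
rff = rff_semitones([200.0, 200.0, 100.0], reference_index=0)
bias = mean_bias_error([1.0, 2.0], [0.5, 1.0])
```

A positive bias under this convention would mean the automated values run higher than the manual ones.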
|
29
|
Impact of Vocal Effort on Respiratory and Articulatory Kinematics. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2022; 65:5-21. [PMID: 34843405 PMCID: PMC9150749 DOI: 10.1044/2021_jslhr-21-00323] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/10/2021] [Revised: 07/27/2021] [Accepted: 08/24/2021] [Indexed: 06/13/2023]
Abstract
PURPOSE The goal of this study was to examine the effects of increases in vocal effort, without changing speech intensity, on respiratory and articulatory kinematics in young adults with typical voices. METHOD A total of 10 participants completed a reading task under three speaking conditions: baseline, mild vocal effort, and maximum vocal effort. Respiratory inductance plethysmography bands around the chest and abdomen were used to estimate lung volumes during speech, and sensor coils for electromagnetic articulography were used to transduce articulatory movements, resulting in the following outcome measures: lung volume at speech initiation (LVSI) and at speech termination (LVST), articulatory kinematic vowel space (AKVS) of two points on the tongue dorsum (body and blade), and lip aperture. RESULTS With increases in vocal effort, and no statistical changes in speech intensity, speakers showed: (a) no statistically significant differences in LVST, (b) statistically significant increases in LVSI, (c) no statistically significant differences in AKVS measures, and (d) statistically significant reductions in lip aperture. CONCLUSIONS Speakers with typical voices exhibited larger lung volumes at speech initiation during increases in vocal effort, paired with reduced lip displacements. To our knowledge, this is the first study to demonstrate evidence that articulatory kinematics are impacted by modulations in vocal effort. However, the mechanisms underlying vocal effort may differ between speakers with and without voice disorders. Thus, future work should examine the relationship between articulatory kinematics, respiratory kinematics, and laryngeal-level changes during vocal effort in speakers with and without voice disorders. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.17065457.
|
30
|
Assessing Ecologically Valid Methods of Auditory Feedback Measurement in Individuals With Typical Speech. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2022; 65:121-135. [PMID: 34941381 PMCID: PMC9153919 DOI: 10.1044/2021_jslhr-21-00377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2021] [Revised: 09/15/2021] [Accepted: 09/16/2021] [Indexed: 06/14/2023]
Abstract
PURPOSE Auditory feedback is thought to contribute to the online control of speech production. Yet, the standard method of estimating auditory feedback control (i.e., reflexive responses to auditory-motor perturbations), although sound, requires specialized instrumentation, meticulous calibration, unnatural tasks, and specific acoustic environments. The purpose of this study was to explore more ecologically valid features of speech production to determine their relationships with auditory feedback mechanisms. METHOD Two previously proposed measures of within-utterance variability (centering and baseline variability) were compared with reflexive response magnitudes in 30 adults with typical speech. These three measures were estimated for both the laryngeal and articulatory subsystems of speech. RESULTS Regardless of the speech subsystem, neither centering nor baseline variability was shown to be related to reflexive response magnitudes. Likewise, no relationships were found between centering and baseline variability. CONCLUSIONS Despite previous suggestions that centering and baseline variability may be related to auditory feedback mechanisms, this study did not support these assertions. However, the detection of such relationships may have required a larger degree of variability in responses, relative to that found in those with typical speech. Future research on these relationships is warranted in populations with more heterogeneous responses, such as children or clinical populations. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.17330546.
|
31
|
Feedback and Feedforward Auditory-Motor Processes for Voice and Articulation in Parkinson's Disease. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:4682-4694. [PMID: 34731577 PMCID: PMC9150666 DOI: 10.1044/2021_jslhr-21-00153] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 03/17/2021] [Revised: 06/03/2021] [Accepted: 07/27/2021] [Indexed: 06/13/2023]
Abstract
PURPOSE Unexpected and sustained manipulations of auditory feedback during speech production result in "reflexive" and "adaptive" responses, which can shed light on feedback and feedforward auditory-motor control processes, respectively. Persons with Parkinson's disease (PwPD) have shown aberrant reflexive and adaptive responses, but responses appear to differ for control of vocal and articulatory features. However, these responses have not been examined for both voice and articulation in the same speakers and with respect to auditory acuity and functional speech outcomes (speech intelligibility and naturalness). METHOD Here, 28 PwPD on their typical dopaminergic medication schedule and 28 age-, sex-, and hearing-matched controls completed tasks yielding reflexive and adaptive responses as well as auditory acuity for both vocal and articulatory features. RESULTS No group differences were found for any measures of auditory-motor control, conflicting with prior findings in PwPD while off medication. Auditory-motor measures were also compared with listener ratings of speech function: first formant frequency acuity was related to speech intelligibility, whereas adaptive responses to vocal fundamental frequency manipulations were related to speech naturalness. CONCLUSIONS These results support that auditory-motor processes for both voice and articulatory features are intact for PwPD receiving medication. This work is also the first to suggest associations between measures of auditory-motor control and speech intelligibility and naturalness.
|
32
|
Reliability and Accuracy of Expert Auditory-Perceptual Evaluation of Voice via Telepractice Platforms. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2021; 30:2446-2455. [PMID: 34473568 PMCID: PMC9132030 DOI: 10.1044/2021_ajslp-21-00091] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/24/2021] [Revised: 05/23/2021] [Accepted: 05/24/2021] [Indexed: 05/24/2023]
Abstract
Purpose This study assessed the reliability and accuracy of auditory-perceptual voice evaluations by experienced clinicians via telepractice platforms. Method Voice samples from 20 individuals were recorded after transmission via telepractice platforms. Twenty experienced clinicians (10 speech-language pathologists, 10 laryngologists) evaluated the samples for dysphonia percepts (overall severity, roughness, breathiness, and strain) using a modified Consensus Auditory-Perceptual Evaluation of Voice. Reliability was calculated as the mean of squared differences between repeated ratings (intrarater agreement), and between individual and group mean ratings (interrater agreement). Repeated measures analyses of variance were constructed to measure effects of transmission condition (e.g., original recording, WebEx, Zoom), dysphonia percept, and their interaction on intrarater agreement, interrater agreement, and average ratings. Significant effects were evaluated with post hoc Tukey's tests. Results There were significant effects of transmission condition, percept, and their interaction on average ratings, and a significant effect of percept on interrater agreement. Post hoc testing revealed statistically, but not clinically, significant differences in average roughness ratings across transmission conditions, and significant differences in interrater agreement for several percepts. Overall severity had the highest agreement and strain had the lowest. Conclusion Telepractice transmission does not substantially reduce reliability or accuracy of auditory-perceptual voice evaluations by experienced clinicians.
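The agreement metrics described in this abstract (mean of squared differences between repeated ratings, and between individual and group mean ratings) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, and whether the group mean includes the rater being scored is an assumption (here it does).

```python
from statistics import mean


def intrarater_msd(first, second):
    """Mean of squared differences between one rater's first and repeated ratings."""
    if len(first) != len(second):
        raise ValueError("rating lists must have the same length")
    return mean((a - b) ** 2 for a, b in zip(first, second))


def interrater_msd(ratings_by_rater):
    """Per-rater mean squared difference from the group mean rating of each sample.

    ratings_by_rater: one list of ratings per rater, all in the same sample order.
    Assumption: the group mean includes the rater being scored.
    """
    n_samples = len(ratings_by_rater[0])
    group_mean = [mean(r[i] for r in ratings_by_rater) for i in range(n_samples)]
    return [
        mean((r[i] - group_mean[i]) ** 2 for i in range(n_samples))
        for r in ratings_by_rater
    ]
```

Lower values indicate better agreement, since both metrics measure disagreement rather than correlation.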
|
33
|
Listener Age and Gender Diversity: Effects on Voice-based Perception of Gender. J Voice 2021; 35:739-745. [PMID: 32165021 PMCID: PMC7483284 DOI: 10.1016/j.jvoice.2020.02.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/20/2019] [Revised: 02/05/2020] [Accepted: 02/06/2020] [Indexed: 10/24/2022]
Abstract
OBJECTIVE An important clinical outcome of voice masculinization treatments in transmasculine speakers is voice-based perception of gender. Rigorous assessments of voice treatment that utilize ratings of perception of gender typically do not control for demographic characteristics of the listeners. The objective of the present study was to determine the effect of listeners' age and gender diversity on voice-based judgments of speaker gender. METHODS Speech stimuli were produced by a single transmasculine individual over approximately one year of hormone replacement therapy, during which he experienced significant changes in his voice. Three groups of listeners rated speech stimuli on a visual analog scale with anchors ranging from "definitely male" to "guessing male" to "guessing female" to "definitely female." Listener groups were N = 10 cisgender young adults, N = 10 cisgender older adults, and N = 10 gender diverse individuals. RESULTS All groups rated the speaker as consistently female through week 14 of hormone replacement therapy and consistently male after week 28. Mean responses of the three groups of listeners were highly correlated (Pearson's correlations all r > 0.97). CONCLUSION Given reasonable group sizes, average ratings of gender perception of a transmasculine speaker are not highly influenced by varying listener age and gender minority status.
|
34
|
Accuracy of Acoustic Measures of Voice via Telepractice Videoconferencing Platforms. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:2586-2599. [PMID: 34157251 PMCID: PMC8632479 DOI: 10.1044/2021_jslhr-20-00625] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 10/26/2020] [Revised: 12/19/2020] [Accepted: 03/23/2021] [Indexed: 05/31/2023]
Abstract
Purpose Telepractice improves patient access to clinical care for voice disorders. Acoustic assessment has the potential to provide critical, objective information during telepractice, yet its validity via telepractice is currently unknown. The current study investigated the accuracy of acoustic measures of voice in a variety of telepractice platforms. Method Twenty-nine voice samples from individuals with dysphonia were transmitted over six video conferencing platforms (Zoom with and without enhancements, Cisco WebEx, Microsoft Teams, Doxy.me, and VSee Messenger). Standard time-, spectral-, and cepstral-based acoustic measures were calculated. The effect of transmission condition on each acoustic measure was assessed using repeated-measures analyses of variance. For those acoustic measures for which transmission condition was a significant factor, linear regression analysis was performed on the difference between the original recording and each telepractice platform, with the overall severity of dysphonia, Internet speed, and ambient noise from the transmitter as predictors. Results Transmission condition was a statistically significant factor for all acoustic measures except for mean fundamental frequency (fo). Ambient noise from the transmitter was a significant predictor of differences between platforms and the original recordings for all acoustic measures except fo measures. All telepractice platforms affected acoustic measures in a statistically significant manner, although the effects of platforms varied by measure. Conclusions Overall, measures of fo were the least impacted by telepractice transmission. Microsoft Teams had the least and Zoom (with enhancements) had the most pronounced effects on acoustic measures. These results provide valuable insight into the relative validity of acoustic measures of voice when collected via telepractice. Supplemental Material https://doi.org/10.23641/asha.14794812.
|
35
|
Impaired auditory discrimination and auditory-motor integration in hyperfunctional voice disorders. Sci Rep 2021; 11:13123. [PMID: 34162907 PMCID: PMC8222324 DOI: 10.1038/s41598-021-92250-8] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 02/22/2021] [Accepted: 06/04/2021] [Indexed: 12/04/2022] Open
Abstract
Hyperfunctional voice disorders (HVDs) are the most common class of voice disorders, consisting of diagnoses such as vocal fold nodules and muscle tension dysphonia. These speech production disorders result in effort, fatigue, pain, and even complete loss of voice. The mechanisms underlying HVDs are largely unknown. Here, the auditory-motor control of voice fundamental frequency (fo) was examined in 62 speakers with and 62 speakers without HVDs. Due to the high prevalence of HVDs in singers, and the known impacts of singing experience on auditory-motor function, groups were matched for singing experience. Speakers completed three tasks, yielding: (1) auditory discrimination of voice fo; (2) reflexive responses to sudden fo shifts; and (3) adaptive responses to sustained fo shifts. Compared to controls, and regardless of singing experience, individuals with HVDs showed: (1) worse auditory discrimination; (2) comparable reflexive responses; and (3) a greater frequency of atypical adaptive responses. Atypical adaptive responses were associated with poorer auditory discrimination, directly implicating auditory function in this motor disorder. These findings motivate a paradigm shift for understanding development and treatment of HVDs.
|
36
|
The Effect of Visual Sort and Rate Versus Visual Analog Scales on the Reliability of Judgments of Dysphonia. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:1571-1580. [PMID: 33909472 PMCID: PMC8608224 DOI: 10.1044/2021_jslhr-20-00623] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 05/04/2023]
Abstract
Purpose The reliability of auditory-perceptual judgments between listeners is a long-standing problem in the assessment of voice disorders. The purpose of this study was to determine whether a relatively novel experimental scaling method, called visual sort and rate (VSR), yielded stronger reliability than the more frequently used method of visual analog scales (VAS) for ratings of overall severity (OS) and breathiness (BR) in speakers with voice disorders. Method Fifty speech samples were selected from a database of speakers with voice disorders. Twenty-two inexperienced listeners provided ratings of OS or BR in four rating blocks: VSR-OS, VSR-BR, VAS-OS, and VAS-BR. For the VAS task, listeners rated each speaker for BR or OS using a vertically oriented 100-mm VAS. For the VSR task, stimuli were distributed into sets of samples with a range of speaker severities in each set. Listeners sorted and ranked samples for OS or BR within each set, and final ratings were captured on a vertically oriented 100-mm VAS. Interrater variability, defined as the mean of the squared differences between a listener's ratings and group mean ratings, and intrarater reliability (Pearson r) were compared across rating tasks for OS and BR using paired t tests. Results Results showed that listeners had significantly less interrater variability (better reliability) when using VSR methods compared to VAS for judgments of both OS and BR. Intrarater reliability was high across rating tasks and dimensions; however, ratings of BR were significantly more consistent within individual listeners when using VAS than when using VSR. Conclusions VSR is an experimental method that decreases variability of auditory-perceptual judgments between inexperienced listeners when rating speakers with a range of dysphonic severities and disorders. Future research should determine whether a clinically viable tool may be developed based on VSR principles and whether such benefits extend to experienced listeners.
|
37
|
Physics of phonation offset: Towards understanding relative fundamental frequency observations. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2021; 149:3654. [PMID: 34241131 PMCID: PMC8163514 DOI: 10.1121/10.0005006] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/15/2020] [Revised: 04/15/2021] [Accepted: 04/21/2021] [Indexed: 05/26/2023]
Abstract
Relative fundamental frequency (RFF) is a promising assessment technique for vocal pathologies. Herein, we explore the underlying laryngeal factors dictating RFF behaviours during phonation offset. To gain physical insights, we analyze a simple impact oscillator model and follow that with a numerical study using the well-established body-cover model of the vocal folds (VFs). Study of the impact oscillator suggests that the observed decrease in fundamental frequency during offset is due, at least in part, to the increase in the neutral gap between the VFs during abduction and the concomitant decrease in collision forces. Moreover, the impact oscillator elucidates a correlation between sharper drops in RFF and increased stiffness of the VFs, supporting experimental RFF studies. The body-cover model study further emphasizes the correlation between the drops in RFF and collision forces. The numerical analysis also illustrates the sensitivity of RFF to abduction initiation time relative to the phase of the phonation cycle, and the abduction period length. In addition, the numerical simulations display the potential role of the cricothyroid muscle to mitigate the RFF reduction. Last, simplified models of phonotraumatic vocal hyperfunction are explored, demonstrating that the observed sharper drops in RFF are associated with increased pre-offset collision forces.
|
38
|
Oral configurations during vowel nasalization in English. SPEECH COMMUNICATION 2021; 129:17-24. [PMID: 34621100 PMCID: PMC8492006 DOI: 10.1016/j.specom.2021.02.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Speech nasalization is achieved primarily through the opening and closing of the velopharyngeal port. However, the resultant acoustic features can also be influenced by tongue configuration. Although vowel nasalization is not contrastive in English, two previous studies have found possible differences in the oral articulation of nasal and oral vowel productions, albeit with inconsistent results. In an attempt to further understand the conflicting findings, we evaluated the oral kinematics of nasalized and non-nasalized vowels in a cohort of both male and female American English speakers via electromagnetic articulography. Tongue body and lip positions were captured during vowels produced in nasal and oral contexts (e.g., /mɑm/, /bɑb/). Large contrasts were seen in all participants between tongue position of /æ/ in oral and nasal contexts, in which tongue positions were higher and more forward during /mæm/ than /bæb/. Lip aperture was smaller in a nasal context for /æ/. Lip protrusion was not different between vowels in oral and nasal contexts. Smaller contrasts in tongue and lip position were seen for vowels /ɑ, i, u/; this is consistent with biomechanical accounts of vowel production that suggest that /i, u/ are particularly constrained, whereas /æ/ has fewer biomechanical constraints, allowing for more flexibility for articulatory differences in different contexts. Thus we conclude that speakers of American English do indeed use different oral configurations for vowels that are in nasal and oral contexts, despite vowel nasalization being non-contrastive. This effect was consistent across speakers for only one vowel, perhaps accounting for previously-conflicting results.
|
39
|
Acoustic Identification of the Voicing Boundary during Intervocalic Offsets and Onsets based on Vocal Fold Vibratory Measures. APPLIED SCIENCES (BASEL, SWITZERLAND) 2021; 11:3816. [PMID: 36188437 PMCID: PMC9524108 DOI: 10.3390/app11093816] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 05/26/2023]
Abstract
Methods for automating relative fundamental frequency (RFF), an acoustic estimate of laryngeal tension, rely on manual identification of voiced/unvoiced boundaries from acoustic signals. This study determined the effect of incorporating features derived from vocal fold vibratory transitions for acoustic boundary detection. Simultaneous microphone and flexible nasendoscope recordings were collected from adults with typical voices (N=69) and with voices characterized by excessive laryngeal tension (N=53) producing voiced-unvoiced-voiced utterances. Acoustic features that coincided with vocal fold vibratory transitions were identified and incorporated into an automated RFF algorithm ("aRFF-APH"). Voiced/unvoiced boundary detection accuracy was compared between the aRFF-APH algorithm, a recently published version of the automated RFF algorithm ("aRFF-AP"), and gold-standard, manual RFF estimation. Chi-square tests were performed to characterize differences in boundary cycle identification accuracy among the three RFF estimation methods. Voiced/unvoiced boundary detection accuracy significantly differed by RFF estimation method for voicing offsets and onsets. Of 7721 productions, 76.0% of boundaries were accurately identified via the aRFF-APH algorithm, compared to 70.3% with the aRFF-AP algorithm and 20.4% with manual estimation. Incorporating acoustic features that corresponded with voiced/unvoiced boundaries led to improvements in boundary detection accuracy that surpassed the gold-standard method for calculating RFF.
|
40
|
Changes in Relative Fundamental Frequency Under Increased Cognitive Load in Individuals With Healthy Voices. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:1189-1196. [PMID: 33788635 PMCID: PMC8608166 DOI: 10.1044/2021_jslhr-20-00134] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/26/2020] [Revised: 11/19/2020] [Accepted: 01/06/2021] [Indexed: 05/26/2023]
Abstract
Purpose The purpose of this study was to determine the effect of cognitive load on relative fundamental frequency (RFF) in individuals with healthy voices. Method Twenty adults with healthy voices read sentences under different cognitive load conditions. Each sentence contained color terms printed in colored ink, creating an embedded Stroop task. Participants read the ink color in which a word was printed, rather than the color term itself. Sentences with mismatched ink colors and printed words constituted an increased cognitive load. RFF, an acoustic correlate of laryngeal tension, was calculated for the 10 voicing cycles preceding (i.e., offset) and following (i.e., onset) voiceless consonants. Repeated measures analyses of variance were constructed to assess the effects of RFF cycle, cognitive load, and their interaction on mean RFF offset and onset. Results There was a significant effect of cognitive load condition on RFF offset. There was no significant effect of condition on RFF onset nor significant interaction between cycle and condition on RFF onset or offset values. Conclusion Reduced mean RFF offset may indicate an increase in laryngeal muscle tension during a cognitively demanding task.
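The abstract does not state how RFF is computed from the 10 voicing cycles. A common convention, assumed here rather than taken from the abstract, expresses each cycle's fundamental frequency in semitones relative to a steady-state reference cycle: the cycle farthest from the voiceless consonant (the first offset cycle, the last onset cycle). The function names below are hypothetical.

```python
import math


def rff_semitones(cycle_fo_hz, reference_fo_hz):
    """Convert per-cycle fo values (Hz) to semitones relative to a reference fo."""
    return [12 * math.log2(f / reference_fo_hz) for f in cycle_fo_hz]


def rff_offset(cycle_fo_hz):
    """Offset cycles lead into the consonant.

    Assumption: the first cycle (farthest from the consonant) is the reference.
    """
    return rff_semitones(cycle_fo_hz, cycle_fo_hz[0])


def rff_onset(cycle_fo_hz):
    """Onset cycles follow the consonant.

    Assumption: the last cycle (farthest from the consonant) is the reference.
    """
    return rff_semitones(cycle_fo_hz, cycle_fo_hz[-1])
```

Under this convention the reference cycle always has an RFF of 0 semitones, and negative values mean the cycle's fo is below the steady-state reference.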
|
41
|
The Relationship Between Voice Onset Time and Increase in Vocal Effort and Fundamental Frequency. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:1197-1209. [PMID: 33820431 PMCID: PMC8608153 DOI: 10.1044/2021_jslhr-20-00505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2020] [Revised: 10/19/2020] [Accepted: 01/13/2021] [Indexed: 06/12/2023]
Abstract
Purpose Prior work suggests that voice onset time (VOT) may be impacted by laryngeal tension: VOT means decrease when individuals with typical voices increase their fundamental frequency (fo) and VOT variability is increased in individuals with vocal hyperfunction, a voice disorder characterized by increased laryngeal tension. This study further explored the relationship between VOT and laryngeal tension during increased fo, vocal effort, and vocal strain. Method Sixteen typical speakers of American English were instructed to produce VOT utterances under four conditions: baseline, high pitch, effort, and strain. Repeated-measures analysis of variance models were used to analyze the effects of condition on VOT means and standard deviations (SDs); pairwise comparisons were used to determine significant differences between conditions. Results Voicing, condition, and their interaction significantly affected VOT means. Voiceless VOT means significantly decreased for high pitch (p < .001) relative to baseline; however, no changes in voiceless VOT means were found for effort or strain relative to baseline. Although condition had a significant effect on VOT SDs, there were no significant differences between effort, strain, and high pitch conditions relative to baseline. Conclusions Speakers with typical voices likely engage different musculature to increase pitch than to increase vocal effort and strain. The increased VOT variability present with vocal hyperfunction is not seen in individuals with typical voices using increased effort and strain, supporting the assertion that this feature of vocal hyperfunction may be related to disordered vocal motor control rather than resulting from effortful voice production.
|
42
|
Vocal fold kinematics and relative fundamental frequency as a function of obstruent type and speaker age. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2021; 149:2189. [PMID: 33940922 PMCID: PMC8018794 DOI: 10.1121/10.0003961] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/16/2020] [Revised: 03/02/2021] [Accepted: 03/11/2021] [Indexed: 06/12/2023]
Abstract
The acoustic measure, relative fundamental frequency (RFF), has been proposed as an objective metric for assessing vocal hyperfunction; however, its underlying physiological mechanisms have not yet been fully characterized. This study aimed to characterize the relationship between RFF and vocal fold kinematics. Simultaneous acoustic and high-speed videoendoscopic (HSV) recordings were collected as younger and older speakers repeated the utterances /ifi/ and /iti/. RFF values at voicing offsets and onsets surrounding the obstruents were estimated from acoustic recordings, whereas glottal angles, durations of voicing offset and onset, and a kinematic estimate of laryngeal stiffness (KS) were obtained from HSV images. No differences were found between younger and older speakers for any measure. RFF did not differ between the two obstruents at voicing offset; however, fricatives necessitated larger glottal angles and longer durations to devoice. RFF values were lower and glottal angles were greater for stops relative to fricatives at voicing onset. KS values were greater in stops relative to fricatives. The less adducted vocal folds with greater KS and lower RFF at voicing onset for stops relative to fricatives in this study were in accordance with prior speculations that decreased vocal fold contact area and increased laryngeal stiffness may decrease RFF.
|
43
|
Perceptual and Acoustic Assessment of Strain Using Synthetically Modified Voice Samples. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:3897-3908. [PMID: 33151770 PMCID: PMC8608200 DOI: 10.1044/2020_jslhr-20-00294] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/28/2020] [Revised: 07/23/2020] [Accepted: 08/17/2020] [Indexed: 06/11/2023]
Abstract
Purpose Assessment of strained voice quality is difficult due to the weak reliability of auditory-perceptual evaluation and lack of strong acoustic correlates. This study evaluated the contributions of relative fundamental frequency (RFF) and mid-to-high frequency noise to the perception of strain. Method Stimuli were created using recordings of speakers producing /ifi/ with a comfortable voice and with maximum vocal effort. RFF values of the comfortable voice samples were synthetically lowered, and RFF values of the maximum vocal effort samples were synthetically raised. Mid-to-high frequency noise was added to the samples. Twenty listeners rated strain in a visual sort-and-rate task. The effects of RFF modification and added noise on strain were assessed using an analysis of variance; intra- and interrater reliability were compared with and without noise. Results Lowering RFF in the comfortable voice samples increased their perceived strain, whereas raising RFF in the maximum vocal effort samples decreased their strain. Adding noise increased strain and decreased intra- and interrater reliability relative to samples without added noise. Conclusions Both RFF and mid-to-high frequency noise contribute to the perception of strain. The presence of dysphonia may decrease the reliability of auditory-perceptual evaluation of strain, which supports the need for complementary objective assessments. Supplemental Material https://doi.org/10.23641/asha.13172252.
|
44
|
The Relation of Articulatory and Vocal Auditory-Motor Control in Typical Speakers. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:3628-3642. [PMID: 33079610 PMCID: PMC8582832 DOI: 10.1044/2020_jslhr-20-00192] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 04/21/2020] [Revised: 07/16/2020] [Accepted: 08/12/2020] [Indexed: 05/14/2023]
Abstract
Purpose The purpose of this study was to explore the relationship between feedback and feedforward control of articulation and voice by measuring reflexive and adaptive responses to first formant (F1) and fundamental frequency (fo) perturbations. In addition, perception of F1 and fo perturbation was estimated using passive (listening) and active (speaking) just noticeable difference paradigms to assess the relation of auditory acuity to reflexive and adaptive responses. Method Twenty healthy women produced single words and sustained vowels while the F1 or fo of their auditory feedback was suddenly and unpredictably perturbed to assess reflexive responses or gradually and predictably perturbed to assess adaptive responses. Results Typical speakers' reflexive responses to sudden perturbation of F1 were related to their adaptive responses to gradual perturbation of F1. Specifically, speakers with larger reflexive responses to sudden perturbation of F1 had larger adaptive responses to gradual perturbation of F1. Furthermore, their reflexive responses to sudden perturbation of F1 were associated with their passive auditory acuity to F1 such that speakers with better auditory acuity to F1 produced larger reflexive responses to sudden perturbations of F1. Typical speakers' adaptive responses to gradual perturbation of F1 were not associated with their auditory acuity to F1. Speakers' reflexive and adaptive responses to perturbation of fo were not related, nor were their responses related to either measure of auditory acuity to fo. Conclusion These findings indicate that there may be disparate feedback and feedforward control mechanisms for articulatory and vocal error correction based on auditory feedback.
|
45
|
An Updated Theoretical Framework for Vocal Hyperfunction. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2020; 29:2254-2260. [PMID: 33007164 PMCID: PMC8740570 DOI: 10.1044/2020_ajslp-20-00104] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 04/26/2020] [Revised: 08/07/2020] [Accepted: 08/09/2020] [Indexed: 05/21/2023]
Abstract
Purpose The purpose of this viewpoint article is to facilitate research on vocal hyperfunction (VH). VH is implicated in the most commonly occurring types of voice disorders, but there remains a pressing need to increase our understanding of the etiological and pathophysiological mechanisms associated with VH to improve the prevention, diagnosis, and treatment of VH-related disorders. Method A comprehensive theoretical framework for VH is proposed based on an integration of prevailing clinical views and research evidence. Results The fundamental structure of the current framework is based on a previous (simplified) version that was published over 30 years ago (Hillman et al., 1989). A central premise of the framework is that there are two primary manifestations of VH (phonotraumatic VH and nonphonotraumatic VH) and that multiple factors contribute and interact in different ways to cause and maintain these two types of VH. Key hypotheses are presented about the way different factors may contribute to phonotraumatic VH and nonphonotraumatic VH and how the associated disorders may respond to treatment. Conclusions This updated and expanded framework is meant to help guide future research, particularly the design of longitudinal studies, which can lead to a refinement in knowledge about the etiology and pathophysiology of VH-related disorders. Such new knowledge should lead to further refinements in the framework and serve as a basis for improving the prevention and evidence-based clinical management of VH.
|
46
|
Hey Siri: How Effective are Common Voice Recognition Systems at Recognizing Dysphonic Voices? Laryngoscope 2020; 131:1599-1607. [PMID: 32949415 DOI: 10.1002/lary.29082] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/02/2020] [Revised: 08/13/2020] [Accepted: 08/16/2020] [Indexed: 11/11/2022]
Abstract
OBJECTIVES/HYPOTHESIS Interaction with voice recognition systems, such as Siri™ and Alexa™, is an increasingly important part of everyday life. Patients with voice disorders may have difficulty with this technology, leading to frustration and reduction in quality of life. This study evaluates the ability of common voice recognition systems to transcribe dysphonic voices. STUDY DESIGN Retrospective evaluation of "Rainbow Passage" voice samples from patients with and without voice disorders. METHODS Participants with (n = 30) and without (n = 23) voice disorders were recorded reading the "Rainbow Passage". Recordings were played at a standardized intensity and distance to dictation programs on Apple iPhone 6S™, Apple iPhone 11 Pro™, and Google Voice™. Word recognition scores were calculated as the proportion of correctly transcribed words. Word recognition scores were compared to auditory-perceptual and acoustic measures. RESULTS Mean word recognition scores for participants with and without voice disorders were, respectively, 68.6% and 91.9% for Apple iPhone 6S™ (P < .001), 71.2% and 93.7% for Apple iPhone 11 Pro™ (P < .001), and 68.7% and 93.8% for Google Voice™ (P < .001). There were strong, approximately linear associations between CAPE-V ratings of overall severity of dysphonia and word recognition score, with correlation coefficients (R²) of 0.609 (iPhone 6S™), 0.670 (iPhone 11 Pro™), and 0.619 (Google Voice™). These relationships persisted when controlling for diagnosis, age, gender, fundamental frequency, and speech rate (P < .001 for all systems). CONCLUSION Common voice recognition systems function well with nondysphonic voices but are poor at accurately transcribing dysphonic voices. There was a strong negative correlation between word recognition scores and perceptual voice evaluation. As our society increasingly interfaces with automated voice recognition technology, the needs of patients with voice disorders should be considered.
LEVEL OF EVIDENCE 4 Laryngoscope, 131:1599-1607, 2021.
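The word recognition score described above (proportion of correctly transcribed words) can be sketched as follows. The abstract does not say how transcripts were aligned to the passage, so the alignment step here is an assumption: it uses Python's standard `difflib.SequenceMatcher` to find matching runs of words, which is a heuristic rather than a guaranteed optimal alignment. The function names are hypothetical.

```python
import re
from difflib import SequenceMatcher


def _words(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def word_recognition_score(reference, transcript):
    """Proportion of reference words that also appear, in order, in the transcript."""
    ref = _words(reference)
    hyp = _words(transcript)
    if not ref:
        raise ValueError("reference text contains no words")
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    correct = sum(block.size for block in matcher.get_matching_blocks())
    return correct / len(ref)
```

For example, if a dictation program splits "raindrops" into "rain drops", that reference word counts as an error while the surrounding words still count as correct.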
|
47
|
Auditory-Motor Perturbations of Voice Fundamental Frequency: Feedback Delay and Amplification. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:2846-2860. [PMID: 32755506 PMCID: PMC7890227 DOI: 10.1044/2020_jslhr-19-00407] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 12/19/2019] [Revised: 03/30/2020] [Accepted: 06/10/2020] [Indexed: 06/11/2023]
Abstract
Purpose Gradual and sudden perturbations of vocal fundamental frequency (fo), also known as adaptive and reflexive fo perturbations, are techniques to study the influence of auditory feedback on voice fo control mechanisms. Previous vocal fo perturbations have incorporated varied setup-specific feedback delays and amplifications. Here, we investigated the effects of feedback delays (10-100 ms) and amplifications on both adaptive and reflexive fo perturbation paradigms, encapsulating the variability in equipment-specific delays (3-45 ms) and amplifications utilized in previous experiments. Method Responses to adaptive and reflexive fo perturbations were recorded in 24 typical speakers for four delay conditions (10, 40, 70, and 100 ms) or three amplification conditions (-10, +5, and +10 dB relative to microphone) in a counterbalanced order. Repeated-measures analyses of variance were carried out on the magnitude of fo responses to determine the effect of feedback condition. Results There was a statistically significant effect of the level of auditory feedback amplification on the response magnitude during adaptive fo perturbations, driven by the difference between +10- and -10-dB amplification conditions (hold phase difference: M = 38.3 cents, SD = 51.2 cents; after-effect phase: M = 66.1 cents, SD = 84.6 cents). No other statistically significant effects of condition were found for either paradigm. Conclusions Experimental equipment delays below 100 ms in behavioral paradigms do not affect the results of fo perturbation paradigms. As there is no statistically significant difference between the response magnitudes elicited by +5- and +10-dB auditory amplification conditions, this study is a confirmation that an auditory feedback amplification of +5 dB relative to microphone is sufficient to elicit robust compensatory responses for fo perturbation paradigms.
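Response magnitudes in this abstract are reported in cents. As a reminder of the unit, the standard conversion from a frequency ratio to cents is 1200 times the base-2 logarithm of the ratio, so one octave is 1200 cents. The sketch below shows that conversion; averaging over an analysis window relative to a baseline fo is an illustrative assumption, not the study's documented procedure, and the function names are hypothetical.

```python
import math
from statistics import mean


def hz_to_cents(fo_hz, baseline_hz):
    """Express fo as a deviation in cents from a baseline fo (100 cents = 1 semitone)."""
    return 1200 * math.log2(fo_hz / baseline_hz)


def mean_response_cents(fo_trace_hz, baseline_hz):
    """Mean deviation in cents over a window of fo samples.

    Assumption: the response magnitude is the average deviation from baseline.
    """
    return mean(hz_to_cents(f, baseline_hz) for f in fo_trace_hz)
```

On this scale, the reported 38.3-cent hold-phase difference is a little over a third of a semitone.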
|
48
|
The Impact of Communication Modality on Voice Production. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:2913-2920. [PMID: 32762517 PMCID: PMC7890225 DOI: 10.1044/2020_jslhr-20-00161] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/09/2020] [Revised: 06/04/2020] [Accepted: 06/18/2020] [Indexed: 06/11/2023]
Abstract
Purpose: Communicating remotely using audio and audiovisual technology is ubiquitous in modern work and social environments. Remote communication is increasing in medicine and in voice therapy delivery, and this evolution may have an impact on speakers' voices. This study sought to determine whether these communication modalities impact the voice production of typical speakers. Method: The speech acoustics of 12 participants with healthy voices were recorded as they held standardized conversations with a single investigator using three communication modalities: in-person, remote-audio, and remote-audiovisual. Participants rated their vocal effort on a 100-mm visual analog scale. Results: Compared to in-person communication, self-ratings of vocal effort were statistically significantly increased for remote-audiovisual communication; vocal effort during remote-audio and in-person communication were not significantly different. In comparison to in-person communication, vocal intensity and smoothed cepstral peak prominence (CPPS) were statistically significantly higher during remote-audio and remote-audiovisual communication. Effect sizes for CPPS changes were larger than for sound pressure level (SPL), and changes in CPPS and SPL between in-person and remote-audiovisual communication were not significantly correlated. Conclusions: Vocal effort and SPL were increased when using remote-audio and remote-audiovisual communication in comparison to in-person communication. Voice quality was also impacted by technology use, with changes in CPPS that were consistent with, but not fully explained by, increases in SPL. This may impact the telepractice delivery of voice therapy, and further investigation is warranted.
|
49
|
Acuity to Changes in Self-Generated Vocal Pitch in Parkinson's Disease. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:3208-3214. [PMID: 32853119 PMCID: PMC7890224 DOI: 10.1044/2020_jslhr-20-00003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/02/2020] [Revised: 06/19/2020] [Accepted: 06/22/2020] [Indexed: 06/11/2023]
Abstract
Purpose: Given the role of auditory perception in voice production, studies have investigated whether impairments in auditory perception may underlie the noted disruptions in speech in Parkinson's disease (PD). Studies of loudness perception in PD show impairments in the perception of self-generated speech, but not external tones. Studies of pitch perception in PD have only examined external tones, but these studies differed in terms of the interstimulus intervals (ISIs) that were used, did not examine the impact of cognition, and report conflicting results. To clarify pitch perception in PD, this work investigated perception of self-generated vocal pitch, controlling for cognition and ISI. Method: A total of 30 individuals with and without PD completed (a) hearing threshold testing, (b) the Montreal Cognitive Assessment, and (c) an adaptive just-noticeable-difference paradigm under two separate ISIs (100 ms and 1,000 ms) to assess acuity to self-generated vocal pitch. Results: There was no significant difference in acuity between individuals with and without PD. Both groups demonstrated significantly worse acuity for longer compared to shorter ISIs. Montreal Cognitive Assessment scores were not a significant predictor of acuity. Conclusions: The results suggest that acuity to self-generated vocal pitch does not differ between individuals with and without PD.
|
50
|
Acoustic Model of Perceived Overall Severity of Dysphonia in Adductor-Type Laryngeal Dystonia. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:2713-2722. [PMID: 32692616 PMCID: PMC7872728 DOI: 10.1044/2020_jslhr-19-00354] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/22/2019] [Revised: 01/28/2020] [Accepted: 05/19/2020] [Indexed: 05/19/2023]
Abstract
Purpose: This study is a secondary analysis of existing data. The goal of the study was to construct an acoustic model of perceived overall severity of dysphonia in adductor-type laryngeal dystonia (AdLD). We predicted that acoustic measures (a) related to voice and pitch breaks and (b) related to vocal effort would form the primary elements of a model corresponding to auditory-perceptual ratings of overall severity of dysphonia. Method: Twenty inexperienced listeners evaluated the overall severity of dysphonia of speech stimuli from 19 individuals with AdLD. Acoustic features related to primary signs of AdLD (hyperadduction resulting in pitch and voice breaks) and to a potential secondary symptom of AdLD (vocal effort, measures of relative fundamental frequency) were computed from the speech stimuli. Multiple linear regression analysis was applied to construct an acoustic model of the overall severity of dysphonia. Results: The acoustic model included an acoustic feature related to pitch and voice breaks and three acoustic measures derived from relative fundamental frequency; it explained 84.9% of the variance in the auditory-perceptual ratings of overall severity of dysphonia in the speech samples. Conclusions: Auditory-perceptual ratings of overall severity of dysphonia in AdLD were related to acoustic features of primary signs (pitch and voice breaks, hyperadduction associated with laryngeal spasms) and were also related to acoustic features of vocal effort. This suggests that compensatory vocal effort may be a secondary symptom in AdLD. Future work to generalize this acoustic model to a larger, independent data set is necessary before clinical translation is warranted.
|