1
Dimos K, He L, Dellwo V. Shouting affects temporal properties of the speech amplitude envelope. JASA EXPRESS LETTERS 2024; 4:015202. [PMID: 38169314 DOI: 10.1121/10.0023995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 11/27/2023] [Indexed: 01/05/2024]
Abstract
Distinguishing shouted from non-shouted speech is crucial in communication. We examined how shouting affects temporal properties of the amplitude envelope (ENV) in a total of 720 sentences read by 18 Swiss German speakers in normal and shouted modes; shouting was characterised by maintaining a C-weighted sound pressure level of ≥80 dB (dB SPL) at a 1-meter distance from the mouth. Generalized additive models revealed significant temporal alterations of ENV in shouted speech, marked by steeper ascent, delayed peak, and extended high levels. These findings offer potential cues for identifying shouting, particularly useful when fine-structure and dynamic range cues are absent, for example, in cochlear implant users.
Affiliation(s)
- Kostis Dimos
  - Department of Computational Linguistics, University of Zurich, Zurich
- Lei He
  - Department of Computational Linguistics, University of Zurich, Zurich
- Volker Dellwo
  - Department of Computational Linguistics, University of Zurich, Zurich
2
Knowles T, Badh G. Impact of Face Masks on Speech in Parkinson's Disease: Effect of Clear and Loud Speech Styles. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2023; 66:3052-3075. [PMID: 36827515 DOI: 10.1044/2022_jslhr-22-00291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
PURPOSE The purpose of this study is to quantify the combined effects of face masks and effortful speech styles on speech intensity, spectral moments, and measures of spectral balance in talkers with Parkinson's disease. METHOD Fifteen people with Parkinson's disease and 15 healthy, older controls read aloud sentences in three face mask conditions and three speech style conditions. Mask conditions included no mask, surgical masks, and KN95 masks. Speech styles included habitual, clear, and loud. Acoustic measures of intensity, spectral moments, and spectral balance were modeled as a function of speaker group, mask, and speech style. RESULTS Overall, talkers with PD demonstrated lower concentrations of high-frequency spectral energy in their speech. Face masks attenuated high-frequency energy, whereas clear followed by loud speaking styles amplified high frequencies. Overall, the attenuation observed by face masks was preserved across speech styles, and both mask and speech patterns were observed to be similar across groups. DISCUSSION Clear and loud speech styles were effective in compensating for the damping effects of masks in talkers with and without PD. However, given that people with PD demonstrated poorer overall spectral balance compared to controls, the gains afforded by speaking clearly or loudly may be limited when wearing a face mask.
Affiliation(s)
- Thea Knowles
  - Department of Communicative Sciences and Disorders, Michigan State University, East Lansing
- Gursharan Badh
  - Department of Communicative Disorders and Sciences, University at Buffalo, NY
3
Fujiki RB, Kostas G, Thibeault SL. Relationship Between Auditory-Perceptual and Objective Measures of Resonance in Children with Cleft Palate: Effects of Intelligibility and Dysphonia. Cleft Palate Craniofac J 2023:10556656231162238. [PMID: 36890706 DOI: 10.1177/10556656231162238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/10/2023]
Abstract
OBJECTIVE To investigate the relationship between auditory-perceptual ratings of resonance and nasometry scores in children with cleft palate. Factors which may impact this relationship were examined including articulation, intelligibility, dysphonia, sex, and cleft-related diagnosis. DESIGN Retrospective, observational cohort study. SETTING Outpatient pediatric cranio-facial anomalies clinic. PARTICIPANTS Four hundred patients <18 years of age identified with CP ± L, seen for auditory-perceptual and nasometry evaluations of hypernasality as well as assessments of articulation and voice. MAIN OUTCOME MEASURES Relationship between auditory-perceptual ratings of resonance and nasometry scores. RESULTS Pearson's correlations indicated that auditory-perceptual resonance ratings and nasometry scores were significantly correlated across oral-sound stimuli on the picture-cued portion of the MacKay-Kummer SNAP-R Test (r values .69 to .72) and the Zoo reading passage (r = .72). Linear regression indicated that intelligibility (p ≤ .001) and dysphonia (p = .009) significantly impacted the relationship between perceptual and objective assessments of resonance on the Zoo passage. Moderation analyses indicated that the relationship between auditory-perceptual and nasometry values weakened as severity of speech intelligibility increased (p < .001) and when children presented with moderate dysphonia (p ≤ .001). No significant impact of articulation testing or sex was observed. CONCLUSIONS Speech intelligibility and dysphonia alter the relationship between auditory-perceptual and nasometry assessments of hypernasality in children with cleft palate. SLPs should be aware of potential sources of auditory-perceptual bias and shortcomings of the Nasometer when following patients with limited intelligibility or moderate dysphonia. Future study may identify the mechanisms by which intelligibility and dysphonia affect auditory-perceptual and nasometry evaluations.
Affiliation(s)
- George Kostas
  - Department of Surgery, University of Wisconsin Madison, Madison, WI, USA
- Susan L Thibeault
  - Department of Surgery, University of Wisconsin Madison, Madison, WI, USA
4
Hakanpää T, Waaramaa T, Laukkanen AM. Training the Vocal Expression of Emotions in Singing: Effects of Including Acoustic Research-Based Elements in the Regular Singing Training of Acting Students. J Voice 2023; 37:293.e7-293.e23. [PMID: 33495033 DOI: 10.1016/j.jvoice.2020.12.032] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/23/2020] [Revised: 12/21/2020] [Accepted: 12/22/2020] [Indexed: 11/16/2022]
Abstract
OBJECTIVES This study examines the effects of including acoustic research-based elements of the vocal expression of emotions in the singing lessons of acting students during a seven-week teaching period. This information may be useful in improving the training of interpretation in singing. STUDY DESIGN Experimental comparative study. METHODS Six acting students participated in seven weeks of extra training concerning voice quality in the expression of emotions in singing. Song samples were recorded before and after the training. A control group of six acting students were recorded twice within a seven-week period, during which they participated in ordinary training. All participants sang on the vowel [a:] and on a longer phrase expressing anger, sadness, joy, tenderness, and neutral states. The vowel and phrase samples were evaluated by 34 listeners for the perceived emotion. Additionally, the vowel samples were analyzed for formant frequencies (F1-F4), sound pressure level (SPL), spectral structure (Alpha ratio = SPL 1500-5000 Hz - SPL 50-1500 Hz), harmonic-to-noise ratio (HNR), and perturbation (jitter, shimmer). RESULTS The number of correctly perceived expressions improved in the test group's vowel samples, while no significant change was observed in the control group. The overall recognition was higher for the phrases than for the vowel samples. Of the acoustic parameters, F1 and SPL significantly differentiated emotions in both groups, and HNR specifically differentiated emotions in the test group. The Alpha ratio was found to statistically significantly differentiate emotion expression after training. CONCLUSIONS The expression of emotion in the singing voice improved after seven weeks of voice quality training. The F1, SPL, Alpha ratio, and HNR differentiated emotional expression. The variation in acoustic parameters became wider after training. Similar changes were not observed after seven weeks of ordinary voice training.
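The Alpha ratio definition quoted in the abstract above (SPL 1500-5000 Hz minus SPL 50-1500 Hz) can be illustrated with a minimal sketch. It assumes the spectrum is already available as parallel lists of bin frequencies and linear power values; the function names and the input layout are hypothetical and are not taken from the study, whose analysis software is not described here.

```python
import math


def band_level_db(freqs_hz, powers, lo_hz, hi_hz):
    """Level in dB of the summed linear power in the band [lo_hz, hi_hz)."""
    total = sum(p for f, p in zip(freqs_hz, powers) if lo_hz <= f < hi_hz)
    if total <= 0:
        raise ValueError("no energy in the requested band")
    return 10.0 * math.log10(total)


def alpha_ratio_db(freqs_hz, powers):
    """Alpha ratio: level of the 1500-5000 Hz band minus level of the 50-1500 Hz band."""
    high = band_level_db(freqs_hz, powers, 1500.0, 5000.0)
    low = band_level_db(freqs_hz, powers, 50.0, 1500.0)
    return high - low


# Hypothetical four-bin spectrum, for illustration only.
example_freqs = [100.0, 1000.0, 2000.0, 4000.0]
example_powers = [1.0, 1.0, 0.1, 0.1]
print(alpha_ratio_db(example_freqs, example_powers))
```

Because the measure is a difference of two levels, the dB reference cancels out, so the sketch works on uncalibrated power values as well as on calibrated SPL spectra.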
Affiliation(s)
- Tua Hakanpää
  - Speech and Voice Research Laboratory, Faculty of Social Sciences, Tampere University, Tampere, Finland
- Teija Waaramaa
  - Speech and Voice Research Laboratory, Faculty of Social Sciences, Tampere University, Tampere, Finland; Communication Sciences, University of Vaasa, Vaasa, Finland
- Anne-Maria Laukkanen
  - Speech and Voice Research Laboratory, Faculty of Social Sciences, Tampere University, Tampere, Finland
5
Garnier M, Smith J, Wolfe J. Lip hyper-articulation in loud voice: Effect on resonance-harmonic proximity. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2022; 152:3695. [PMID: 36586885 DOI: 10.1121/10.0016595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Accepted: 12/01/2022] [Indexed: 06/17/2023]
Abstract
Men and women speakers were recorded while producing sustained vowels at comfortable and loud levels. Following comfortable speech, loud levels were produced in three different conditions: first without specific instruction (UL); then maintaining the same pitch as the comfortable level (PL); and finally, keeping both pitch and lip articulation constant (PAL). The sound pressure level, the fundamental frequency (fo), the first two vocal tract resonances (R1 and R2), the lip geometry, and the larynx height were measured. For women, a closer proximity of R1 to its nearest harmonic, nfo, was observed in UL. However, no such increased proximity was found in PL, when speakers could, and did, hyper-articulate. Also, no increased proximity was observed in PAL, when lip articulation was constrained. No significant increase in R1:nfo proximity was observed in men in any of the three loud conditions. Finally, R2 was not observed to be significantly closer to a voice harmonic in loud speech for either men or women.
Affiliation(s)
- Maëva Garnier
  - Univ. Grenoble Alpes, CNRS, Grenoble Institute of Engineering Univ. Grenoble Alpes, GIPSA-Lab, 38000 Grenoble, France
- John Smith
  - School of Physics, UNSW Sydney, Sydney, New South Wales 2052, Australia
- Joe Wolfe
  - School of Physics, UNSW Sydney, Sydney, New South Wales 2052, Australia
6
Zainaee S, Khadivi E, Jamali J, Sobhani-Rad D, Maryn Y, Ghaemi H. The acoustic voice quality index, version 2.06 and 3.01, for the Persian-speaking population. JOURNAL OF COMMUNICATION DISORDERS 2022; 100:106279. [PMID: 36399989 DOI: 10.1016/j.jcomdis.2022.106279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Revised: 11/05/2022] [Accepted: 11/07/2022] [Indexed: 06/16/2023]
Abstract
INTRODUCTION Dysphonia assessment includes approaches like acoustic analysis, which is non-invasive and easy to use and provides an understandable numerical output. The Acoustic Voice Quality Index (AVQI) is an acoustic model that can detect dysphonia. The Persian language is spoken by around 70,000,000 native speakers. Since AVQI versions 2.06 and 3.01 have not yet been validated for Persian, this study investigated their concurrent validity and diagnostic accuracy among the Persian-speaking population. METHODS This scale development study was conducted from 2020 to 2021 on 180 normophonic and dysphonic native Persian-speaking residents of Mashhad, Iran. Five raters rated the samples by auditory-perceptual judgments, including Grade from the Grade-Rough-Breathy-Asthenic-Strained (an ordinal scale) and the overall dysphonia severity from the Persian version of the Consensus Auditory Perceptual Evaluation of Voice (a continuous scale) to investigate both versions' concurrent validity. The intra- and inter-rater reliability and concurrent validity were evaluated for both scales. Both versions' diagnostic accuracy was assessed by the receiver operating characteristic, and the optimal thresholds were determined. RESULTS AVQI-version-2-Persian thresholds of 3.47 and 4.04 provided sensitivity of 88.30% and 85.53% and specificity of 79.07% and 85.58% by the ordinal and continuous scales, respectively. AVQI-version-3-Persian thresholds of 3.07 and 3.03 also rendered sensitivity of 74.47% and 85.53%, and specificity of 97.67% and 91.35% by the ordinal and continuous scales, respectively. CONCLUSION The significant values of concurrent validities and diagnostic accuracies of both versions of AVQI-Persian confirmed that it can discriminate between normal and pathological voices among the Persian-speaking population. Hence, it can be used for screening or diagnosis purposes.
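As an illustration of how a sensitivity and specificity pair follows from a single cut-off score, here is a minimal sketch. The scores and labels are made up for the example and are not the study's data; only the threshold value 3.07 is taken from the abstract. The function name and the rule that a score at or above the threshold counts as dysphonic are assumptions.

```python
def sensitivity_specificity(scores, is_dysphonic, threshold):
    """Return (sensitivity, specificity) when score >= threshold is called dysphonic."""
    tp = fn = tn = fp = 0
    for score, positive in zip(scores, is_dysphonic):
        predicted_positive = score >= threshold
        if positive and predicted_positive:
            tp += 1  # dysphonic voice correctly flagged
        elif positive:
            fn += 1  # dysphonic voice missed
        elif predicted_positive:
            fp += 1  # normophonic voice wrongly flagged
        else:
            tn += 1  # normophonic voice correctly passed
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical index scores and perceptual labels, for illustration only.
example_scores = [1.2, 2.8, 3.6, 3.1, 4.5, 5.0, 2.9, 2.0]
example_labels = [False, False, False, True, True, True, True, False]
print(sensitivity_specificity(example_scores, example_labels, 3.07))
```

Sweeping the threshold over the observed scores and recording each pair is one way to trace a receiver operating characteristic curve, from which an optimal cut-off can then be chosen.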
Affiliation(s)
- Shahryar Zainaee
  - Department of Speech Therapy, School of Paramedical sciences, Mashhad University of Medical Sciences, Mashhad, Iran
- Ehsan Khadivi
  - Sinus and Surgical Endoscopic Research Center, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Jamshid Jamali
  - Department of Biostatistics, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran
- Davood Sobhani-Rad
  - Department of Speech Therapy, School of Paramedical sciences, Mashhad University of Medical Sciences, Mashhad, Iran
- Youri Maryn
  - Department of Speech, Language and Hearing Sciences, Faculty of Medicine and Health Sciences, University of Ghent, Ghent, Belgium
- Hamide Ghaemi
  - Department of Speech Therapy, School of Paramedical sciences, Mashhad University of Medical Sciences, Mashhad, Iran
7
Whelan BM, Theodoros D, Cahill L, Vaezipour A, Vogel AP, Finch E, Farrell A, Cardell E. Feasibility of a Telerehabilitation Adaptation of the Be Clear Speech Treatment Program for Non-Progressive Dysarthria. Brain Sci 2022; 12:brainsci12020197. [PMID: 35203960 PMCID: PMC8870717 DOI: 10.3390/brainsci12020197] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/18/2021] [Revised: 01/24/2022] [Accepted: 01/28/2022] [Indexed: 11/21/2022]
Abstract
This study evaluated the feasibility and outcomes of a telerehabilitation adaptation of the Be Clear speech treatment program for adults with non-progressive dysarthria to determine clinical delivery viability and future research directions. Treatment effects on speech clarity, intelligibility, communication effectiveness, and participation, as well as psychosocial outcomes in 15 participants with non-progressive dysarthria, were explored. Intervention involved daily 1-h online sessions (4 days per week for 4 weeks, totalling 16 sessions) and daily home practice. Outcome measures were obtained at baseline (PRE), post-treatment (POST), and 12 weeks following treatment (FUP). Feasibility measures targeting participant satisfaction, treatment adherence and fidelity, and technical viability were also employed. The programme was feasible concerning technical viability and implementation, treatment adherence and fidelity. High levels of participant satisfaction were reported. Increases in overall ratings of communication participation and effectiveness were identified at POST and FUP. Reductions in speech rate were identified at FUP. Improvements in aspects of lingual and laryngeal function were also noted after treatment. Over time, improvements relating to the negative impact of dysarthria were identified. Naïve listeners perceived negligible changes in speech clarity following treatment. Online delivery of the Be Clear speech treatment program was feasible, and some positive speech benefits were observed. Due to the small sample size included in this research, statistically significant findings related to speech outcomes must be interpreted with caution. An adequately powered randomised controlled trial of Be Clear online is warranted to evaluate treatment efficacy.
Affiliation(s)
- Brooke-Mai Whelan
  - Recover Injury Research Centre, Faculty of Health and Behavioural Sciences, The University of Queensland, Brisbane 4072, Australia
  - Faculty of Health and Behavioural Sciences, School of Health and Rehabilitation Sciences, The University of Queensland, Brisbane 4072, Australia
- Deborah Theodoros
  - Recover Injury Research Centre, Faculty of Health and Behavioural Sciences, The University of Queensland, Brisbane 4072, Australia
  - Faculty of Health and Behavioural Sciences, School of Health and Rehabilitation Sciences, The University of Queensland, Brisbane 4072, Australia
- Louise Cahill
  - Recover Injury Research Centre, Faculty of Health and Behavioural Sciences, The University of Queensland, Brisbane 4072, Australia
- Atiyeh Vaezipour
  - Recover Injury Research Centre, Faculty of Health and Behavioural Sciences, The University of Queensland, Brisbane 4072, Australia
- Adam P. Vogel
  - Centre for the Neuroscience of Speech, Department of Audiology and Speech Pathology, Melbourne School of Health Sciences, The University of Melbourne, Melbourne 3010, Australia
  - Redenlab Inc., Melbourne 3000, Australia
- Emma Finch
  - Faculty of Health and Behavioural Sciences, School of Health and Rehabilitation Sciences, The University of Queensland, Brisbane 4072, Australia
  - Centre for Functioning and Health Research, Metro South Hospital and Health Service, Queensland Health, Brisbane 4102, Australia
  - The Princess Alexandra Hospital, Metro South Hospital and Health Service, Queensland Health, Brisbane 4102, Australia
- Anna Farrell
  - The Princess Alexandra Hospital, Metro South Hospital and Health Service, Queensland Health, Brisbane 4102, Australia
  - The Royal Brisbane and Women’s Hospital, Metro North Hospital and Health Service, Queensland Health, Brisbane 4029, Australia
- Elizabeth Cardell
  - Menzies Health Institute Queensland, School of Medicine and Dentistry, Griffith University, Gold Coast 4215, Australia
8
Ertan E, Gürvit HI, Hanağası HH, Bilgiç B, Tunçer MA, Yılmaz C. Intensive voice treatment (the Lee Silverman Voice Treatment [LSVT®LOUD]) for individuals with Wilson's disease and adult cerebral palsy: two case reports. LOGOP PHONIATR VOCO 2021; 47:262-270. [PMID: 34287100 DOI: 10.1080/14015439.2021.1951348] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]
Abstract
Objective: In this case report, we aimed to examine the effects of an intensive voice treatment (the Lee Silverman Voice Treatment [LSVT®LOUD]) for Wilson's disease (WD), adult cerebral palsy (CP), and dysarthria. Method: The participants received LSVT®LOUD four times a week for 4 weeks. Acoustic and perceptual (GRBAS) analyses were performed, and data from the Voice Handicap Index (VHI) were obtained before and after treatment. Results: Besides the Harmonics-to-Noise Ratio (HNR) value (dB) of the participant with WD, both participants' fundamental frequency (Hz), jitter (%), and shimmer (%) values showed significant differences (p < .05) after therapy. Both participants showed significant improvements (p < .05) in the duration (s) and the sound pressure level (dB, SPL) of sustained vowel phonation (/a/), in SPL (dB) of pitch range (high and low /a/) and reading and conversation (p < .01). There was a positive improvement in the high-frequency values (Hz) of both participants but not in the low-frequency values (Hz) in the participant with WD. Perceptual analysis with GRBAS judgements of sustained vowel (/a/) and paragraph reading of the two participants also showed improvement. After therapy, perceived loudness of the participants' voices increased. Conclusions: The findings provide some preliminary observations that individuals with WD and adult individuals with CP can respond positively to intensive speech treatment such as LSVT®LOUD. Further studies are needed to investigate speech treatments specific to WD and adult CP.
Affiliation(s)
- Esra Ertan
  - Institut für Deutsche Sprache und Linguistik, Humboldt-Universität zu Berlin, Berlin, Germany
- Hakan I Gürvit
  - Behavioral Neurology and Movement Disorders Unit, Department of Neurology, Istanbul Faculty of Medicine, Istanbul University, Istanbul, Turkey
- Haşmet H Hanağası
  - Behavioral Neurology and Movement Disorders Unit, Department of Neurology, Istanbul Faculty of Medicine, Istanbul University, Istanbul, Turkey
- Başar Bilgiç
  - Behavioral Neurology and Movement Disorders Unit, Department of Neurology, Istanbul Faculty of Medicine, Istanbul University, Istanbul, Turkey
- Müge A Tunçer
  - Department of Speech and Language Therapy, Faculty of Health Science, Sıtkı Koçman University, Muğla, Turkey
- Cemil Yılmaz
  - Department of Speech and Language Therapy, Faculty of Health Science, Anadolu University, Eskişehir, Turkey
9
Whitfield JA, Holdosh SR, Kriegel Z, Sullivan LE, Fullenkamp AM. Tracking the Costs of Clear and Loud Speech: Interactions Between Speech Motor Control and Concurrent Visuomotor Tracking. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:2182-2195. [PMID: 33719529 DOI: 10.1044/2020_jslhr-20-00264] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/12/2023]
Abstract
Purpose Prior work has demonstrated that competing tasks impact habitual speech production. The purpose of this investigation was to quantify the extent to which clear and loud speech are affected by concurrent performance of an attention-demanding task. Method Speech kinematics and acoustics were collected while participants spoke using habitual, loud, and clear speech styles. The styles were performed in isolation and while performing a secondary tracking task. Results Compared to the habitual style, speakers exhibited expected increases in lip aperture range of motion and speech intensity for the clear and loud styles. During concurrent visuomotor tracking, there was a decrease in lip aperture range of motion and speech intensity for the habitual style. Tracking performance during habitual speech did not differ from single-task tracking. For loud and clear speech, speakers retained the gains in speech intensity and range of motion, respectively, while concurrently tracking. A reduction in tracking performance was observed during concurrent loud and clear speech, compared to tracking alone. Conclusions These data suggest that loud and clear speech may help to mitigate motor interference associated with concurrent performance of an attention-demanding task. Additionally, reductions in tracking accuracy observed during concurrent loud and clear speech may suggest that these higher effort speaking styles require greater attentional resources than habitual speech.
Affiliation(s)
- Jason A Whitfield
  - Department of Communication Sciences and Disorders, Bowling Green State University, OH
- Serena R Holdosh
  - Department of Communication Sciences and Disorders, Bowling Green State University, OH
- Zoe Kriegel
  - Department of Communication Sciences and Disorders, Bowling Green State University, OH
- Lauren E Sullivan
  - Department of Communication Sciences and Disorders, Bowling Green State University, OH
- Adam M Fullenkamp
  - School of Human Movement, Sport, & Leisure Studies, Bowling Green State University, OH
10
Xue Y, Marxen M, Akagi M, Birkholz P. Acoustic and articulatory analysis and synthesis of shouted vowels. COMPUT SPEECH LANG 2021. [DOI: 10.1016/j.csl.2020.101156] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/26/2022]
11
Understanding Lombard speech: a review of compensation techniques towards improving speech based recognition systems. Artif Intell Rev 2020. [DOI: 10.1007/s10462-020-09907-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/23/2022]
12
Mefferd AS, Dietrich MS. Tongue- and Jaw-Specific Articulatory Changes and Their Acoustic Consequences in Talkers With Dysarthria due to Amyotrophic Lateral Sclerosis: Effects of Loud, Clear, and Slow Speech. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:2625-2636. [PMID: 32697631 PMCID: PMC7872725 DOI: 10.1044/2020_jslhr-19-00309] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 05/13/2023]
Abstract
Purpose This study aimed to determine how tongue and jaw displacement changes impact acoustic vowel contrast in talkers with amyotrophic lateral sclerosis (ALS) and controls. Method Ten talkers with ALS and 14 controls participated in this study. Loud, clear, and slow speech cues were used to elicit tongue and jaw kinematic as well as acoustic changes. Speech kinematics was recorded using three-dimensional articulography. Independent tongue and jaw displacements were extracted during the diphthong /ai/ in kite. Acoustic distance between diphthong onset and offset in Formant 1-Formant 2 vowel space indexed acoustic vowel contrast. Results In both groups, all three speech modifications elicited increases in jaw displacement (typical < slow < loud < clear). By contrast, only slow speech elicited significantly increased independent tongue displacement in the ALS group (typical = loud = clear < slow), whereas all three speech modifications elicited significantly increased independent tongue displacement in controls (typical < loud < clear = slow). Furthermore, acoustic vowel contrast significantly increased in response to clear and slow speech in the ALS group, whereas all three speech modifications elicited significant increases in acoustic vowel contrast in controls (typical < loud < slow < clear). Finally, only jaw displacements accounted for acoustic vowel contrast gains in the ALS group. In controls, however, independent tongue displacements accounted for increases in vowel acoustic contrast during loud and slow speech, whereas jaw and independent tongue displacements accounted equally for acoustic vowel contrast change during clear speech. Conclusion Kinematic findings suggest that slow speech may be better suited to target independent tongue displacements in talkers with ALS than clear and loud speech. 
However, given that gains in acoustic vowel contrast were comparable for slow and clear speech cues in these talkers, future research is needed to determine potential differential impacts of slow and clear speech on perceptual measures, such as intelligibility. Finally, findings suggest that acoustic vowel contrast gains are predominantly jaw driven in talkers with ALS. Therefore, the acoustic and perceptual consequences of direct instructions of enhanced jaw movements should be compared to cued speech modification, such as clear and slow speech in these talkers.
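The acoustic vowel contrast measure described in this abstract (the distance between diphthong onset and offset in Formant 1-Formant 2 space) can be sketched as a Euclidean distance. This is a minimal illustration that assumes formants are given in Hz as (F1, F2) pairs; the function name and the example values are hypothetical, and the study may have used a different frequency scale or normalization.

```python
import math


def vowel_contrast_hz(onset_f1_f2, offset_f1_f2):
    """Euclidean distance in the F1-F2 plane between diphthong onset and offset."""
    delta_f1 = offset_f1_f2[0] - onset_f1_f2[0]
    delta_f2 = offset_f1_f2[1] - onset_f1_f2[1]
    return math.hypot(delta_f1, delta_f2)


# Hypothetical formant values (Hz) for the onset and offset of /ai/.
print(vowel_contrast_hz((800.0, 1200.0), (500.0, 1600.0)))
```

A larger value means the diphthong travels further through the vowel space, which is the sense in which the abstract speaks of increased acoustic vowel contrast.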
Affiliation(s)
- Antje S. Mefferd
  - Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN
- Mary S. Dietrich
  - Department of Biostatistics and School of Nursing, Vanderbilt University, Nashville, TN
13
Vurma A. Amplitude Effects of Vocal Tract Resonance Adjustments When Singing Louder. J Voice 2020; 36:292.e11-292.e22. [PMID: 32624371 DOI: 10.1016/j.jvoice.2020.05.020] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/25/2020] [Revised: 05/20/2020] [Accepted: 05/26/2020] [Indexed: 10/23/2022]
Abstract
In the literature on vocal pedagogy we may find suggestions to increase the mouth opening when singing louder. It is known that sopranos tend to sing loud high notes with a wider mouth opening, which raises the frequency of the first resonance of the vocal tract (fR1) to tune it close to the fundamental. Our experiment with classically trained male singers revealed that they also tended to raise the fR1 with the dynamics at pitches where the formant tuning does not seem relevant. The analysis by synthesis showed that such behaviour may contribute to the strengthening of the singer's formant by several dB and to a rise in the spectral centre of gravity. The contribution of the fR1 raising to the overall sound level was less consistent. Changing the extent of the mouth opening with the dynamics may create several simultaneous semantic cues that signal how prominent the produced sound is and how great the physical effort by the singer is. The diminishing of the mouth opening when singing piano may also be important, as it helps singers to produce a quieter sound by increasing the distance between the fR1 and higher resonances, which lowers the transfer function of the vocal tract at the relevant spectral regions.
Affiliation(s)
- Allan Vurma
  - Estonian Academy of Music and Theatre, Tatari 13, Tallinn 10116, Estonia
14
Rubin AD, Codino J, Costeloe A, Johns MM, Collum A, Bottalico P. The Effect of Unilateral Hearing Protection on Vocal Intensity With Varying Degrees of Background Noise. J Voice 2020; 35:886-891. [PMID: 32362577 DOI: 10.1016/j.jvoice.2020.03.019] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/02/2020] [Revised: 03/19/2020] [Accepted: 03/23/2020] [Indexed: 11/28/2022]
Abstract
INTRODUCTION The Lombard effect (LE) is a phenomenon in which speakers adjust their vocal production by raising the volume in noisy environments. As a result, the LE can create problems of vocal strain, fatigue and potential injury. OBJECTIVES This study aims to examine the difference in vocal intensity output in subjects wearing unilateral hearing protection versus no hearing protection in the presence of background noise. METHODS Each subject was seated inside a sound booth wearing a head-mounted microphone. Subjects were asked to read an excerpt from "The Rainbow Passage" while various levels of background noise were played: 50, 60, 70, and 80 dBA (Multitalker Babble). Each noise level was played while the subject was with and without unilateral ear protection (Optime 98 Earmuff [3M]) in random order. The earmuff has a noise reduction rating of 25 dB. After each reading of the text, subjects were asked to rate communication disturbance, vocal clarity, and discomfort during talking using a 10 cm visual analogue scale. RESULTS The LE is reduced from 0.38 dB/dB to 0.29 dB/dB with unilateral ear occlusion. However, self-perception of disturbance, clarity and comfort were not affected by unilateral occlusion, only by noise level. CONCLUSIONS Unilateral hearing protection reduces the LE and may protect against phonotrauma when speaking in an environment with loud background noise.
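The Lombard effect is reported above as a slope in dB/dB, that is, the rise in vocal level per dB of added background noise. One common way to obtain such a slope is an ordinary least-squares line through voice level against noise level. The sketch below assumes that approach; the abstract does not state how the slope was estimated, the function name is hypothetical, and the voice levels are invented so that the fit reproduces the 0.38 dB/dB figure. Only the four noise levels come from the abstract.

```python
def least_squares_slope(x_values, y_values):
    """Slope of the ordinary least-squares line of y on x."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    return sxy / sxx


# Noise levels from the abstract (dBA); voice levels are hypothetical.
noise_levels_dba = [50.0, 60.0, 70.0, 80.0]
voice_levels_db = [62.0, 65.8, 69.6, 73.4]
print(least_squares_slope(noise_levels_dba, voice_levels_db))
```

A slope below 1 dB/dB means speakers only partly compensate for the added noise, and a smaller slope under unilateral occlusion corresponds to the reduced Lombard effect the abstract describes.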
Affiliation(s)
- Adam D Rubin
- Lakeshore Professional Voice Center, Lakeshore Ear, Nose, and Throat Center, St. Clair Shores, Michigan.
| | - Juliana Codino
- Lakeshore Professional Voice Center, Lakeshore Ear, Nose, and Throat Center, St. Clair Shores, Michigan
| | - Anya Costeloe
- Ascension St. John Macomb-Oakland Hospital, Warren, Michigan
| | - Michael M Johns
- USC Caruso Department of Otolaryngology Head and Neck Surgery, Los Angeles, CA
| | - Austin Collum
- Lakeshore Professional Voice Center, Lakeshore Ear, Nose, and Throat Center, St. Clair Shores, Michigan
| | - Pasquale Bottalico
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign, Illinois.
| |
|
15
|
Meilán JJG, Martínez-Sánchez F, Martínez-Nicolás I, Llorente TE, Carro J. Changes in the Rhythm of Speech Difference between People with Nondegenerative Mild Cognitive Impairment and with Preclinical Dementia. Behav Neurol 2020; 2020:4683573. [PMID: 32351632 PMCID: PMC7178534 DOI: 10.1155/2020/4683573] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 12/05/2019] [Revised: 03/25/2020] [Accepted: 03/26/2020] [Indexed: 11/17/2022]
Abstract
This study explores several speech parameters related to mild cognitive impairment, as well as those that might be flagging the presence of an underlying neurodegenerative process. Speech is an excellent biomarker because it is not invasive and, what is more, its analysis is rapid and economical. Our aim has been to ascertain whether the typical speech patterns of people with Alzheimer's disease are also present during the disorder's preclinical stages. To do so, we shall be using a task that involves reading out aloud. This is followed by an analysis of the recordings, looking for the possible parameters differentiating between those older people with MCI and a high probability of developing dementia and those with MCI that will not do so. We found that the disease's most differentiating parameters prior to its onset involve changes in speech duration and an alteration in rhythm rate and intensity. These parameters seem to be related to the first difficulties in lexical access among older people with AD.
Affiliation(s)
- Juan J. G. Meilán
- Faculty of Psychology, University of Salamanca, Salamanca, Spain
- Institute of Neurosciences of Castile and Leon, Salamanca, Spain
| | | | - Israel Martínez-Nicolás
- Faculty of Psychology, University of Salamanca, Salamanca, Spain
- Institute of Neurosciences of Castile and Leon, Salamanca, Spain
| | - Thide E. Llorente
- Faculty of Psychology, University of Salamanca, Salamanca, Spain
- Institute of Neurosciences of Castile and Leon, Salamanca, Spain
| | - Juan Carro
- Faculty of Psychology, University of Salamanca, Salamanca, Spain
- Institute of Neurosciences of Castile and Leon, Salamanca, Spain
| |
|
16
|
Baghel S, Prasanna SRM, Guha P. Exploration of excitation source information for shouted and normal speech classification. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:1250. [PMID: 32113325 DOI: 10.1121/10.0000757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2019] [Accepted: 01/31/2020] [Indexed: 06/10/2023]
Abstract
Discrimination between shouted and normal speech is an essential prerequisite for many speech processing applications. Existing works have established that excitation source information plays a significant role in shouted speech production. In speech processing literature, various features have been proposed to model different aspects of the excitation source. The principal contribution of this work is to explore three such features, Discrete Cosine Transform of Integrated Linear Prediction Residual (DCT-ILPR), Mel-Power Difference of Spectrum in Sub-bands (MPDSS), and Residual Mel-Frequency Cepstral Coefficient (RMFCC), for shouted and normal speech classification. The DCT-ILPR feature represents the shape of the glottal cycle, MPDSS estimates the periodicity of the excitation source spectrum, and RMFCC characterizes smoothed spectral information of the excitation source. The authors have also contributed a dataset containing shouted and normal speech. This work is evaluated on three datasets and benchmarked against three baseline methods. Deep neural networks are used to study the classification performance of individual features and their combinations. The generalization performance of features (and combinations) is also investigated. Fusion of excitation source features with Mel-Frequency Cepstral Coefficients (MFCC) provides the best performance compared to other combinations. Noise analysis shows that adding excitation features with MFCC+ΔΔ provides a more robust classification system.
Affiliation(s)
- Shikha Baghel
- Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam 781039, India
| | - S R Mahadeva Prasanna
- Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam 781039, India
| | - Prithwijit Guha
- Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam 781039, India
| |
|
17
|
Mefferd AS, Dietrich MS. Tongue- and Jaw-Specific Articulatory Underpinnings of Reduced and Enhanced Acoustic Vowel Contrast in Talkers With Parkinson's Disease. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2019; 62:2118-2132. [PMID: 31306611 PMCID: PMC6808361 DOI: 10.1044/2019_jslhr-s-msc18-18-0192] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 05/16/2023]
Abstract
Purpose This study sought to identify the articulator-specific mechanisms that underlie reduced and enhanced acoustic vowel contrast in talkers with dysarthria due to Parkinson's disease (PD). Method Seventeen talkers with mild-moderate dysarthria due to PD and 17 controls completed a sentence repetition task using typical, slow, loud, and clear speech. Tongue and jaw articulatory movements were recorded using 3D electromagnetic articulography. Independent tongue displacements, jaw displacements, and acoustic vowel contrast were calculated for the diphthong /aɪ/ embedded in the word kite. Results During typical speech, independent tongue displacement, but not jaw displacement, contributed significantly to the intertalker variance in acoustic vowel contrast. Loudness-related acoustic vowel contrast gains were predominantly jaw driven in controls but driven by the tongue and jaw in talkers with PD. Further, in both groups, clarity-related acoustic vowel contrast gains were predominantly jaw driven. Finally, in both groups, rate-related acoustic vowel contrast gains were predominantly tongue driven; however, the jaw also contributed. These jaw contributions were greater in the PD group than in the control group. Conclusions Findings suggest that a tongue-specific articulatory impairment underlies acoustic vowel contrast deterioration in talkers with PD, at least during the early stages of speech decline. Findings further suggest that slow speech engages the impaired tongue more than loud and clear speech in talkers with PD. However, slow speech was also associated with an abnormally strong jaw response in these talkers, which suggests that a compensatory articulatory behavior may also be elicited.
Affiliation(s)
- Antje S. Mefferd
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN
| | - Mary S. Dietrich
- Department of Biostatistics and School of Nursing, Vanderbilt University, Nashville, TN
| |
|
18
|
Koenig LL, Fuchs S. Vowel Formants in Normal and Loud Speech. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2019; 62:1278-1295. [PMID: 31084509 DOI: 10.1044/2018_jslhr-s-18-0043] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/09/2023]
Abstract
Purpose This study evaluated how 1st and 2nd vowel formant frequencies (F1, F2) differ between normal and loud speech in multiple speaking tasks to assess claims that loudness leads to exaggerated vowel articulation. Method Eleven healthy German-speaking women produced normal and loud speech in 3 tasks that varied in the degree of spontaneity: reading sentences that contained isolated /i: a: u:/, responding to questions that included target words with controlled consonantal contexts but varying vowel qualities, and a recipe recall task. Loudness variation was elicited naturalistically by changing interlocutor distance. First and 2nd formant frequencies and average sound pressure level were obtained from the stressed vowels in the target words, and vowel space area was calculated from /i: a: u:/. Results Comparisons across many vowels indicated that high, tense vowels showed limited formant variation as a function of loudness. Analysis of /i: a: u:/ across speech tasks revealed vowel space reduction in the recipe retell task compared to the other 2. Loudness changes for F1 were consistent in direction but variable in extent, with few significant results for high tense vowels. Results for F2 were quite varied and frequently not significant. Speakers differed in how loudness and task affected formant values. Finally, correlations between sound pressure level and F1 were generally positive but varied in magnitude across vowels, with the high tense vowels showing very flat slopes. Discussion These data indicate that naturalistically elicited loud speech in typical speakers does not always lead to changes in vowel formant frequencies and call into question the notion that increasing loudness is necessarily an automatic method of expanding the vowel space. Supplemental Material https://doi.org/10.23641/asha.8061740.
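The abstract states that vowel space area was calculated from /i: a: u:/. One common way to do this is to treat the three vowels as corners of a triangle in the F1-F2 plane and apply the shoelace formula. The sketch below assumes that approach; the function name `triangular_vsa` and the formant values are hypothetical and are not taken from the study.

```python
def triangular_vsa(i, a, u):
    """Area (Hz^2) of the triangle spanned by three vowels in the F1-F2 plane.

    Each argument is an (F1, F2) pair in Hz. Uses the shoelace formula.
    Illustrative helper; the paper's exact procedure is not given in the abstract.
    """
    (f1_i, f2_i), (f1_a, f2_a), (f1_u, f2_u) = i, a, u
    signed_twice_area = (
        f1_i * (f2_a - f2_u)
        + f1_a * (f2_u - f2_i)
        + f1_u * (f2_i - f2_a)
    )
    return abs(signed_twice_area) / 2.0


if __name__ == "__main__":
    # Made-up (F1, F2) values in Hz, for illustration only.
    normal = triangular_vsa(i=(300, 2300), a=(800, 1300), u=(350, 800))
    print(f"Triangular vowel space area: {normal:.0f} Hz^2")
```

Comparing this area between normal and loud productions of the same speaker is one way to check whether loudness expands the vowel space, which is the claim the study calls into question.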
Affiliation(s)
- Laura L Koenig
- Adelphi University, Garden City, NY
- Haskins Laboratories, New Haven, CT
- Leibniz-Centre General Linguistics (ZAS), Berlin, Germany
| | - Susanne Fuchs
- Leibniz-Centre General Linguistics (ZAS), Berlin, Germany
| |
|
19
|
Garnier M, Ménard L, Alexandre B. Hyper-articulation in Lombard speech: An active communicative strategy to enhance visible speech cues? THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 144:1059. [PMID: 30180713 DOI: 10.1121/1.5051321] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 08/22/2017] [Accepted: 08/02/2018] [Indexed: 06/08/2023]
Abstract
This study investigates the hypothesis that speakers make active use of the visual modality in production to improve their speech intelligibility in noisy conditions. Six native speakers of Canadian French produced speech in quiet conditions and in 85 dB of babble noise, in three situations: interacting face-to-face with the experimenter (AV), using the auditory modality only (AO), or reading aloud (NI, no interaction). The audio signal was recorded with the three-dimensional movements of their lips and tongue, using electromagnetic articulography. All the speakers reacted similarly to the presence vs absence of communicative interaction, showing significant speech modifications with noise exposure in both interactive and non-interactive conditions, not only for parameters directly related to voice intensity or for lip movements (very visible) but also for tongue movements (less visible); greater adaptation was observed in interactive conditions, though. However, speakers reacted differently to the availability or unavailability of visual information: only four speakers enhanced their visible articulatory movements more in the AV condition. These results support the idea that the Lombard effect is at least partly a listener-oriented adaptation. However, to clarify their speech in noisy conditions, only some speakers appear to make active use of the visual modality.
Affiliation(s)
- Maëva Garnier
- Centre National de la Recherche Scientifique, Laboratoire Grenoble Images Parole Signal Automatique, 11 rue des Mathématiques, Grenoble Campus, Boîte Postale 46, F-38402 Saint Martin d'Hères Cedex, France
| | - Lucie Ménard
- Département de Linguistique, Laboratoire de Phonétique, Center for Research on Brain, Language, and Music, Université du Québec à Montréal, 320, Ste-Catherine Est, Montréal, Quebec H2X 1L7, Canada
| | - Boris Alexandre
- Centre National de la Recherche Scientifique, Laboratoire Grenoble Images Parole Signal Automatique, 11 rue des Mathématiques, Grenoble Campus, Boîte Postale 46, F-38402 Saint Martin d'Hères Cedex, France
| |
|
20
|
Whitfield JA, Dromey C, Palmer P. Examining Acoustic and Kinematic Measures of Articulatory Working Space: Effects of Speech Intensity. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2018; 61:1104-1117. [PMID: 29710247 DOI: 10.1044/2018_jslhr-s-17-0388] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/12/2017] [Accepted: 01/24/2018] [Indexed: 06/08/2023]
Abstract
PURPOSE The purpose of this study was to examine the effect of speech intensity on acoustic and kinematic vowel space measures and conduct a preliminary examination of the relationship between kinematic and acoustic vowel space metrics calculated from continuously sampled lingual marker and formant traces. METHOD Young adult speakers produced 3 repetitions of 2 different sentences at 3 different loudness levels. Lingual kinematic and acoustic signals were collected and analyzed. Acoustic and kinematic variants of several vowel space metrics were calculated from the formant frequencies and the position of 2 lingual markers. Traditional metrics included triangular vowel space area and the vowel articulation index. Acoustic and kinematic variants of sentence-level metrics based on the articulatory-acoustic vowel space and the vowel space hull area were also calculated. RESULTS Both acoustic and kinematic variants of the sentence-level metrics significantly increased with an increase in loudness, whereas no statistically significant differences in traditional vowel-point metrics were observed for either the kinematic or acoustic variants across the 3 loudness conditions. In addition, moderate-to-strong relationships between the acoustic and kinematic variants of the sentence-level vowel space metrics were observed for the majority of participants. CONCLUSIONS These data suggest that both kinematic and acoustic vowel space metrics that reflect the dynamic contributions of both consonant and vowel segments are sensitive to within-speaker changes in articulation associated with manipulations of speech intensity.
Affiliation(s)
- Jason A Whitfield
- Department of Communication Sciences and Disorders, Bowling Green State University, OH
| | - Christopher Dromey
- Department of Communication Disorders, Brigham Young University, Provo, UT
| | - Panika Palmer
- Department of Communication Disorders, Brigham Young University, Provo, UT
| |
|
21
|
The impact of perilaryngeal vibration on the self-perception of loudness and the Lombard effect. Exp Brain Res 2018; 236:1713-1723. [PMID: 29623381 DOI: 10.1007/s00221-018-5248-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 09/25/2017] [Accepted: 03/29/2018] [Indexed: 10/17/2022]
Abstract
The role of somatosensory feedback in speech and the perception of loudness was assessed in adults without speech or hearing disorders. Participants completed two tasks: loudness magnitude estimation of a short vowel and oral reading of a standard passage. Both tasks were carried out in each of three conditions: no-masking, auditory masking alone, and mixed auditory masking plus vibration of the perilaryngeal area. A Lombard effect was elicited in both masking conditions: speakers unconsciously increased vocal intensity. Perilaryngeal vibration further increased vocal intensity above what was observed for auditory masking alone. Both masking conditions affected fundamental frequency and the first formant frequency as well, but only vibration was associated with a significant change in the second formant frequency. An additional analysis of pure-tone thresholds found no difference in auditory thresholds between masking conditions. Taken together, these findings indicate that perilaryngeal vibration effectively masked somatosensory feedback, resulting in an enhanced Lombard effect (increased vocal intensity) that did not alter speakers' self-perception of loudness. This implies that the Lombard effect results from a general sensorimotor process, rather than from a specific audio-vocal mechanism, and that the conscious self-monitoring of speech intensity is not directly based on either auditory or somatosensory feedback.
|
22
|
Mefferd AS. Tongue- and Jaw-Specific Contributions to Acoustic Vowel Contrast Changes in the Diphthong /ai/ in Response to Slow, Loud, and Clear Speech. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2017; 60:3144-3158. [PMID: 29067400 PMCID: PMC5945076 DOI: 10.1044/2017_jslhr-s-17-0114] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 03/31/2017] [Accepted: 06/15/2017] [Indexed: 05/13/2023]
Abstract
PURPOSE This study sought to determine decoupled tongue and jaw displacement changes and their specific contributions to acoustic vowel contrast changes during slow, loud, and clear speech. METHOD Twenty typical talkers repeated "see a kite again" 5 times in 4 speech conditions (typical, slow, loud, clear). Speech kinematics were recorded using 3-dimensional electromagnetic articulography. Tongue composite displacement, decoupled tongue displacement, and jaw displacement during /ai/, as well as the distance between /a/ and /i/ in the F1-F2 vowel space, were examined during the diphthong /ai/ in "kite." RESULTS Displacements significantly increased during all 3 speech modifications. However, jaw displacements increased significantly more during clear speech than during loud and slow speech, whereas decoupled tongue displacements increased significantly more during slow speech than during clear and loud speech. In addition, decoupled tongue displacements increased significantly more during clear speech than during loud speech. Increases in acoustic vowel contrast tended to be larger during slow speech than during clear speech and were predominantly tongue-driven, whereas those during clear speech were fairly equally accounted for by changes in decoupled tongue and jaw displacements. Increases in acoustic vowel contrast during loud speech were smallest and were predominantly tongue-driven, particularly in men. CONCLUSIONS Findings suggest that task-specific patterns of decoupled tongue and jaw displacement change and task-specific patterns of decoupled tongue and jaw contributions to vowel acoustic change across these speech modifications. Clinical implications are discussed.
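The acoustic measure in this abstract is the distance between /a/ and /i/ in the F1-F2 vowel space. A minimal sketch of that measure as a plain Euclidean distance in Hz is given below. The function name `vowel_contrast`, the use of unscaled Hz on both axes, and the formant values are assumptions made for illustration; the abstract does not say how the axes were scaled or where in the diphthong the formants were sampled.

```python
import math


def vowel_contrast(onset, offset):
    """Euclidean distance (Hz) between two (F1, F2) points.

    `onset` and `offset` stand for the /a/ and /i/ portions of the diphthong.
    Hypothetical helper; unscaled Hz axes are an assumption.
    """
    f1_on, f2_on = onset
    f1_off, f2_off = offset
    return math.hypot(f1_off - f1_on, f2_off - f2_on)


if __name__ == "__main__":
    # Made-up formant values in Hz, for illustration only.
    typical = vowel_contrast(onset=(700, 1200), offset=(400, 1600))
    slow = vowel_contrast(onset=(750, 1150), offset=(350, 1750))
    print(f"typical: {typical:.0f} Hz, slow: {slow:.0f} Hz")
```

A larger distance in a modified speech condition than in typical speech would count as a gain in acoustic vowel contrast in the sense used by the abstract.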
|
23
|
Tang P, Xu Rattanasone N, Yuen I, Demuth K. Acoustic realization of Mandarin neutral tone and tone sandhi in infant-directed speech and Lombard speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 142:2823. [PMID: 29195426 PMCID: PMC5681351 DOI: 10.1121/1.5008372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2017] [Revised: 09/29/2017] [Accepted: 10/04/2017] [Indexed: 06/07/2023]
Abstract
Mandarin lexical tones are modified in both infant-directed speech (IDS) and Lombard speech, resulting in tone hyperarticulation. However, it is unclear if these registers also alter contextual tones (neutral tone and tone sandhi) and if such phonetic modification might affect acquisition of these tones. This study therefore examined how neutral tone and tone sandhi are realized in IDS, and how their acoustic manifestations compare with those in Lombard speech, where the communicative needs of listeners differ. Neutral tone and tone sandhi productions were elicited from 15 Mandarin-speaking mothers during (1) interactions with their 12-month-old infants (IDS), (2) in conversation with a Mandarin-speaking adult in a noisy environment (Lombard speech), and (3) in conversation with a Mandarin-speaking adult in a quiet environment (adult-directed speech). The results showed that, although both contextual tones were modified in IDS and Lombard speech, their key tone features were maintained. In addition, IDS and Lombard speech modified these tones differently: IDS increased pitch height and modified pitch contour, while Lombard speech increased pitch height only. The realization of neutral tone and tone sandhi across registers is discussed with reference to listeners' different communicative needs.
Affiliation(s)
- Ping Tang
- Department of Linguistics, ARC Centre of Excellence in Cognition and its Disorders, Macquarie University, 16 University Avenue, Australian Hearing Hub, Balaclava Road, North Ryde, New South Wales 2109, Sydney, Australia
| | - Nan Xu Rattanasone
- Department of Linguistics, ARC Centre of Excellence in Cognition and its Disorders, Macquarie University, 16 University Avenue, Australian Hearing Hub, Balaclava Road, North Ryde, New South Wales 2109, Sydney, Australia
| | - Ivan Yuen
- Department of Linguistics, ARC Centre of Excellence in Cognition and its Disorders, Macquarie University, 16 University Avenue, Australian Hearing Hub, Balaclava Road, North Ryde, New South Wales 2109, Sydney, Australia
| | - Katherine Demuth
- Department of Linguistics, ARC Centre of Excellence in Cognition and its Disorders, Macquarie University, 16 University Avenue, Australian Hearing Hub, Balaclava Road, North Ryde, New South Wales 2109, Sydney, Australia
| |
|
24
|
Tang P, Xu Rattanasone N, Yuen I, Demuth K. Phonetic enhancement of Mandarin vowels and tones: Infant-directed speech and Lombard speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 142:493. [PMID: 28863611 DOI: 10.1121/1.4995998] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 06/07/2023]
Abstract
Speech units are reported to be hyperarticulated in both infant-directed speech (IDS) and Lombard speech. Since these two registers have typically been studied separately, it is unclear if the same speech units are hyperarticulated in the same manner between these registers. The aim of the present study is to compare the effect of register on vowel and tone modification in the tonal language Mandarin Chinese. Vowel and tone productions were produced by 15 Mandarin-speaking mothers during interactions with their 12-month-old infants during a play session (IDS), in conversation with a Mandarin-speaking adult in a 70 dBA eight-talker babble noise environment (Lombard speech), and in a quiet environment (adult-directed speech). Vowel space expansion was observed in IDS and Lombard speech, however, the patterns of vowel-shift were different between the two registers. IDS displayed tone space expansion only in the utterance-final position, whereas there was no tone space expansion in Lombard speech. The overall pitch increased for all tones in both registers. The tone-bearing vowel duration also increased in both registers, but only in utterance-final position. The difference in speech modifications between these two registers is discussed in light of speakers' different communicative needs.
Affiliation(s)
- Ping Tang
- Department of Linguistics, ARC Centre of Excellence in Cognition and its Disorders, Macquarie University, Sydney, 16 University Avenue, Australian Hearing Hub, Balaclava Road, North Ryde, New South Wales 2109 Australia
| | - Nan Xu Rattanasone
- Department of Linguistics, ARC Centre of Excellence in Cognition and its Disorders, Macquarie University, Sydney, 16 University Avenue, Australian Hearing Hub, Balaclava Road, North Ryde, New South Wales 2109 Australia
| | - Ivan Yuen
- Department of Linguistics, ARC Centre of Excellence in Cognition and its Disorders, Macquarie University, Sydney, 16 University Avenue, Australian Hearing Hub, Balaclava Road, North Ryde, New South Wales 2109 Australia
| | - Katherine Demuth
- Department of Linguistics, ARC Centre of Excellence in Cognition and its Disorders, Macquarie University, Sydney, 16 University Avenue, Australian Hearing Hub, Balaclava Road, North Ryde, New South Wales 2109 Australia
| |
|
25
|
Berry J, Kolb A, Schroeder J, Johnson MT. Jaw Rotation in Dysarthria Measured With a Single Electromagnetic Articulography Sensor. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2017; 26:596-610. [PMID: 28654942 DOI: 10.1044/2017_ajslp-16-0104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2016] [Accepted: 12/22/2016] [Indexed: 06/07/2023]
Abstract
PURPOSE This study evaluated a novel method for characterizing jaw rotation using orientation data from a single electromagnetic articulography sensor. This method was optimized for clinical application, and a preliminary examination of clinical feasibility and value was undertaken. METHOD The computational adequacy of the single-sensor orientation method was evaluated through comparisons of jaw-rotation histories calculated from dual-sensor positional data for 16 typical talkers. The clinical feasibility and potential value of single-sensor jaw rotation were assessed through comparisons of 7 talkers with dysarthria and 19 typical talkers in connected speech. RESULTS The single-sensor orientation method allowed faster and safer participant preparation, required lower data-acquisition costs, and generated less high-frequency artifact than the dual-sensor positional approach. All talkers with dysarthria, regardless of severity, demonstrated jaw-rotation histories with more numerous changes in movement direction and reduced smoothness compared with typical talkers. CONCLUSIONS Results suggest that the single-sensor orientation method for calculating jaw rotation during speech is clinically feasible. Given the preliminary nature of this study and the small participant pool, the clinical value of such measures remains an open question. Further work must address the potential confound of reduced speaking rate on movement smoothness.
Affiliation(s)
- Jeff Berry
- Department of Speech Pathology & Audiology, Marquette University, Milwaukee, WI
| | - Andrew Kolb
- Department of Electrical & Computer Engineering, Marquette University, Milwaukee, WI
| | - James Schroeder
- Department of Electrical & Computer Engineering, Marquette University, Milwaukee, WI
| | - Michael T Johnson
- Department of Electrical & Computer Engineering, Marquette University, Milwaukee, WI
| |
|
26
|
Benuš Š, Šimko J. Stability and Variability in Slovak Prosodic Boundaries. PHONETICA 2017; 73:163-193. [PMID: 28208129 DOI: 10.1159/000446350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2015] [Accepted: 04/15/2016] [Indexed: 06/06/2023]
Abstract
BACKGROUND/AIM Encoding intended meanings in the type and strength of prosodic boundaries and strategies for communicating these meanings in ambient noise use similar prosodic cues. We analyze how increasing the level of ambient noise affects the realization of Slovak prosodic boundaries. METHODS Five native speakers of Slovak read sentences, manipulating the boundary type (weak, rise, fall) and the location of pre-boundary pitch accent. Ambient noise of several levels was administered via headphones. Acoustic and articulatory data (electromagnetometry) were collected. RESULTS Under normal condition, boundary strength is signaled with longer pre-boundary rhymes, more frequent pauses, greater cross-boundary f0 resets and jaw displacement. The strength of falls is realized in cross-boundary features (pauses, f0 reset), and rises in pre-boundary features (rhyme duration, f0 range). Pitch-accented rhymes are strengthened in all features, but f0 range. In noise, the increase in boundary strength is weak, and falls strengthen more than rises. F0 targets for falls and rises are adjusted in addition to noise-induced global f0 scaling and lengthening. CONCLUSION Hyper-articulation of prosodic boundaries in ambient noise is not robust and uniform; rather, durational, f0 and jaw displacement features co-create complex prosodic patterns in a complementary and synergetic manner based on affordances in normal speech.
Affiliation(s)
- Štefan Benuš
- Constantine the Philosopher University, Nitra, Slovakia
| | | |
|
27
|
Dromey C, Scott S. The effects of noise on speech movements in young, middle-aged, and older adults. SPEECH, LANGUAGE AND HEARING 2016. [DOI: 10.1080/2050571x.2015.1133757] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/22/2022]
|
28
|
Šimko J, Beňuš Š, Vainio M. Hyperarticulation in Lombard speech: Global coordination of the jaw, lips and the tongue. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2016; 139:151-62. [PMID: 26827013 DOI: 10.1121/1.4939495] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 05/15/2023]
Abstract
Over the last century, researchers have collected a considerable amount of data reflecting the properties of Lombard speech, i.e., speech in a noisy environment. The documented phenomena predominately report effects on the speech signal produced in ambient noise. In comparison, relatively little is known about the underlying articulatory patterns of Lombard speech, in particular for lingual articulation. Here the authors present an analysis of articulatory recordings of speech material in babble noise of different intensity levels and in hypoarticulated speech and report quantitative differences in relative expansion of movement of different articulatory subsystems (the jaw, the lips and the tongue) as well as in relative expansion of utterance duration. The trajectory modifications for one articulator can be relatively reliably predicted by those for another one, but subsystems differ in a degree of continuity in trajectory expansion elicited across different noise levels. Regression analysis of articulatory modifications against durational expansion shows further qualitative differences between the subsystems, namely, the jaw and the tongue. The findings are discussed in terms of possible influences of a combination of prosodic, segmental, and physiological factors. In addition, the Lombard effect is put forward as a viable methodology for eliciting global articulatory variation in a controlled manner.
Affiliation(s)
- Juraj Šimko
- Institute of Behavioural Sciences, University of Helsinki, Siltavuorenpenger 3A - PL 9, 00014 Helsinki, Finland
- Štefan Beňuš
- Faculty of Arts, Constantine the Philosopher University, Štefánikova 67, 949 74 Nitra, Slovakia
- Martti Vainio
- Institute of Behavioural Sciences, University of Helsinki, Siltavuorenpenger 3A - PL 9, 00014 Helsinki, Finland
|
29
|
Kent RD. Nonspeech Oral Movements and Oral Motor Disorders: A Narrative Review. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2015; 24:763-89. [PMID: 26126128 PMCID: PMC4698470 DOI: 10.1044/2015_ajslp-14-0179] [Citation(s) in RCA: 64] [Impact Index Per Article: 7.1] [Received: 10/10/2014] [Revised: 04/02/2015] [Accepted: 06/13/2015] [Indexed: 05/25/2023]
Abstract
PURPOSE Speech and other oral functions such as swallowing have been compared and contrasted with oral behaviors variously labeled quasispeech, paraspeech, speechlike, and nonspeech, all of which overlap to some degree in neural control, muscles deployed, and movements performed. Efforts to understand the relationships among these behaviors are hindered by the lack of explicit and widely accepted definitions. This review article offers definitions and taxonomies for nonspeech oral movements and for diverse speaking tasks, both overt and covert. METHOD Review of the literature included searches of Medline, Google Scholar, HighWire Press, and various online sources. Search terms pertained to speech, quasispeech, paraspeech, speechlike, and nonspeech oral movements. Searches also were carried out for associated terms in oral biology, craniofacial physiology, and motor control. RESULTS AND CONCLUSIONS Nonspeech movements have a broad spectrum of clinical applications, including developmental speech and language disorders, motor speech disorders, feeding and swallowing difficulties, obstructive sleep apnea syndrome, trismus, and tardive stereotypies. The role and benefit of nonspeech oral movements are controversial in many oral motor disorders. It is argued that the clinical value of these movements can be elucidated through careful definitions and task descriptions such as those proposed in this review article.
Affiliation(s)
- Ray D. Kent
- Waisman Center, University of Wisconsin–Madison
|
30
|
An Acoustic and Electroglottographic Study of the Aging Voice With and Without an Open Jaw Posture. J Voice 2015; 29:518.e1-11. [DOI: 10.1016/j.jvoice.2014.09.024] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 02/07/2014] [Accepted: 09/17/2014] [Indexed: 11/19/2022]
|
31
|
The effects of articulation on the perceived loudness of the projected voice. J Voice 2015; 29:390.e9-15. [PMID: 25770375 DOI: 10.1016/j.jvoice.2014.07.022] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 06/16/2014] [Accepted: 07/14/2014] [Indexed: 10/23/2022]
Abstract
Arthur Lessac developed a voice training approach that concentrated on three energies: structural action, tonal action, and consonant action. In Lessac-Madsen Resonant Voice Therapy (LMRVT), speech-language pathologists help patients achieve a resonant voice through structural posturing and awareness of tonal changes. However, LMRVT may not necessarily include the third component of Lessac's approach: consonant action. This study examines the effect that increased effort on consonant production has on the speaking voice, particularly regarding vocal loudness and projection. METHODS Audio samples were collected from eight actor participants who read a monologue using three distinct styles: normal articulation, poor articulation (elicited using a bite block), and overarticulation (elicited using a Lessac-based training intervention). Twenty graduate students of speech-language pathology listened to speech samples from the different conditions and made comparative judgments regarding articulation, loudness, and projection. RESULTS Group results showed a strong correlation between the articulatory condition and the level of perceived loudness and projection. That is, as precision of articulation increased, the ratings of perceived loudness and projection increased as well. CONCLUSIONS These findings indicate that articulation treatment may have a positive influence on the perception of vocal loudness and projection. This has implications for future directions in expanding voice therapy modalities.
|
32
|
Garnier M, Henrich N. Speaking in noise: How does the Lombard effect improve acoustic contrasts between speech and ambient noise? COMPUT SPEECH LANG 2014. [DOI: 10.1016/j.csl.2013.07.005] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Indexed: 10/26/2022]
|
33
|
The listening talker: A review of human and algorithmic context-induced modifications of speech. COMPUT SPEECH LANG 2014. [DOI: 10.1016/j.csl.2013.08.003] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Indexed: 11/19/2022]
|
34
|
Sapir S, Ramig LO, Fox CM. Intensive voice treatment in Parkinson’s disease: Lee Silverman Voice Treatment. Expert Rev Neurother 2014; 11:815-30. [DOI: 10.1586/ern.11.43] [Citation(s) in RCA: 74] [Impact Index Per Article: 7.4] [Indexed: 01/06/2023]
|
35
|
The value of the Acoustic Voice Quality Index as a measure of dysphonia severity in subjects speaking different languages. Eur Arch Otorhinolaryngol 2013; 271:1609-19. [DOI: 10.1007/s00405-013-2730-7] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 04/29/2013] [Accepted: 09/23/2013] [Indexed: 11/28/2022]
|
36
|
Pohjalainen J, Raitio T, Yrttiaho S, Alku P. Detection of shouted speech in noise: human and machine. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 133:2377-2389. [PMID: 23556603 DOI: 10.1121/1.4794394] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 06/02/2023]
Abstract
High vocal effort has characteristic acoustic effects on speech. This study focuses on the utilization of this information by human listeners and a machine-based detection system in the task of detecting shouted speech in the presence of noise. Both female and male speakers read Finnish sentences using normal and shouted voice in controlled conditions, with the sound pressure level recorded. The speech material was artificially corrupted by noise and supplemented with pure noise. The human performance level was statistically evaluated by a listening test, where the subjects labeled noisy samples according to whether shouting was heard or not. A Bayesian detection system was constructed and statistically evaluated. Its performance was compared against that of human listeners, substituting different spectrum analysis methods in the feature extraction stage. Using features capable of taking into account the spectral fine structure (i.e., the fundamental frequency and its harmonics), the machine reached the detection level of humans even in the noisiest conditions. In the listening test, male listeners detected shouted speech significantly better than female listeners, especially with speakers making a smaller vocal effort increase for shouting.
Affiliation(s)
- Jouni Pohjalainen
- Department of Signal Processing and Acoustics, Aalto University, P.O. Box 13000, FI-00076 AALTO, Espoo, Finland.
|
37
|
Hotchkin C, Parks S. The Lombard effect and other noise-induced vocal modifications: insight from mammalian communication systems. Biol Rev Camb Philos Soc 2013; 88:809-24. [PMID: 23442026 DOI: 10.1111/brv.12026] [Citation(s) in RCA: 67] [Impact Index Per Article: 6.1] [Received: 05/07/2012] [Revised: 01/20/2013] [Accepted: 01/25/2013] [Indexed: 01/07/2023]
Abstract
Humans and non-human mammals exhibit fundamentally similar vocal responses to increased noise, including increases in vocalization amplitude (the Lombard effect) and changes to spectral and temporal properties of vocalizations. Different research focuses have resulted in significant discrepancies in study methodologies and hypotheses among fields, leading to particular knowledge gaps and techniques specific to each field. This review compares and contrasts noise-induced vocal modifications observed from human and non-human mammals with reference to experimental design and the history of each field. Topics include the effects of communication motivation and subject-specific characteristics on the acoustic parameters of vocalizations, examination of evidence for a proposed biomechanical linkage between the Lombard effect and other spectral and temporal modifications, and effects of noise on self-communication signals (echolocation). Standardized terminology, cross-taxa tests of hypotheses, and open areas for future research in each field are recommended. Findings indicate that more research is needed to evaluate linkages among vocal modifications, context dependencies, and the finer details of the Lombard effect during natural communication. Studies of non-human mammals could benefit from applying the tightly controlled experimental designs developed in human research, while studies of human speech in noise should be expanded to include natural communicative contexts. The effects of experimental design and behavioural context on vocalizations should not be neglected as they may impact the magnitude and type of noise-induced vocal modifications.
Affiliation(s)
- Cara Hotchkin
- Ecology Intercollege Graduate Degree Program, The Pennsylvania State University, University Park, 16801, PA, U.S.A
|
38
|
Mahler LA, Ramig LO. Intensive treatment of dysarthria secondary to stroke. CLINICAL LINGUISTICS & PHONETICS 2012; 26:681-694. [PMID: 22774928 DOI: 10.3109/02699206.2012.696173] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 06/01/2023]
Abstract
This study investigated the impact of a well-defined behavioral dysarthria treatment on acoustic and perceptual measures of speech in four adults with dysarthria secondary to stroke. A single-subject A-B-A experimental design was used to measure the effects of the Lee Silverman Voice Treatment (LSVT® LOUD) on the speech of individual participants. Dependent measures included vocal sound pressure level, phonatory stability, vowel space area, and listener ratings of speech, voice and intelligibility. Statistically significant improvements (p < 0.05) in vocal dB SPL and phonatory stability as well as larger vowel space area were present for all participants. Listener ratings suggested improved voice quality and more natural speech post-treatment. Speech intelligibility scores improved for one of four participants. These data suggest that people with dysarthria secondary to stroke can respond positively to intensive speech treatments such as LSVT. Further studies are needed to investigate speech treatments specific to stroke.
|
39
|
Darling M, Huber JE. Changes to articulatory kinematics in response to loudness cues in individuals with Parkinson's disease. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2011; 54:1247-59. [PMID: 21386044 PMCID: PMC3433496 DOI: 10.1044/1092-4388(2011/10-0024)] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Indexed: 05/13/2023]
Abstract
PURPOSE Individuals with Parkinson's disease (PD) exhibit differences in displacement and velocity of the articulators as compared with older adults. The purpose of the current study was to examine effects of 3 loudness cues on articulatory movement patterns in individuals with PD. METHOD Nine individuals diagnosed with idiopathic PD and 9 age- and sex-matched healthy controls produced sentences in 4 conditions: (a) comfortable loudness, (b) targeting 10 dB above comfortable, (c) twice as loud as comfortable, and (d) in background noise. Lip and jaw kinematics and acoustic measurements were obtained. RESULTS Both groups significantly increased sound pressure level (SPL) in the loud conditions as compared with the comfortable condition. For the loud conditions, both groups had the highest SPL in the background noise and the 10 dB conditions, and the lowest SPL in the twice as loud condition. Control participants produced the largest opening displacement in the background noise condition and the smallest opening displacement in the twice as loud condition. Conversely, individuals with PD produced the largest opening displacement in the twice as loud condition and the smallest opening displacement in the background noise condition. CONCLUSIONS Control participants and individuals with PD responded to cues to increase loudness in different ways. Changes in SPL may explain differences in kinematics for the control participants, but they do not explain such differences for individuals with PD.
|
40
|
Lansford KL, Liss JM, Caviness JN, Utianski RL. A cognitive-perceptual approach to conceptualizing speech intelligibility deficits and remediation practice in hypokinetic dysarthria. PARKINSONS DISEASE 2011; 2011:150962. [PMID: 21918728 PMCID: PMC3171761 DOI: 10.4061/2011/150962] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 04/09/2011] [Revised: 06/14/2011] [Accepted: 07/13/2011] [Indexed: 11/20/2022]
Abstract
Hypokinetic dysarthria is a common manifestation of Parkinson's disease, which negatively influences quality of life. Behavioral techniques that aim to improve speech intelligibility constitute the bulk of intervention strategies for this population, as the dysarthria does not often respond vigorously to medical interventions. Although several case and group studies generally support the efficacy of behavioral treatment, much work remains to establish a rigorous evidence base. This absence of definitive research leaves both the speech-language pathologist and referring physician with the task of determining the feasibility and nature of therapy for intelligibility remediation in PD. The purpose of this paper is to introduce a novel framework for medical practitioners in which to conceptualize and justify potential targets for speech remediation. The most commonly targeted deficits (e.g., speaking rate and vocal loudness) can be supported by this approach, as well as underutilized and novel treatment targets that aim at the listener's perceptual skills.
Affiliation(s)
- Kaitlin L Lansford
- Motor Speech Disorders Laboratory, Department of Speech and Hearing Science, Arizona State University, P.O. Box 870102, Tempe, AZ 85287-0102, USA
|
41
|
Kim J, Sironic A, Davis C. Hearing Speech in Noise: Seeing a Loud Talker is Better. Perception 2011; 40:853-62. [DOI: 10.1068/p6941] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 10/17/2022]
Abstract
Seeing the talker improves the intelligibility of speech degraded by noise (a visual speech benefit). Given that talkers exaggerate spoken articulation in noise, this set of two experiments examined whether the visual speech benefit was greater for speech produced in noise than in quiet. We first examined the extent to which spoken articulation was exaggerated in noise by measuring the motion of face markers as four people uttered 10 sentences either in quiet or in babble-speech noise (these renditions were also filmed). The tracking results showed that articulated motion in speech produced in noise was greater than that produced in quiet and was more highly correlated with speech acoustics. Speech intelligibility was tested in a second experiment using a speech-perception-in-noise task under auditory-visual and auditory-only conditions. The results showed that the visual speech benefit was greater for speech recorded in noise than for speech recorded in quiet. Furthermore, the amount of articulatory movement was related to performance on the perception task, indicating that the enhanced gestures made when speaking in noise function to make speech more intelligible.
Affiliation(s)
- Amanda Sironic
- Department of Psychology, The University of Melbourne, Australia
|
42
|
Green JR, Nip ISB, Wilson EM, Mefferd AS, Yunusova Y. Lip movement exaggerations during infant-directed speech. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2010; 53:1529-42. [PMID: 20699342 PMCID: PMC3548446 DOI: 10.1044/1092-4388(2010/09-0005)] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Indexed: 05/12/2023]
Abstract
PURPOSE Although a growing body of literature has identified the positive effects of visual speech on speech and language learning, oral movements of infant-directed speech (IDS) have rarely been studied. This investigation used 3-dimensional motion capture technology to describe how mothers modify their lip movements when talking to their infants. METHOD Lip movements were recorded from 25 mothers as they spoke to their infants and other adults. Lip shapes were analyzed for differences across speaking conditions. The maximum fundamental frequency, duration, acoustic intensity, and first and second formant frequency of each vowel also were measured. RESULTS Lip movements were significantly larger during IDS than during adult-directed speech, although the exaggerations were vowel specific. All of the vowels produced during IDS were characterized by an elevated vocal pitch and a slowed speaking rate when compared with vowels produced during adult-directed speech. CONCLUSION The pattern of lip-shape exaggerations did not provide support for the hypothesis that mothers produce exemplar visual models of vowels during IDS. Future work is required to determine whether the observed increases in vertical lip aperture engender visual and acoustic enhancements that facilitate the early learning of speech.
Affiliation(s)
- Jordan R Green
- Department of Special Education and Communication Disorders, University of Nebraska-Lincoln, 318 Barkley Center, Lincoln, NE 68583, USA.
|
43
|
Mefferd AS, Green JR. Articulatory-to-acoustic relations in response to speaking rate and loudness manipulations. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2010; 53:1206-19. [PMID: 20699341 PMCID: PMC3548454 DOI: 10.1044/1092-4388(2010/09-0083)] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Indexed: 05/12/2023]
Abstract
PURPOSE In this investigation, the authors determined the strength of association between tongue kinematic and speech acoustics changes in response to speaking rate and loudness manipulations. Performance changes in the kinematic and acoustic domains were measured using two aspects of speech production presumably affecting speech clarity: phonetic specification and variability. METHOD Tongue movements for the vowels /ia/ were recorded in 10 healthy adults during habitual, fast, slow, and loud speech using three-dimensional electromagnetic articulography. To determine articulatory-to-acoustic relations for phonetic specification, the authors correlated changes in lingual displacement with changes in acoustic vowel distance. To determine articulatory-to-acoustic relations for phonetic variability, the authors correlated changes in lingual movement variability with changes in formant movement variability. RESULTS A significant positive linear association was found for kinematic and acoustic specification but not for kinematic and acoustic variability. Several significant speaking task effects were also observed. CONCLUSION Lingual displacement is a good predictor of acoustic vowel distance in healthy talkers. The weak association between kinematic and acoustic variability raises questions regarding the effects of articulatory variability on speech clarity and intelligibility, particularly in individuals with motor speech disorders.
|
44
|
Garnier M, Henrich N, Dubois D. Influence of sound immersion and communicative interaction on the Lombard effect. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2010; 53:588-608. [PMID: 20008681 DOI: 10.1044/1092-4388(2009/08-0138)] [Citation(s) in RCA: 93] [Impact Index Per Article: 6.6] [Indexed: 05/15/2023]
Abstract
PURPOSE To examine the influence of sound immersion techniques and speech production tasks on speech adaptation in noise. METHOD In Experiment 1, we compared the modification of speakers' perception and speech production in noise when noise is played into headphones (with and without additional self-monitoring feedback) or over loudspeakers. We also examined how this sound immersion effect depends on noise type (broadband or cocktail party) and level (from 62 to 86 dB SPL). In Experiment 2, we compared the modification of acoustic and lip articulatory parameters in noise when speakers interact or not with a speech partner. RESULTS Speech modifications in noise were greater when cocktail party noise was played in headphones than over loudspeakers. Such an effect was less noticeable in broadband noise. Adding self-monitoring feedback into headphones reduced this effect but did not completely compensate for it. Speech modifications in noise were greater in the interactive situation and concerned parameters that may not be related to voice intensity. CONCLUSIONS The results support the idea that the Lombard effect is both a communicative adaptation and an automatic regulation of vocal intensity. The influence of auditory and communicative factors has some methodological implications for the choice of appropriate paradigms to study the Lombard effect.
Affiliation(s)
- Maëva Garnier
- Institut Jean Le Rond d'Alembert, LAM, (UMR 7190: UPMC Univ Paris 06, Centre National de la Recherche Scientifique (CNRS), Ministère de la Culture), Paris, France
|
45
|
Wenke RJ, Cornwell P, Theodoros DG. Changes to articulation following LSVT® and traditional dysarthria therapy in non-progressive dysarthria. INTERNATIONAL JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2010; 12:203-220. [PMID: 20433339 DOI: 10.3109/17549500903568468] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Indexed: 05/29/2023]
Abstract
The present study aimed to evaluate the effects of the Lee Silverman Voice Treatment (LSVT®) on acoustic and perceptual measures of articulation in non-progressive dysarthria in comparison to traditional dysarthria therapy. The study involved 26 individuals with non-progressive dysarthria who were randomly allocated to receive either LSVT® or traditional dysarthria therapy (TRAD), both of which were administered for 16 hourly sessions over 4 weeks. Participants' speech samples were collected over a total of six testing sessions during three assessment phases: (1) prior to treatment, (2) immediately post-treatment, and (3) 6 months post-treatment (FU). Speech samples were analysed perceptually to determine articulatory precision and intelligibility as well as acoustically using vowel space (and vowel formant measures) and first moment differences. Results revealed short and long-term significant increases in vowel space area following LSVT®. Significantly increased intelligibility was also found at FU in the LSVT® group. No significant differences between groups for any variables were found. The study reveals that LSVT® may be a suitable treatment option for improving vowel articulation and subsequent intelligibility in some individuals with non-progressive dysarthria.
|
46
|
Mooshammer C. Acoustic and laryngographic measures of the laryngeal reflexes of linguistic prominence and vocal effort in German. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2010; 127:1047-58. [PMID: 20136226 PMCID: PMC2830266 DOI: 10.1121/1.3277160] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 05/16/2023]
Abstract
This study uses acoustic and physiological measures to compare laryngeal reflexes of global changes in vocal effort to the effects of modulating such aspects of linguistic prominence as sentence accent, induced by focus variation, and word stress. Seven speakers were recorded by using a laryngograph. The laryngographic pulses were preprocessed to normalize time and amplitude. The laryngographic pulse shape was quantified using open and skewness quotients and also by applying a functional version of the principal component analysis. Acoustic measures included the acoustic open quotient and spectral balance in the vowel /e/ during the test syllable. The open quotient and the laryngographic pulse shape indicated a significantly shorter open phase for loud speech than for soft speech. Similar results were found for lexical stress, suggesting that lexical stress and loud speech are produced with a similar voice source mechanism. Stressed syllables were distinguished from unstressed syllables by their open phase and pulse shape, even in the absence of sentence accent. Evidence for laryngeal involvement in signaling focus, independent of fundamental frequency changes, was not as consistent across speakers. Acoustic results on various spectral balance measures were generally much less consistent compared to results from laryngographic data.
Affiliation(s)
- Christine Mooshammer
- Haskins Laboratories, 300 George Street, Suite 900, New Haven, Connecticut 06511, USA.
|
47
|
Seshadri G, Yegnanarayana B. Perceived loudness of speech based on the characteristics of glottal excitation source. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2009; 126:2061-2071. [PMID: 19813815 DOI: 10.1121/1.3203668] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 05/28/2023]
Abstract
The impulse-like characteristic of glottal excitation in speech production is an important factor in the perception of loudness of speech signals. This characteristic is attributed to the abruptness of the closing phase in the glottal cycle. In this paper, an acoustic feature, called strength of excitation, is proposed to represent the impulse-like nature of excitation. The strength of excitation is derived from the linear prediction residual of speech signals, where the residual can be considered as an estimate of the source of excitation. Since the loudness of speech is perceived over one or more utterances of speech, it is hypothesized that the distribution of strength of excitation is indicative of the perceived loudness of speech. The distribution of strength of excitation is shown to distinguish between soft and loud utterances of speakers. The distribution can also help in discriminating between the loudness of two speakers. The loudness measure obtained using the distribution of the strength of excitation is in agreement with the subjective judgment of loudness of speech.
Affiliation(s)
- Guruprasad Seshadri
- Dept. of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai 600036, India.
|
48
|
Neel AT. Effects of loud and amplified speech on sentence and word intelligibility in Parkinson disease. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2009; 52:1021-1033. [PMID: 18978211 DOI: 10.1044/1092-4388(2008/08-0119)] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.7] [Indexed: 05/27/2023]
Abstract
PURPOSE In the two experiments in this study, the author examined the effects of increased vocal effort (loud speech) and amplification on sentence and word intelligibility in speakers with Parkinson disease (PD). METHOD Five talkers with PD produced sentences and words at habitual levels of effort and using loud speech techniques. Amplified sets of sentences and words were created by increasing the intensity of habitual stimuli to the level of loud stimuli. Listeners rated the intelligibility of the 3 sets of sentences on a 1-7 scale and transcribed the 3 sets of words. RESULTS Both loud speech and amplification significantly improved intelligibility for sentences and words. Loud speech resulted in greater intelligibility improvement than amplification. CONCLUSIONS By comparing loud and amplified scores, about one third to one half of intelligibility improvement with loud speech could be attributed to increases in audibility or signal-to-noise ratio. Thus, factors other than increased intensity must be partly responsible for the loud speech benefit. Changes in articulation appear to play a relatively small role: Initial /h/ was the only consonant to consistently show improvement with loud speech. Phonatory changes such as improvements in F0 and spectral tilt may account for improved speech intelligibility using loud speech techniques.
Affiliation(s)
- Amy T Neel
- Department of Speech and Hearing Sciences, University of New Mexico, Albuquerque, NM 87131-0001, USA.
|
49
|
McGhee H, Cornwell P, Addis P, Jarman C. Treating dysarthria following traumatic brain injury: Investigating the benefits of commencing treatment during post-traumatic amnesia in two participants. Brain Inj 2009; 20:1307-19. [PMID: 17132553 DOI: 10.1080/02699050601081851] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 10/23/2022]
Abstract
PRIMARY OBJECTIVE The aims of this preliminary study were to explore the suitability for and benefits of commencing dysarthria treatment for people with traumatic brain injury (TBI) while in post-traumatic amnesia (PTA). It was hypothesized that behaviours in PTA don't preclude participation and dysarthria characteristics would improve post-treatment. RESEARCH DESIGN A series of comprehensive case analyses. METHODS AND PROCEDURES Two participants with severe TBI received dysarthria treatment focused on motor speech deficits until emergence from PTA. A checklist of neurobehavioural sequelae of TBI was rated during therapy and perceptual and motor speech assessments were administered before and after therapy. MAIN OUTCOMES AND RESULTS Results revealed that certain behaviours affected the quality of therapy but didn't preclude the provision of therapy. Treatment resulted in physiological improvements in some speech sub-systems for both participants, with varying functional speech outcomes. CONCLUSIONS These findings suggest that dysarthria treatment can begin and provide short-term benefits to speech production during the late stages of PTA post-TBI.
Affiliation(s)
- Hannah McGhee
- Division of Speech Pathology, School of Health and Rehabilitation Sciences, University of Queensland, Brisbane, Australia
|
50
|
Roy N, Nissen SL, Dromey C, Sapir S. Articulatory changes in muscle tension dysphonia: evidence of vowel space expansion following manual circumlaryngeal therapy. JOURNAL OF COMMUNICATION DISORDERS 2009; 42:124-135. [PMID: 19054525 DOI: 10.1016/j.jcomdis.2008.10.001] [Citation(s) in RCA: 78] [Impact Index Per Article: 5.2] [Received: 04/18/2008] [Revised: 10/01/2008] [Accepted: 10/08/2008] [Indexed: 05/27/2023]
Abstract
In a preliminary study, we documented significant changes in formant transitions associated with successful manual circumlaryngeal treatment (MCT) of muscle tension dysphonia (MTD), suggesting improvement in speech articulation. The present study further explores the effects of MTD on vowel articulation by means of additional vowel acoustic measures. Pre- and post-treatment audio recordings of 111 women with MTD were analyzed acoustically using two measures: vowel space area (VSA) and vowel articulation index (VAI), constructed using the first (F1) and second (F2) formants of four point vowels /a, i, ae, u/, extracted from eight words within a standard reading passage. Pairwise t-tests revealed significant increases in both VSA and VAI, confirming that successful treatment of MTD is associated with vowel space expansion. Although MTD is considered a voice disorder, its treatment with MCT appears to positively affect vocal tract dynamics. While the precise mechanism underlying vowel space expansion remains unknown, improvements may be related to lowering of the larynx, expanding oropharyngeal space, and improving articulatory movements. LEARNING OUTCOMES The reader will be able to: (1) describe possible articulatory changes associated with successful treatment of muscle tension dysphonia; (2) describe two acoustic methods to assess vowel centralization and decentralization; and (3) understand the basis for viewing muscle tension dysphonia as a disorder not solely confined to the larynx.
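As a rough illustration of the two measures named in this abstract, the Python sketch below computes a quadrilateral VSA with the shoelace formula and a VAI as (F2/i/ + F1/a/) / (F1/i/ + F1/u/ + F2/u/ + F2/a/). The abstract itself gives no formulas, so both definitions are assumptions based on how these measures are commonly defined; the function names, the vowel ordering, and the formant values are hypothetical.

```python
def vowel_space_area(formants, order=("i", "ae", "a", "u")):
    """Area of the vowel polygon in the F1-F2 plane (shoelace formula).

    `formants` maps a vowel label to an (F1, F2) pair in Hz. `order` must
    list the vowels around the perimeter of the polygon (assumed ordering).
    """
    points = [formants[v] for v in order]
    total = 0.0
    for k in range(len(points)):
        f1_a, f2_a = points[k]
        f1_b, f2_b = points[(k + 1) % len(points)]
        total += f1_a * f2_b - f1_b * f2_a
    return abs(total) / 2.0


def vowel_articulation_index(formants):
    """VAI from the corner vowels /i/, /a/, /u/ (assumed definition).

    Formants expected to rise with vowel decentralization are in the
    numerator, so larger values suggest a more expanded vowel space.
    """
    f1_i, f2_i = formants["i"]
    f1_a, f2_a = formants["a"]
    f1_u, f2_u = formants["u"]
    return (f2_i + f1_a) / (f1_i + f1_u + f2_u + f2_a)


if __name__ == "__main__":
    # Hypothetical (F1, F2) values in Hz, not data from the study.
    pre = {"i": (300, 2300), "ae": (700, 1800), "a": (750, 1200), "u": (350, 900)}
    print(vowel_space_area(pre), vowel_articulation_index(pre))
```

Under these assumed definitions, computing both measures before and after treatment for each speaker would give the paired values that a pairwise t-test compares.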
Affiliation(s)
- Nelson Roy
- Department of Communication Sciences & Disorders, The University of Utah, Salt Lake City, UT 84112-0252, USA.
|