1
Kim SG, De Martino F, Overath T. Linguistic modulation of the neural encoding of phonemes. Cereb Cortex 2024; 34:bhae155. [PMID: 38687241 PMCID: PMC11059272 DOI: 10.1093/cercor/bhae155] [Received: 06/22/2023] [Revised: 03/21/2024] [Accepted: 03/22/2024] [Indexed: 05/02/2024] Open
Abstract
Speech comprehension entails the neural mapping of the acoustic speech signal onto learned linguistic units. This acousto-linguistic transformation is bi-directional, whereby higher-level linguistic processes (e.g. semantics) modulate the acoustic analysis of individual linguistic units. Here, we investigated the cortical topography and linguistic modulation of the most fundamental linguistic unit, the phoneme. We presented natural speech and "phoneme quilts" (pseudo-randomly shuffled phonemes) in either a familiar (English) or unfamiliar (Korean) language to native English speakers while recording functional magnetic resonance imaging. This allowed us to dissociate the contribution of acoustic vs. linguistic processes toward phoneme analysis. We show that (i) the acoustic analysis of phonemes is modulated by linguistic analysis and (ii) for this modulation, both acoustic and phonetic information need to be incorporated. These results suggest that the linguistic modulation of cortical sensitivity to phoneme classes minimizes prediction error during natural speech perception, thereby aiding speech comprehension in challenging listening situations.
Affiliation(s)
- Seung-Goo Kim
  - Department of Psychology and Neuroscience, Duke University, 308 Research Dr, Durham, NC 27708, United States
  - Research Group Neurocognition of Music and Language, Max Planck Institute for Empirical Aesthetics, Grüneburgweg 14, Frankfurt am Main 60322, Germany
- Federico De Martino
  - Faculty of Psychology and Neuroscience, University of Maastricht, Universiteitssingel 40, 6229 ER Maastricht, Netherlands
- Tobias Overath
  - Department of Psychology and Neuroscience, Duke University, 308 Research Dr, Durham, NC 27708, United States
  - Duke Institute for Brain Sciences, Duke University, 308 Research Dr, Durham, NC 27708, United States
  - Center for Cognitive Neuroscience, Duke University, 308 Research Dr, Durham, NC 27708, United States
2
McMullin MA, Kumar R, Higgins NC, Gygi B, Elhilali M, Snyder JS. Preliminary Evidence for Global Properties in Human Listeners During Natural Auditory Scene Perception. Open Mind (Camb) 2024; 8:333-365. [PMID: 38571530 PMCID: PMC10990578 DOI: 10.1162/opmi_a_00131] [Received: 03/10/2023] [Accepted: 02/10/2024] [Indexed: 04/05/2024] Open
Abstract
Theories of auditory and visual scene analysis suggest the perception of scenes relies on the identification and segregation of objects within them, resembling a detail-oriented processing style. However, a more global process may occur while analyzing scenes, which has been evidenced in the visual domain. It is our understanding that a similar line of research has not been explored in the auditory domain; therefore, we evaluated the contributions of high-level global and low-level acoustic information to auditory scene perception. An additional aim was to increase the field's ecological validity by using and making available a new collection of high-quality auditory scenes. Participants rated scenes on 8 global properties (e.g., open vs. enclosed) and an acoustic analysis evaluated which low-level features predicted the ratings. We submitted the acoustic measures and average ratings of the global properties to separate exploratory factor analyses (EFAs). The EFA of the acoustic measures revealed a seven-factor structure explaining 57% of the variance in the data, while the EFA of the global property measures revealed a two-factor structure explaining 64% of the variance in the data. Regression analyses revealed each global property was predicted by at least one acoustic variable (R² = 0.33-0.87). These findings were extended using deep neural network models where we examined correlations between human ratings of global properties and deep embeddings of two computational models: an object-based model and a scene-based model. The results support that participants' ratings are more strongly explained by a global analysis of the scene setting, though the relationship between scene perception and auditory perception is multifaceted, with differing correlation patterns evident between the two models. Taken together, our results provide evidence for the ability to perceive auditory scenes from a global perspective.
Some of the acoustic measures predicted ratings of global scene perception, suggesting representations of auditory objects may be transformed through many stages of processing in the ventral auditory stream, similar to what has been proposed in the ventral visual stream. These findings and the open availability of our scene collection will make future studies on perception, attention, and memory for natural auditory scenes possible.
Affiliation(s)
- Rohit Kumar
  - Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
- Nathan C. Higgins
  - Department of Communication Sciences & Disorders, University of South Florida, Tampa, FL, USA
- Brian Gygi
  - East Bay Institute for Research and Education, Martinez, CA, USA
- Mounya Elhilali
  - Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
- Joel S. Snyder
  - Department of Psychology, University of Nevada, Las Vegas, Las Vegas, NV, USA
3
Low DM, Rao V, Randolph G, Song PC, Ghosh SS. Identifying bias in models that detect vocal fold paralysis from audio recordings using explainable machine learning and clinician ratings. medRxiv 2024:2020.11.23.20235945. [PMID: 33501466 PMCID: PMC7836138 DOI: 10.1101/2020.11.23.20235945] [Indexed: 11/25/2022]
Abstract
Introduction: Detecting voice disorders from voice recordings could allow for frequent, remote, and low-cost screening before costly clinical visits and a more invasive laryngoscopy examination. Our goals were to detect unilateral vocal fold paralysis (UVFP) from voice recordings using machine learning, to identify which acoustic variables were important for prediction to increase trust, and to determine model performance relative to clinician performance. Methods: Patients with confirmed UVFP through endoscopic examination (N=77) and controls with normal voices matched for age and sex (N=77) were included. Voice samples were elicited by reading the Rainbow Passage and sustaining phonation of the vowel "a". Four machine learning models of differing complexity were used. SHapley Additive exPlanations (SHAP) was used to identify important features. Results: The highest median bootstrapped ROC AUC score was 0.87 and beat clinicians' performance (range: 0.74-0.81) based on the recordings. Recording durations differed between UVFP recordings and controls because of how the data were originally processed for storage, and we show that this difference can be used to classify the two groups. Counterintuitively, many UVFP recordings had higher intensity than controls, even though UVFP patients tend to have weaker voices, revealing a dataset-specific bias which we mitigate in an additional analysis. Conclusion: We demonstrate that recording biases in audio duration and intensity created dataset-specific differences between patients and controls, which models used to improve classification. Furthermore, clinicians' ratings provide further evidence that patients were over-projecting their voices and being recorded at a higher amplitude signal than controls. Interestingly, after matching audio duration and removing variables associated with intensity in order to mitigate the biases, the models were able to achieve a similar high performance.
We provide a set of recommendations to avoid bias when building and evaluating machine learning models for screening in laryngology.
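For readers unfamiliar with the statistic quoted in the Results ("median bootstrapped ROC AUC"), the following is a minimal sketch of one common way to compute it: resample labelled predictions with replacement, compute the ROC AUC of each resample, and report the median with a percentile interval. It is not the authors' pipeline; the function names `roc_auc` and `bootstrap_auc` and the toy labels and scores are assumptions made up for illustration.

```python
import random
import statistics


def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(labels, scores, n_boot=1000, seed=0):
    """Return (median, lower, upper) of the AUC over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # AUC is undefined unless both classes are present
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int(0.025 * (n_boot - 1))]
    upper = aucs[int(0.975 * (n_boot - 1))]
    return statistics.median(aucs), lower, upper


# Invented toy data: 1 = patient, 0 = control; scores are model outputs.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.5, 0.3, 0.2, 0.45, 0.1]
median_auc, lower, upper = bootstrap_auc(labels, scores)
print(f"median AUC {median_auc:.2f} (95% interval {lower:.2f} to {upper:.2f})")
```

Reporting the median over resamples rather than a single AUC gives a sense of how much the score depends on which recordings happen to be in the evaluation set.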
Affiliation(s)
- Daniel M. Low
  - Program in Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, MA, USA
  - McGovern Institute for Brain Research, MIT, Cambridge, MA, USA
- Vishwanatha Rao
  - Department of Biomedical Engineering, Columbia University, New York, NY, USA
  - Department of Otolaryngology–Head and Neck Surgery, Massachusetts Eye and Ear Infirmary, Boston, MA, USA
- Gregory Randolph
  - Department of Otolaryngology–Head and Neck Surgery, Massachusetts Eye and Ear Infirmary, Boston, MA, USA
  - Department of Otolaryngology–Head and Neck Surgery, Harvard Medical School, Boston, MA, USA
- Phillip C. Song
  - Department of Otolaryngology–Head and Neck Surgery, Massachusetts Eye and Ear Infirmary, Boston, MA, USA
  - Department of Otolaryngology–Head and Neck Surgery, Harvard Medical School, Boston, MA, USA
- Satrajit S. Ghosh
  - Program in Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, MA, USA
  - McGovern Institute for Brain Research, MIT, Cambridge, MA, USA
  - Department of Otolaryngology–Head and Neck Surgery, Harvard Medical School, Boston, MA, USA
4
Homma Y, Zhuang X, Watari T, Hayashi K, Baba T, Kamath A, Ishijima M. Differences in acoustic parameters of hammering sounds between successful and unsuccessful initial cementless cup press-fit fixation in total hip arthroplasty. Bone Jt Open 2024; 5:154-161. [PMID: 38423101 PMCID: PMC10904203 DOI: 10.1302/2633-1462.53.bjo-2023-0160.r1] [Indexed: 03/02/2024] Open
Abstract
Aims: It is important to analyze objectively the hammering sound in cup press-fit technique in total hip arthroplasty (THA) in order to better understand the change of the sound during impaction. We hypothesized that a specific characteristic would present in a hammering sound with successful fixation. We designed the study to quantitatively investigate the acoustic characteristics during cementless cup impaction in THA. Methods: In 52 THAs performed between November 2018 and April 2022, the acoustic parameters of the hammering sound of 224 impacts of successful press-fit fixation, and 55 impacts of unsuccessful press-fit fixation, were analyzed. The successful fixation was defined if the following two criteria were met: 1) intraoperatively, the stability of the cup was retained after manual application of the torque test; and 2) at one month postoperatively, the cup showed no translation on radiograph. Each hammering sound was converted to sound pressures in 24 frequency bands by fast Fourier transform analysis. Basic patient characteristics were assessed as potential contributors to the hammering sound. Results: The median sound pressure (SP) of successful fixation at 0.5 to 1.0 kHz was higher than that of unsuccessful fixation (0.0694 (interquartile range (IQR) 0.04721 to 0.09576) vs 0.05425 (IQR 0.03047 to 0.06803), p < 0.001). The median SP of successful fixation at 3.5 to 4.0 kHz and 4.0 to 4.5 kHz was lower than that of unsuccessful fixation (0.0812 (IQR 0.05631 to 0.01161) vs 0.1233 (IQR 0.0730 to 0.1449), p < 0.001; and 0.0891 (IQR 0.0526 to 0.0891) vs 0.0885 (IQR 0.0716 to 0.1048); p < 0.001, respectively). There was a statistically significant positive relationship between body weight and SP at 0.5 to 1.0 kHz (p < 0.001). Multivariate analyses indicated that the SP at 0.5 to 1.0 kHz and 3.5 to 4.0 kHz was independently associated with the successful fixation.
Conclusion: The frequency bands of 0.5 to 1.0 and 3.5 to 4.0 kHz were the key to distinguish the sound characteristics between successful and unsuccessful press-fit cup fixation.
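The Methods describe converting each hammering sound into levels in 24 frequency bands by fast Fourier transform. A minimal sketch of that kind of band analysis is given below. Several details are assumptions rather than facts from the abstract: the bands are taken to be contiguous and 0.5 kHz wide (suggested by the reported 0.5 to 1.0 kHz and 3.5 to 4.0 kHz bands), the sampling rate is set to 24 kHz, the input is a synthetic 750 Hz tone instead of a recording, and the output is an uncalibrated mean spectral magnitude, not a sound pressure. The names `fft` and `band_levels` are hypothetical.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def band_levels(signal, fs, band_hz=500.0, n_bands=24):
    """Mean spectral magnitude in each of n_bands contiguous bands of width
    band_hz (assumed layout: 0-0.5 kHz, 0.5-1.0 kHz, ...)."""
    n = len(signal)
    spectrum = fft(signal)
    sums = [0.0] * n_bands
    counts = [0] * n_bands
    for k in range(n // 2 + 1):          # non-negative frequencies only
        freq = k * fs / n                # centre frequency of bin k in Hz
        band = int(freq // band_hz)
        if band < n_bands:
            sums[band] += abs(spectrum[k]) / n
            counts[band] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


# Synthetic stand-in for one impact sound: a 750 Hz tone sampled at 24 kHz.
fs = 24000
n = 1024
tone = [math.sin(2 * math.pi * 750 * i / fs) for i in range(n)]
levels = band_levels(tone, fs)
loudest = max(range(len(levels)), key=levels.__getitem__)
print(loudest)  # band index 1, i.e. the 0.5 to 1.0 kHz band
```

With real recordings, per-band levels of this kind would be computed for every impact and then compared between the successful and unsuccessful groups, as the Results do with medians and interquartile ranges.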
Affiliation(s)
- Yasuhiro Homma
  - Department of Medicine for Orthopaedics and Motor Organ, Juntendo University Graduate School of Medicine, Tokyo, Japan
  - Department of Orthopaedics, Faculty of Medicine, Juntendo University, Tokyo, Japan
  - Department of Community Medicine and Research for Bone and Joint Diseases, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Xu Zhuang
  - Department of Medicine for Orthopaedics and Motor Organ, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Taiji Watari
  - Department of Medicine for Orthopaedics and Motor Organ, Juntendo University Graduate School of Medicine, Tokyo, Japan
  - Department of Orthopaedics, Faculty of Medicine, Juntendo University, Tokyo, Japan
- Koju Hayashi
  - Department of Orthopaedics, Faculty of Medicine, Juntendo University, Tokyo, Japan
- Tomonori Baba
  - Department of Medicine for Orthopaedics and Motor Organ, Juntendo University Graduate School of Medicine, Tokyo, Japan
  - Department of Orthopaedics, Faculty of Medicine, Juntendo University, Tokyo, Japan
  - Department of Pathophysiology for Locomotive Diseases, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Atul Kamath
  - Department of Orthopaedic Surgery, Orthopaedic and Rheumatologic Institute, Cleveland Clinic Foundation, Cleveland, USA
- Muneaki Ishijima
  - Department of Medicine for Orthopaedics and Motor Organ, Juntendo University Graduate School of Medicine, Tokyo, Japan
  - Department of Orthopaedics, Faculty of Medicine, Juntendo University, Tokyo, Japan
  - Department of Community Medicine and Research for Bone and Joint Diseases, Juntendo University Graduate School of Medicine, Tokyo, Japan
  - Department of Pathophysiology for Locomotive Diseases, Juntendo University Graduate School of Medicine, Tokyo, Japan
5
Martinez-Velasco JD, Filomena-Ambrosio A, Garzón-Castro CL. Technological tools for the measurement of sensory characteristics in food: A review. F1000Res 2024; 12:340. [PMID: 38322308 PMCID: PMC10844804 DOI: 10.12688/f1000research.131914.1] [Accepted: 11/24/2023] [Indexed: 02/08/2024] Open
Abstract
The use of technological tools in the food industry, which has allowed quick and reliable identification and measurement of the sensory characteristics of food matrices, is of great importance, since these tools emulate the functioning of the five senses (smell, taste, sight, touch, and hearing). Therefore, industry and academia have been conducting research focused on developing and using these instruments, as evidenced by various studies reported in the scientific literature. In this review, several of these technological tools are documented, such as the e-nose, e-tongue, colorimeter, artificial vision systems, and instruments that allow texture measurement (texture analyzer, electromyography, others). These allow us to carry out processes of analysis, review, and evaluation of food to determine essential characteristics such as quality, composition, maturity, authenticity, and origin. The determination of these characteristics allows the standardization of food matrices, achieving the improvement of existing foods and encouraging the development of new products that satisfy the sensory experiences of the consumer, driving growth in the food sector. However, the tools discussed have some limitations, such as acquisition cost and calibration and maintenance cost, and in some cases they are designed to work with a specific food matrix.
Affiliation(s)
- José D Martinez-Velasco
  - Engineering Faculty - Research Group CAPSAB, Universidad de La Sabana, Campus del Puente del Común, Km 7 Autopista Norte de Bogotá, Chía, Cundinamarca, 250001, Colombia
- Annamaria Filomena-Ambrosio
  - International School of Economics and Administrative Science - Research Group Alimentación, Gestión de Procesos y Servicio de la Universidad de La Sabana Research Group, Universidad de La Sabana, Campus del Puente del Común, Km 7 Autopista Norte de Bogotá, Chía, Cundinamarca, 250001, Colombia
- Claudia L Garzón-Castro
  - Engineering Faculty - Research Group CAPSAB, Universidad de La Sabana, Campus del Puente del Común, Km 7 Autopista Norte de Bogotá, Chía, Cundinamarca, 250001, Colombia
6
Simeone PJ, Green JR, Tager-Flusberg H, Chenausky KV. Vowel distinctiveness as a concurrent predictor of expressive language function in autistic children. Autism Res 2024; 17:419-431. [PMID: 38348589 DOI: 10.1002/aur.3102] [Received: 08/16/2023] [Accepted: 01/10/2024] [Indexed: 02/21/2024]
Abstract
Speech ability may limit spoken language development in some minimally verbal autistic children. In this study, we aimed to determine whether an acoustic measure of speech production, vowel distinctiveness, is concurrently related to expressive language (EL) for autistic children. Syllables containing the vowels [i] and [a] were recorded remotely from 27 autistic children (4;1-7;11) with a range of spoken language abilities. Vowel distinctiveness was calculated using automatic formant tracking software. Robust hierarchical regressions were conducted with receptive language (RL) and vowel distinctiveness as predictors of EL. Hierarchical regressions were also conducted within a High EL and a Low EL subgroup. Vowel distinctiveness accounted for 29% of the variance in EL for the entire group, RL for 38%. For the Low EL group, only vowel distinctiveness was significant, accounting for 38% of variance in EL. Conversely, in the High EL group, only RL was significant and accounted for 26% of variance in EL. Replicating previous results, speech production and RL significantly predicted concurrent EL in autistic children, with speech production being the sole significant predictor for the Low EL group and RL the sole significant predictor for the High EL group. Further work is needed to determine whether vowel distinctiveness longitudinally, as well as concurrently, predicts EL. Findings have important implications for the early identification of language impairment and in developing language interventions for autistic children.
Affiliation(s)
- Paul J Simeone
  - Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, Massachusetts, USA
  - Division of Allied Health and Supportive Technology, May Institute, Randolph, Massachusetts, USA
- Jordan R Green
  - Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, Massachusetts, USA
  - Department of Speech and Hearing Bioscience and Technology, Division of Medical Sciences, Harvard University, Cambridge, Massachusetts, USA
- Helen Tager-Flusberg
  - Department of Psychological & Brain Sciences, College of Arts and Sciences, Boston University, Boston, Massachusetts, USA
- Karen V Chenausky
  - Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, Massachusetts, USA
  - Department of Neurology, Harvard Medical School, Boston, Massachusetts, USA
7
Aoyama K, Hong L, Flege JE, Akahane-Yamada R, Yamada T. Relationships Between Acoustic Characteristics and Intelligibility Scores: A Reanalysis of Japanese Speakers' Productions of American English Liquids. Lang Speech 2023; 66:1030-1045. [PMID: 36680472 DOI: 10.1177/00238309221140910] [Indexed: 06/17/2023]
Abstract
The primary purpose of this research report was to investigate the relationships between acoustic characteristics and perceived intelligibility for native Japanese speakers' productions of American English liquids. This report was based on a reanalysis of intelligibility scores and acoustic analyses that were reported in two previous studies. We examined which acoustic parameters were associated with higher perceived intelligibility scores for their productions of /l/ and /ɹ/ in American English, and whether Japanese speakers' productions of the two liquids were acoustically differentiated from each other. Results demonstrated that the second formant (F2) was strongly correlated with the perceived intelligibility scores for the Japanese adults' productions. Results also demonstrated that the Japanese adults' and children's productions of /l/ and /ɹ/ were indeed differentiated by some acoustic parameters including the third formant (F3). In addition, some changes occurred in the Japanese children's productions over the course of 1 year. Overall, the present report shows that Japanese speakers of American English may be making a distinction between /l/ and /ɹ/ in production, although the distinctions are made in a different way compared with native English speakers' productions. These findings have implications for setting realistic goals for improving intelligibility of English /l/ and /ɹ/ for Japanese speakers, as well as theoretical advancement of second-language speech learning.
Affiliation(s)
- Katsura Aoyama
  - Department of Audiology & Speech-Language Pathology, University of North Texas, USA
- Lingzi Hong
  - Department of Information Science, University of North Texas, USA
- James E Flege
  - Speech and Hearing Sciences, University of Alabama at Birmingham, USA
- Tsuneo Yamada
  - Department of Informatics, The Open University of Japan, Japan
8
Mirkoska V, Antonsson M, Hartelius L, Nylén F. Detection of Subclinical Motor Speech Deficits after Presumed Low-Grade Glioma Surgery. Brain Sci 2023; 13:1631. [PMID: 38137079 PMCID: PMC10741922 DOI: 10.3390/brainsci13121631] [Received: 10/19/2023] [Revised: 11/11/2023] [Accepted: 11/21/2023] [Indexed: 12/24/2023] Open
Abstract
Motor speech performance was compared before and after surgical resection of presumed low-grade gliomas. This pre- and post-surgery study was conducted on 15 patients (mean age = 41) with low-grade glioma classified based on anatomic features. Repetitions of /pa/, /ta/, /ka/, and /pataka/ recorded before and 3 months after surgery were analyzed regarding rate and regularity. A significant reduction (6 to 5.6 syllables/s) pre- vs. post-surgery was found in the rate for /ka/, which is comparable to the approximate average decline over 10-15 years of natural aging reported previously. For all other syllable types, rates were within normal age-adjusted ranges in both preoperative and postoperative sessions. The decline in /ka/ rate might reflect a subtle reduction in motor speech production, but the effects were not severe. All but one patient continued to perform within normal ranges post-surgery; one performed two standard deviations below age-appropriate norms pre- and post-surgery in all syllable tasks. The patient experienced motor speech difficulties, which may be related to the tumor's location in an area important for speech. Low-grade glioma may reduce maximum speech-motor performance in individual patients, but larger samples are needed to elucidate how often the effect occurs.
Affiliation(s)
- Vesna Mirkoska
  - Speech and Language Pathology Unit, Institute of Neuroscience and Physiology, Sahlgrenska Academy at the University of Gothenburg, 40530 Gothenburg, Sweden
- Malin Antonsson
  - Speech and Language Pathology Unit, Institute of Neuroscience and Physiology, Sahlgrenska Academy at the University of Gothenburg, 40530 Gothenburg, Sweden
- Lena Hartelius
  - Speech and Language Pathology Unit, Institute of Neuroscience and Physiology, Sahlgrenska Academy at the University of Gothenburg, 40530 Gothenburg, Sweden
- Fredrik Nylén
  - Department of Clinical Sciences, Umeå University, 90736 Umeå, Sweden
9
Rong P, Benson J. Intergenerational choral singing to improve communication outcomes in Parkinson's disease: Development of a theoretical framework and an integrated measurement tool. Int J Speech Lang Pathol 2023; 25:722-745. [PMID: 36106430 DOI: 10.1080/17549507.2022.2110281] [Indexed: 06/15/2023]
Abstract
Purpose: This study presented an initial step towards developing the evidence base for intergenerational choral singing as a communication-focussed rehabilitative approach for Parkinson's disease (PD). Method: A theoretical framework was established to conceptualise the rehabilitative effect of intergenerational choral singing on four domains of communication impairments (motor drive, timing mechanism, sensorimotor integration, and higher-level cognitive and affective functions) as well as activity/participation and quality of life. A computer-assisted multidimensional acoustic analysis was developed to objectively assess the targeted domains of communication impairments. Voice Handicap Index and the World Health Organization's Quality of Life assessment-abbreviated version were used to obtain patient-reported outcomes at the activity/participation and quality of life levels. As a proof of concept, a single subject with PD was recruited to participate in 9 weekly 1-h intergenerational choir rehearsals. The subject was assessed before, 1 week post, and 8 weeks post-choir. Result: Notable trends of improvement were observed in multiple domains of communication impairments at 1 week post-choir. Some improvements were maintained at 8 weeks post-choir. Patient-reported outcomes exhibited limited pre-post changes. Conclusion: This study provided the theoretical groundwork and an empirical measurement tool for future validation of intergenerational choral singing as a novel rehabilitation for PD.
Affiliation(s)
- Panying Rong
  - Department of Speech-Language-Hearing: Sciences & Disorders, University of Kansas, Lawrence, KS, USA and
10
Asci F, Marsili L, Suppa A, Saggio G, Michetti E, Di Leo P, Patera M, Longo L, Ruoppolo G, Del Gado F, Tomaiuoli D, Costantini G. Acoustic analysis in stuttering: a machine-learning study. Front Neurol 2023; 14:1169707. [PMID: 37456655 PMCID: PMC10347393 DOI: 10.3389/fneur.2023.1169707] [Received: 02/21/2023] [Accepted: 06/16/2023] [Indexed: 07/18/2023] Open
Abstract
Background: Stuttering is a childhood-onset neurodevelopmental disorder affecting speech fluency. The diagnosis and clinical management of stuttering are currently based on perceptual examination and clinical scales. Standardized techniques for acoustic analysis have yielded promising results for the objective assessment of dysfluency in people with stuttering (PWS). Objective: We objectively and automatically assessed voice in stuttering through artificial intelligence (i.e., the support vector machine, SVM, classifier). We also investigated the age-related changes affecting voice in stutterers, and verified the relevance of specific speech tasks for the objective and automatic assessment of stuttering. Methods: Fifty-three PWS (20 children, 33 younger adults) and 71 age-/gender-matched controls (31 children, 40 younger adults) were recruited. Clinical data were assessed through clinical scales. The voluntary and sustained emission of a vowel and two sentences were recorded through smartphones. Audio samples were analyzed using a dedicated machine-learning algorithm, the SVM, to compare PWS and controls, both children and younger adults. Receiver operating characteristic (ROC) curves were calculated to describe the accuracy of all comparisons. The likelihood ratio (LR) was calculated for each PWS during all speech tasks, for clinical-instrumental correlations, by using an artificial neural network (ANN). Results: Acoustic analysis based on a machine-learning algorithm objectively and automatically discriminated between the overall cohort of PWS and controls with high accuracy (88%). Also, physiologic ageing crucially influenced stuttering, as demonstrated by the high accuracy (92%) of machine-learning analysis when classifying children and younger adult PWS. The diagnostic accuracies achieved by machine-learning analysis were comparable for each speech task.
The significant clinical-instrumental correlations between LRs and clinical scales supported the biological plausibility of our findings. Conclusion: Acoustic analysis based on artificial intelligence (SVM) represents a reliable tool for the objective and automatic recognition of stuttering and its relationship with physiologic ageing. The accuracy of the automatic classification is high and independent of the speech task. Machine-learning analysis would help clinicians in the objective diagnosis and clinical management of stuttering. The digital collection of audio samples, here achieved through smartphones, would promote the future application of the technique in a telemedicine context (home environment).
Affiliation(s)
- Francesco Asci
  - Department of Human Neurosciences, Sapienza University of Rome, Rome, Italy
  - IRCCS Neuromed Institute, Pozzilli, Italy
- Luca Marsili
  - Department of Neurology, James J. and Joan A. Gardner Center for Parkinson’s Disease and Movement Disorders, University of Cincinnati, Cincinnati, OH, United States
- Antonio Suppa
  - Department of Human Neurosciences, Sapienza University of Rome, Rome, Italy
  - IRCCS Neuromed Institute, Pozzilli, Italy
- Giovanni Saggio
  - Department of Electronic Engineering, University of Rome Tor Vergata, Rome, Italy
- Pietro Di Leo
  - Department of Electronic Engineering, University of Rome Tor Vergata, Rome, Italy
- Martina Patera
  - Department of Human Neurosciences, Sapienza University of Rome, Rome, Italy
- Lucia Longo
  - Department of Sense Organs, Otorhinolaryngology Section, Sapienza University of Rome, Rome, Italy
- Giovanni Costantini
  - Department of Electronic Engineering, University of Rome Tor Vergata, Rome, Italy
11
Oh C, Morris R, Wang X, Raskin MS. Analysis of emotional prosody as a tool for differential diagnosis of cognitive impairments: a pilot research. Front Psychol 2023; 14:1129406. [PMID: 37425151 PMCID: PMC10327638 DOI: 10.3389/fpsyg.2023.1129406] [Received: 12/21/2022] [Accepted: 05/26/2023] [Indexed: 07/11/2023] Open
Abstract
Introduction This pilot research was designed to investigate if prosodic features from running spontaneous speech could differentiate dementia of the Alzheimer's type (DAT), vascular dementia (VaD), mild cognitive impairment (MCI), and healthy cognition. The study included acoustic measurements of prosodic features (Study 1) and listeners' perception of emotional prosody differences (Study 2). Methods For Study 1, prerecorded speech samples describing the Cookie Theft picture from 10 individuals with DAT, 5 with VaD, 9 with MCI, and 10 neurologically healthy controls (NHC) were obtained from the DementiaBank. The descriptive narratives by each participant were separated into utterances. These utterances were measured on 22 acoustic features via the Praat software and analyzed statistically using the principal component analysis (PCA), regression, and Mahalanobis distance measures. Results The analyses on acoustic data revealed a set of five factors and four salient features (i.e., pitch, amplitude, rate, and syllable) that discriminate the four groups. For Study 2, a group of 28 listeners served as judges of emotions expressed by the speakers. After a set of training and practice sessions, they were instructed to indicate the emotions they heard. Regression measures were used to analyze the perceptual data. The perceptual data indicated that the factor underlying pitch measures had the greatest strength for the listeners to separate the groups. Discussion The present pilot work showed that using acoustic measures of prosodic features may be a functional method for differentiating among DAT, VaD, MCI, and NHC. Future studies with data collected under a controlled environment using better stimuli are warranted.
Affiliation(s)
- Chorong Oh
- School of Rehabilitation and Communication Sciences, Ohio University, Athens, OH, United States
- Richard Morris
- School of Communication Science and Disorders, Florida State University, Tallahassee, FL, United States
- Xianhui Wang
- School of Medicine, University of California Irvine, Irvine, CA, United States
- Morgan S. Raskin
- School of Communication Science and Disorders, Florida State University, Tallahassee, FL, United States
|
12
|
Josephs KA, Duffy JR, Martin PR, Stephens YC, Singh NA, Clark HM, Botha H, Lowe VJ, Whitwell JL, Utianski RL. Acoustic Analysis and Neuroimaging Correlates of Diadochokinetic Rates in Mild-Moderate Primary Progressive Apraxia of Speech. Brain Lang 2023; 240:105254. [PMID: 37584042 PMCID: PMC10424909 DOI: 10.1016/j.bandl.2023.105254] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/17/2023]
Abstract
Speech rate can be judged clinically using diadochokinetic (DDK) tasks, such as alternating motion rates (AMR) and sequential motion rates (SMR). We evaluated whether acoustic AMR/SMR speech rates would differentiate primary progressive apraxia of speech (PPAOS) from healthy controls, and determined how DDK rates relate to phonetic and prosodic speech characteristics and brain metabolism on FDG-PET. Rate was calculated for each of three AMRs (repetitions of 'puh', 'tuh', and 'kuh') and for SMRs (repetitions of 'puhtuhkuh') for 27 PPAOS patients and 52 controls who underwent FDG-PET. PPAOS patients were slower than controls on all DDK tasks. All DDK rates correlated with apraxia of speech severity, with strongest associations with prosodic speech features. Slower DDK rates were associated with hypometabolism in the right cerebellar dentate and left supplementary motor area. Performance on AMR rate, not just SMR rate, may be impaired in mild PPAOS, but sensitivity and specificity require further study.
Affiliation(s)
- Peter R. Martin
- Department of Quantitative Health Research, Mayo Clinic, Rochester, MN, USA
- Hugo Botha
- Department of Neurology, Mayo Clinic, Rochester, MN, USA
- Val J. Lowe
- Department of Radiology, Mayo Clinic, Rochester, MN, USA
|
13
|
Ma M, Hua R, Bao D, Ye G, Tang Z, Hua L. Alarm Calling in Plateau Pika (Ochotona curzoniae): Evidence from Field Observations and Simulated Predator and Playback Experiments. Animals (Basel) 2023; 13:ani13071271. [PMID: 37048527 PMCID: PMC10093306 DOI: 10.3390/ani13071271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2023] [Revised: 04/02/2023] [Accepted: 04/04/2023] [Indexed: 04/14/2023] Open
Abstract
Acoustic communication plays a vital role in passing or sharing information between individuals. Identifying the biological meaning of vocal signals is crucial in understanding the survival strategies of animals. However, there are many challenges in identifying the true meaning of such signals. The plateau pika (Ochotona curzoniae) is a call-producing mammal endemic to the Qinghai-Tibet plateau (QTP) and considered a keystone species owing to its multiple benefits in alpine rangeland ecosystems. Previous studies have shown that plateau pikas emit alarm calls as part of their daily activities. However, only field observations have been used to identify these alarm calls of the plateau pika, with no attempts at using playback experiments. Here, we report the alarm calling of plateau pikas through field observations as well as simulated predator and playback experiments in the Eastern QTP from 2021 to 2022. We found that both female and male adults emitted alarm calls, the signals of which comprised only one syllable, with a duration of 0.1-0.3 s. There were no differences in the characteristics between the observed alarm calls and those made in response to the simulated predator. The duration of the alarm call response varied with altitude, with plateau pikas living at higher altitudes responding at shorter durations than those at lower altitudes.
Affiliation(s)
- Meina Ma
- College of Grassland Science, Gansu Agricultural University, Key Laboratory of Grassland Ecosystem of the Ministry of Education, Engineering and Technology Research Center for Alpine Rodent Pests Control, National Forestry and Grassland Administration, Lanzhou 730070, China
- Rui Hua
- College of Grassland Science, Gansu Agricultural University, Key Laboratory of Grassland Ecosystem of the Ministry of Education, Engineering and Technology Research Center for Alpine Rodent Pests Control, National Forestry and Grassland Administration, Lanzhou 730070, China
- Darhan Bao
- College of Grassland Science, Gansu Agricultural University, Key Laboratory of Grassland Ecosystem of the Ministry of Education, Engineering and Technology Research Center for Alpine Rodent Pests Control, National Forestry and Grassland Administration, Lanzhou 730070, China
- Guohui Ye
- College of Grassland Science, Gansu Agricultural University, Key Laboratory of Grassland Ecosystem of the Ministry of Education, Engineering and Technology Research Center for Alpine Rodent Pests Control, National Forestry and Grassland Administration, Lanzhou 730070, China
- Zhuangsheng Tang
- College of Grassland Science, Gansu Agricultural University, Key Laboratory of Grassland Ecosystem of the Ministry of Education, Engineering and Technology Research Center for Alpine Rodent Pests Control, National Forestry and Grassland Administration, Lanzhou 730070, China
- Limin Hua
- College of Grassland Science, Gansu Agricultural University, Key Laboratory of Grassland Ecosystem of the Ministry of Education, Engineering and Technology Research Center for Alpine Rodent Pests Control, National Forestry and Grassland Administration, Lanzhou 730070, China
|
14
|
Kouba T, Frank W, Tykalova T, Mühlbäck A, Klempíř J, Lindenberg KS, Landwehrmeyer GB, Rusz J. Speech biomarkers in Huntington's disease: A cross-sectional study in pre-symptomatic, prodromal and early manifest stages. Eur J Neurol 2023; 30:1262-1271. [PMID: 36732902 DOI: 10.1111/ene.15726] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/07/2022] [Revised: 01/25/2023] [Accepted: 01/28/2023] [Indexed: 02/04/2023]
Abstract
BACKGROUND AND PURPOSE Motor speech alterations are a prominent feature of clinically manifest Huntington's disease (HD). Objective acoustic analysis of speech can quantify speech alterations. It is currently unknown, however, at what stage of HD speech alterations can be reliably detected. We aimed to explore the patterns and extent of speech alterations using objective acoustic analysis in HD and to assess correlations with both rater-assessed phenotypical features and biological determinants of HD. METHODS Speech samples were acquired from 44 premanifest (29 pre-symptomatic and 15 prodromal) and 25 manifest HD gene expansion carriers, and 25 matched healthy controls. A quantitative automated acoustic analysis of 10 speech dimensions was performed. RESULTS Automated speech analysis allowed us to differentiate between participants with HD and controls, with areas under the curve of 0.74 for pre-symptomatic, 0.92 for prodromal, and 0.97 for manifest stages. In addition to irregular alternating motion rates and prolonged pauses seen only in manifest HD, both prodromal and manifest HD displayed slowed articulation rate, slowed alternating motion rates, increased loudness variability, and unstable steady-state position of articulators. In participants with premanifest HD, speech alteration severity was associated with cognitive slowing (r = -0.52, p < 0.001) and the extent of bradykinesia (r = 0.43, p = 0.004). Speech alterations correlated with a measure of exposure to mutant gene products (CAG-age-product score; r = 0.60, p < 0.001). CONCLUSION Speech abnormalities in HD are associated with other motor and cognitive deficits and are measurable already in premanifest stages of HD. Therefore, automated speech analysis might represent a quantitative HD biomarker with potential for assessing disease progression.
Affiliation(s)
- Tomas Kouba
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
- Wiebke Frank
- Department of Neurology, University Ulm, Ulm, Germany
- Tereza Tykalova
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
- Alzbeta Mühlbäck
- Department of Neurology, University Ulm, Ulm, Germany; Department of Neuropsychiatry, Huntington Center South, kbo-Isar-Amper-Klinikum Taufkirchen (Vils), Taufkirchen, Germany; Department of Neurology and Center of Clinical Neuroscience, 1st Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
- Jiří Klempíř
- Department of Neurology and Center of Clinical Neuroscience, 1st Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
- Jan Rusz
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic; Department of Neurology and Center of Clinical Neuroscience, 1st Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic; Department of Neurology & ARTORG Center, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
|
15
|
Glover M, Duhamel MF. Assessment of Two Audio-Recording Methods for Remote Collection of Vocal Biomarkers Indicative of Tobacco Smoking Harm. Acoust Aust 2023; 51:39-52. [PMCID: PMC9511443 DOI: 10.1007/s40857-022-00279-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/16/2022] [Accepted: 08/24/2022] [Indexed: 01/11/2024]
Abstract
This study aimed to determine if self-complete at-home recordings could produce audio samples of sufficient quality for use in voice analysis software, and if audio samples of similar or sufficient quality could be extracted from audio-recorded naturalistic phone interviews. Data were obtained from 31 adults aged 18 years and over who smoked. The /a/ sound segment was manually isolated, and analysis functions were used to produce the following values: fundamental frequency, jitter, shimmer, noise ratio, formant 3, and formant 4. The /a/ sound segment was then manually isolated from audio recordings of naturalistic interviews previously conducted by phone. These were analysed in the same way and compared for quality against Evistr-recorded audio samples from the same participants. A third audio sample consisted of an Evistr or phone-recorded sustained phonation of the /a/ sound. Means and standard deviations were calculated for the target vocal parameters. Statistical comparisons for quality of sound segment were conducted for readings, interviews, and vowel phonation and for sound signals extracted via both recording methods. Self-recording by adults who smoked provided audio samples of sufficient quality for analysis of vocal features that have been associated with a clinical outcome. The values obtained for sustained phonation audio samples displayed the least perturbation and noise for the vocal parameters surveyed. Sound signals recorded with smartphones appeared to be affected by electronic interference but have potential for use in diagnostic tools for measuring vocal parameters.
Affiliation(s)
- Marewa Glover
- Centre of Research Excellence: Indigenous Sovereignty and Smoking, PO Box 89186, Torbay, Auckland, 0742 New Zealand
|
16
|
Catalino MP, Buss E, Chamberlin G, Trembath D, Morgan D, Krebs M, Ewend MG, Jaikumar S. Tumor sound, auditory cues, and tissue pathology in glioma surgery: a proof-of-concept study. J Neurosurg 2022:1-9. [PMID: 36585869 DOI: 10.3171/2022.11.jns222114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Accepted: 11/29/2022] [Indexed: 12/31/2022]
Abstract
OBJECTIVE Visual, tactile, and auditory cues are used during surgery to differentiate tissue type. Auditory cues in glioma surgery have not been studied previously. The objectives of this study were 1) to evaluate the feasibility of recording sound generated by the suction device during glioma surgery in matched tissue samples, and 2) to characterize the acoustic variation that occurs in different tissue samples. METHODS This was a prospective observational proof-of-concept study. Recordings were attempted in 20 patients in order to meet the accrual target of 10 patients with matched sound and tissue data. For each patient, three 30- to 60-second recordings were made at these sites: normal white matter, infiltrative margin, and tumor. Tissue samples at each site were then reviewed by experienced neuropathologists, and agreement with surgical identification was estimated with the kappa statistic. Acoustic parameters were characterized for each sample. RESULTS Data from 20 patients were analyzed. Patient-related or technical issues resulted in missing data for 10 patients, but the final 10 patients had both audio and tissue data for analysis. Among all tissue samples, fair agreement was observed between surgeon identification and actual pathology (κ = 0.24, standard error 0.096, p = 0.006). Acoustic data suggested that 1) the acoustic stimulus is broadband, 2) acoustic features are somewhat consistent within cases, 3) high-entropy values indicate irregularity of sound over time, and 4) bimodal pitch distributions could differentially reflect cues of interest. CONCLUSIONS This study supports the feasibility of collecting intraoperative data on acoustic features during glioma surgery, and it provides an example of how an analysis could be performed to compare different types of tissues.
Affiliation(s)
- Michael P Catalino
- Department of Neurosurgery
- Department of Neurosurgery, The University of Texas MD Anderson Cancer Center, Houston, Texas
- Gregory Chamberlin
- Department of Pathology, The University of North Carolina, Chapel Hill
- Department of Pathology, Duke University, Durham, North Carolina
- David Morgan
- The University of North Carolina School of Medicine, Chapel Hill, North Carolina
- Madelyn Krebs
- The University of North Carolina School of Medicine, Chapel Hill, North Carolina
|
17
|
Kao HH, Lin YC, Chiang JK, Yu HC, Wang CL, Kao YH. Dependable algorithm for visualizing snoring duration through acoustic analysis: A pilot study. Medicine (Baltimore) 2022; 101:e32538. [PMID: 36595844 PMCID: PMC9794359 DOI: 10.1097/md.0000000000032538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2022] Open
Abstract
Snoring is a nuisance for the bed partners of people who snore and is also associated with chronic diseases. Estimating the snoring duration from a whole-night sleep period is challenging. The authors present a dependable algorithm for visualizing snoring durations through acoustic analysis. Both instruments (a Sony digital recorder and a smartphone's SnoreClock app) were placed within 30 cm of the examinee's head during the sleep period. Subsequently, spectrograms were plotted based on audio files recorded from Sony recorders. The authors thereby developed an algorithm to validate snoring durations through visualization of typical snoring segments. In total, 37 snoring recordings obtained from 6 individuals were analyzed. The mean age of the participants was 44.6 ± 9.9 years. Every recorded file was tailored to a regular 600-second segment and plotted. Visualization revealed that the typical features of the clustered snores in the amplitude domains were near-isometric spikes (most had an ascending-descending trend). The recorded snores exhibited 1 or more visibly fixed frequency bands. Intervals were noted between the snoring clusters and were incorporated into the whole-night snoring calculation. The correlation coefficients of snoring rates from digitally recorded files examined between Examiners A and B were higher (0.865, P < .001) than those between the SnoreClock app and the Examiners (0.757, P < .001; 0.787, P < .001, respectively). A dependable algorithm with high reproducibility was developed for visualizing snoring durations.
Affiliation(s)
- Hsueh-Hsin Kao
- Graduate Institute of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
- Department of Laboratory Medicine, Kaohsiung Medical University Hospital, Kaohsiung, Taiwan
- Jui-Kun Chiang
- Department of Family Medicine, Dalin Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, Chiayi, Taiwan
- Chun-Lung Wang
- School of Medicine, Tzu Chi University, Hualien, Taiwan
- Division of Pediatrics, Dalin Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, Dalin Chiayi, Taiwan
- Yee-Hsin Kao
- Department of Family Medicine, Tainan Municipal Hospital (Managed by Show Chwan Medical Care Corporation), Tainan, Taiwan
- *Correspondence: Yee-Hsin Kao, 670 Chung Te Road, Tainan, 70173 Taiwan
|
18
|
Geng P, Gu W. Acoustic and Perceptual Characteristics of Mandarin Speech in Gay and Heterosexual Male Speakers. Lang Speech 2022; 65:1096-1109. [PMID: 33740875 DOI: 10.1177/00238309211000783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
This study investigated acoustic and perceptual characteristics of Mandarin speech produced by gay and heterosexual male speakers. Acoustic analysis of monosyllabic words showed significant differences between the two groups in voice fundamental frequency (F0), F1 of low vowel, and duration of aspiration/frication in consonants. The acoustic patterns on F0, formants, and center of gravity as well as spectral skewness of /s/ differed from those reported for Western languages like American English, which could be interpreted from a sociopsychological point of view based on different acceptability of gay identity in the two societies. The results of a perceptual experiment revealed significant but weak correlations between the acoustic parameters and the score of perceived gayness, which was significantly higher on gay speech than on heterosexual male speech. Although the observed F0 and F1 patterns in Mandarin gay speech were opposite to the stereotype of gayness, gay identity can still be identified to some extent from speech due to the existence of other acoustic cues such as a longer fricative duration, which is not a stereotype of gayness but has been consistently observed in Mandarin and Western languages.
|
19
|
Liu S, Shao J. [Current methods of acoustic analysis of voice: a review]. Lin Chuang Er Bi Yan Hou Tou Jing Wai Ke Za Zhi 2022; 36:966-970;976. [PMID: 36543409 PMCID: PMC10128270 DOI: 10.13201/j.issn.2096-7993.2022.12.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2022] [Indexed: 12/24/2022]
Abstract
Acoustic analysis of the voice, as an objective, quantitative, non-invasive and reproducible method for the evaluation of voice quality, can be used to detect and analyze the acoustic characteristics of normal, artistic or pathological voice. With the development of medicine, physics, statistics, and artificial intelligence technology, there are new advances in the study of voice acoustic analysis, especially in terms of acoustic parameters. In addition, artificial neural networks can be used to perform complex multi-parameter analysis, which greatly improves the efficiency of acoustic analysis. This paper provides an overview of the methods of acoustic analysis and its latest development.
Affiliation(s)
- Siwei Liu
- Department of Otolaryngology, Eye & ENT Hospital, Fudan University, Shanghai, 200031, China
- Jun Shao
- Department of Otolaryngology, Eye & ENT Hospital, Fudan University, Shanghai, 200031, China
|
20
|
Marchese MR, Longobardi Y, Di Cesare T, Mari G, Terruso V, Galli J, D’Alatri L. Gender-related differences in the prevalence of voice disorders and awareness of dysphonia. Acta Otorhinolaryngol Ital 2022; 42:458-464. [PMID: 36541384 PMCID: PMC9793143 DOI: 10.14639/0392-100x-n2018] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/24/2022] [Accepted: 07/30/2022] [Indexed: 12/24/2022]
Abstract
Objective Considering the impact of dysphonia on public health and the increasing attention to patient-centred care, we evaluated sex-related differences in the prevalence of benign voice disorders, awareness of dysphonia and voice therapy (VT) results. Methods One hundred and seventy-one patients, 129 females and 42 males, with functional or organic benign dysphonia underwent Voice Handicap Index (VHI), auditory-perceptual dysphonia severity scoring (GRBAS) and acoustic analysis (Jitter%, Shimmer%, NHR) before and after VT. Results Prevalence of each voice disorder was significantly higher among females. Mean time-to-diagnosis (time elapsed until medical consultation) was not different between males and females. The refusal of therapy and VT adherence (mean number of absences and premature dropout) were similar in the two groups. Pre-VT VHI and "G" parameter were worse in women. The percentage of women with abnormal acoustic analysis was significantly higher. Post-VT VHI gain was higher in women, whereas "G" parameter improvement did not differ by sex. Conclusions Our study showed a higher prevalence of voice disorders in females. Awareness of dysphonia was not gender related. Females started with worse voice subjective perception and acoustic analysis, but they perceived greater improvement after therapy.
Affiliation(s)
- Ylenia Longobardi
- Institute of Otolaryngology, Catholic University of the Sacred Heart, Rome, Italy
- Tiziana Di Cesare
- Institute of Otolaryngology, Catholic University of the Sacred Heart, Rome, Italy
- Correspondence: Tiziana Di Cesare, Department of Head and Neck Sciences, Catholic University of Sacred Heart, Policlinico “A. Gemelli” Foundation, l.go “A. Gemelli” 8, 00168 Rome, Italy. Tel. +39 06 30154439. Fax +39 06 3051194
- Giorgia Mari
- Institute of Otolaryngology, Catholic University of the Sacred Heart, Rome, Italy
- Valeria Terruso
- Institute of Otolaryngology, Catholic University of the Sacred Heart, Rome, Italy
- Jacopo Galli
- Institute of Otolaryngology, Catholic University of the Sacred Heart, Rome, Italy; Department of Aging, Neurological, Orthopedic and Head and Neck Sciences, UOC of Otolaryngology, A. Gemelli IRCCS University Hospital Foundation, Rome, Italy
- Lucia D’Alatri
- Institute of Otolaryngology, Catholic University of the Sacred Heart, Rome, Italy; Department of Aging, Neurological, Orthopedic and Head and Neck Sciences, UOC of Otolaryngology, A. Gemelli IRCCS University Hospital Foundation, Rome, Italy
|
21
|
Huang Z, Bosschieter PF, Aarab G, van Selms MK, Vanhommerig JW, Hilgevoord AA, Lobbezoo F, de Vries N. Predicting upper airway collapse sites found in drug-induced sleep endoscopy from clinical data and snoring sounds in patients with obstructive sleep apnea: a prospective clinical study. J Clin Sleep Med 2022; 18:2119-2131. [PMID: 35459443 PMCID: PMC9435347 DOI: 10.5664/jcsm.9998] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2021] [Revised: 03/16/2022] [Accepted: 03/17/2022] [Indexed: 11/13/2022]
Abstract
STUDY OBJECTIVES The primary aim was to predict upper airway collapse sites found in drug-induced sleep endoscopy (DISE) from demographic, anthropometric, clinical examination, sleep study, and snoring sound parameters in patients with obstructive sleep apnea (OSA). The secondary aim was to identify the above-mentioned parameters that are associated with complete concentric collapse of the soft palate. METHODS All patients with OSA who underwent DISE and simultaneous snoring sound recording were enrolled in this study. Demographic, anthropometric, clinical examination (viz., modified Mallampati classification and Friedman tonsil classification), and sleep study parameters were extracted from the polysomnography and DISE reports. Snoring sound parameters during DISE were calculated. RESULTS One hundred and nineteen patients with OSA (79.8% men; age = 48.1 ± 12.4 years) were included. Increased body mass index was found to be associated with higher probability of oropharyngeal collapse (P < .01; odds ratio = 1.29). Patients with a high Friedman tonsil score were less likely to have tongue base collapse (P < .01; odds ratio = 0.12) and epiglottic collapse (P = .01; odds ratio = 0.20) than those with a low score. A longer duration of snoring events (P = .05; odds ratio = 2.99) was associated with a higher probability of complete concentric collapse of the soft palate. CONCLUSIONS Within the current patient profile and approach, given that only a limited number of predictors were identified, it does not seem feasible to predict upper airway collapse sites found in DISE from demographic, anthropometric, clinical examination, sleep study, and snoring sound parameters in patients with OSA. CITATION Huang Z, Bosschieter PFN, Aarab G, et al. Predicting upper airway collapse sites found in drug-induced sleep endoscopy from clinical data and snoring sounds in obstructive sleep apnea patients: a prospective clinical study. J Clin Sleep Med. 2022;18(9):2119-2131.
Affiliation(s)
- Zhengfei Huang
- Department of Orofacial Pain and Dysfunction, Academic Center for Dentistry Amsterdam (ACTA), University of Amsterdam and Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Department of Clinical Neurophysiology, OLVG, Amsterdam, The Netherlands
- Pien F.N. Bosschieter
- Department of Otorhinolaryngology–Head and Neck Surgery, OLVG, Amsterdam, The Netherlands
- Ghizlane Aarab
- Department of Orofacial Pain and Dysfunction, Academic Center for Dentistry Amsterdam (ACTA), University of Amsterdam and Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Maurits K.A. van Selms
- Department of Orofacial Pain and Dysfunction, Academic Center for Dentistry Amsterdam (ACTA), University of Amsterdam and Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Joost W. Vanhommerig
- Department of Research and Epidemiology, OLVG Hospital, Amsterdam, The Netherlands
- Frank Lobbezoo
- Department of Orofacial Pain and Dysfunction, Academic Center for Dentistry Amsterdam (ACTA), University of Amsterdam and Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Nico de Vries
- Department of Orofacial Pain and Dysfunction, Academic Center for Dentistry Amsterdam (ACTA), University of Amsterdam and Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Department of Otorhinolaryngology–Head and Neck Surgery, OLVG, Amsterdam, The Netherlands
- Department of Otorhinolaryngology–Head and Neck Surgery, Antwerp University Hospital (UZA), Antwerp, Belgium
|
22
|
Marchese MR, Proietti I, Longobardi Y, Mari G, Ausili Cefaro C, D’Alatri L. Multidimensional voice assessment after Lee Silverman Voice Therapy (LSVT®) in Parkinson's disease. Acta Otorhinolaryngol Ital 2022; 42:348-354. [PMID: 36254651 PMCID: PMC9577687 DOI: 10.14639/0392-100x-n1962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2021] [Accepted: 04/21/2022] [Indexed: 11/06/2022]
Abstract
Objective To investigate the effectiveness of Lee Silverman Voice Treatment (LSVT®) in improving prosody in patients with Parkinson’s disease over medium-term follow-up. Methods Fifteen patients with Parkinson’s disease were assessed before LSVT® and within one week, 3 months, and 6 months after treatment. Subjective and objective evaluation included: Voice Handicap Index - 10 (VHI-10), perceptual assessment by GRBAS scale and item 18 of the Unified Parkinson’s Disease Rating Scale III (UPDRS III), maximum phonation time (MPT /s/) and acoustic analysis by means of the Voice Range Profile (VRP) and the “Intonation Stimulability Protocol” of the Motor Speech Profile (MSP). Results A significant increase of the mean values of Imax and rF0 was observed until 6 months post-therapy (p < 0.001), whereas Running Speech Standard Deviation (rSTD) (p = 0.004), Amplitude Variability (rVAm) (p = 0.02) and Frequency Variability (rvF0) (p = 0.01) improved significantly after 3 months, but returned to pre-therapy levels after 6 months. The score of item 18 of the UPDRS III increased significantly early post-therapy (p = 0.03), but did not maintain the improvement at 3 and 6 months. Median values of Grade (G) and Asthenia (A) and the mean VHI-10 score decreased significantly at each post-therapy control (p < 0.05). Conclusions In addition to the subjective and perceptual beneficial effect of LSVT®, we found a long-lasting increase of loudness and fundamental frequency. There was also improvement of acoustic parameters related to prosody, although it was temporary.
Affiliation(s)
- Maria Raffaella Marchese
- Otorhinolaryngology Head & Neck Surgery Unit, Fondazione Policlinico Universitario “A. Gemelli” - IRCCS - Rome, Italy
| | - Ilaria Proietti
- Otorhinolaryngology Head & Neck Surgery Unit, Fondazione Policlinico Universitario “A. Gemelli” - IRCCS - Rome, Italy
| | - Ylenia Longobardi
- Otorhinolaryngology Head & Neck Surgery Unit, Fondazione Policlinico Universitario “A. Gemelli” - IRCCS - Rome, Italy
| | - Giorgia Mari
- Otorhinolaryngology Head & Neck Surgery Unit, Fondazione Policlinico Universitario “A. Gemelli” - IRCCS - Rome, Italy. Correspondence: Giorgia Mari, Unità Operativa Complessa di Otorinolaringoiatria, Dipartimento di Scienze dell’Invecchiamento, Neurologiche, Ortopediche e della Testa-Collo, Fondazione Policlinico Universitario “A. Gemelli” IRCCS, largo “A. Gemelli” 8, 00168 Rome, Italy. Tel. +39 06 30155193. Fax +39 06 3051194
| | - Carolina Ausili Cefaro
- Otorhinolaryngology Head & Neck Surgery Unit, Fondazione Policlinico Universitario “A. Gemelli” - IRCCS - Rome, Italy
| | - Lucia D’Alatri
- Otorhinolaryngology Head & Neck Surgery Unit, Fondazione Policlinico Universitario “A. Gemelli” - IRCCS - Rome, Italy, Head & Neck Department, Università Cattolica del Sacro Cuore, Rome, Italy
|
23
|
Li W, He L, Jin X, Li L, Sun C, Wang C. Isolated dysarthria as the sole manifestation of myasthenia gravis: a case report. J Int Med Res 2022; 50:3000605221109395. [PMID: 35915860 PMCID: PMC9350514 DOI: 10.1177/03000605221109395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Myasthenia gravis (MG) is an acquired autoimmune disease. Its clinical manifestations comprise ptosis, diplopia, dysarthria, dysphagia, limb weakness, and in severe cases, respiratory muscle involvement. Dysarthria as an exclusive initial and primary complaint in MG is rare and seldom reported. In this paper, we report a case of type IIIb MG with isolated dysarthria as the only clinical manifestation and we review the relevant literature. The patient was a 62-year-old man who presented with episodes of slurred speech for 20 days that had worsened in the previous 9 days. His medical history comprised hypertension, diabetes mellitus, and coronary heart disease. The initial diagnosis on admission was transient ischemic attack. Careful re-examination of the patient’s history revealed that his symptoms mainly involved increasingly worse slurred speech episodes without drinking or swallowing difficulties, and no significant improvement with rest was observed. Electromyography and autoantibody profiling led to a diagnosis of type IIIb MG. His symptoms improved after the oral administration of pyridostigmine bromide 60 mg. Laryngeal MG is important to differentiate from stroke. It is necessary to perform a computerized voice analysis when encountering patients with atypical symptoms of MG.
Affiliation(s)
- Wei Li
- Department of Geriatrics, ZiBo Central Hospital, Zibo, China
| | - Ling He
- Department of Neurology, Jilin Central General Hospital, Jilin, China
| | - Xiaodong Jin
- Department of Geriatrics, ZiBo Central Hospital, Zibo, China
| | - Li Li
- Department of Geriatrics, ZiBo Central Hospital, Zibo, China
| | - Congcong Sun
- Department of Neurology, Qilu Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China
| | - Cuilan Wang
- Department of Neurology, Qilu Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China
|
24
|
Mekiš J, Strojan P, Mekiš D, Hočevar Boltežar I. Change in Voice Quality after Radiotherapy for Early Glottic Cancer. Cancers (Basel) 2022; 14. [PMID: 35740656 DOI: 10.3390/cancers14122993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2022] [Revised: 06/09/2022] [Accepted: 06/15/2022] [Indexed: 11/16/2022]
Abstract
Our aim was to track the changes in voice quality for two years after radiotherapy (RT) for early glottic cancer. A videoendostroboscopy, subjective patient and phoniatrician voice assessments, a Voice Handicap Index questionnaire, and objective acoustic measurements (F0, jitter, shimmer, maximal phonation time) were performed on 50 patients with T1 glottic carcinomas at 3, 12, and 24 months post-RT. The results were compared between the subsequent assessments, and between the assessments at 3 months and 24 months post-RT. The stroboscopy showed a gradual progression of fibrosis of the vocal folds with a significant difference apparent when the assessments at 3 months and 24 months were compared (p < 0.001). Almost all of the subjective assessments of voice quality showed an improvement during the first 2 years, but significant differences were noted at 24 months. Jitter and shimmer deteriorated in the first year after RT with a significant deterioration noticed between the sixth and twelfth months (p = 0.048 and p = 0.002, respectively). Two years after RT, only 8/50 (16%) patients had normal voices. The main reasons for a decreased voice quality after RT for early glottic cancer were post-RT changes in the larynx. Despite a significant improvement in the voice after RT shown in a few of the evaluation methods, only a minority of the patients had a normal voice two years post-RT.
|
25
|
Cowan T, Cohen AS, Raugh IM, Strauss GP. Ambulatory audio and video recording for digital phenotyping in schizophrenia: Adherence & data usability. Psychiatry Res 2022; 311:114485. [PMID: 35276573 PMCID: PMC9018573 DOI: 10.1016/j.psychres.2022.114485] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/08/2021] [Revised: 02/21/2022] [Accepted: 02/23/2022] [Indexed: 12/18/2022]
Abstract
Ambulatory audio and video recording provides a wealth of information which can be used for a broad range of applications, including digital phenotyping, telepsychiatry, and telepsychology. However, these technologies are in their infancy, and guidelines for their use and analysis have yet to be established. The current project used ambulatory assessment data from individuals with schizophrenia (N = 52) and controls (N = 55) over a week to assess factors influencing sufficiency and useability of video and audio data. Logistic multilevel models examined the effect of relevant variables on video provision and video quality. There was no difference by group in video provision or quality. Videos were less likely to be provided later in the study and later in the day. Video quality was lower later in the day, particularly for controls. Participants were more likely to provide videos if alone or at home than in other settings. Black participants were less likely to have analyzable video frames than White participants. These results suggest potential racial disparities in camera technologies and/or facial analysis algorithms. Implications of these findings and recommendations for future study development, such as instructions to provide to participants to optimize video quality, are discussed.
Affiliation(s)
- Tovah Cowan
- Department of Psychology, Louisiana State University, Baton Rouge, USA; Center for Computation and Technology, Louisiana State University, Baton Rouge, USA
| | - Alex S. Cohen
- Department of Psychology, Louisiana State University, Baton Rouge, USA; Center for Computation and Technology, Louisiana State University, Baton Rouge, USA
| | - Ian M. Raugh
- Department of Psychology, University of Georgia, Athens, USA
| | - Gregory P. Strauss
- Department of Psychology, University of Georgia, Athens, USA. Correspondence concerning this article should be addressed to Gregory P. Strauss, PhD, Department of Psychology, Psychology Building, University of Georgia, Athens, GA 30602-3013
|
26
|
Nguyen DD, Chacon A, Payten C, Black R, Sheth M, McCabe P, Novakovic D, Madill C. Acoustic characteristics of fricatives, amplitude of formants and clarity of speech produced without and with a medical mask. Int J Lang Commun Disord 2022; 57:366-380. [PMID: 35166414 PMCID: PMC9305964 DOI: 10.1111/1460-6984.12705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2021] [Revised: 01/14/2022] [Accepted: 01/17/2022] [Indexed: 06/14/2023]
Abstract
BACKGROUND Previous research has found that high-frequency energy of speech signals decreased while wearing face masks. However, no study has examined the specific spectral characteristics of fricative consonants and vowels and the perception of clarity of speech in mask wearing. AIMS To investigate acoustic-phonetic characteristics of fricative consonants and vowels and auditory perceptual rating of clarity of speech produced with and without wearing a face mask. METHODS & PROCEDURES A total of 16 healthcare workers read the Rainbow Passage using modal phonation in three conditions: without a face mask, with a standard surgical mask and with a KN95 mask (China GB2626-2006, a medical respirator with higher barrier level than the standard surgical mask). Speech samples were acoustically analysed for root mean square (RMS) amplitude (ARMS) and spectral moments of four fricatives /f/, /s/, /ʃ/ and /z/; and amplitude of the first three formants (A1, A2 and A3) measured from the reading passage and extracted vowels. Auditory perceptual rating of speech clarity was performed. Data were compared across mask and non-mask conditions using linear mixed models. OUTCOMES & RESULTS The ARMS of all included fricatives was significantly lower in surgical mask and KN95 mask compared with non-mask condition. Centre of gravity of /f/ decreased in both surgical and KN95 mask while other spectral moments did not show systematic significant linear trends across mask conditions. None of the formant amplitude measures was statistically different across conditions. Speech clarity was significantly poorer in both surgical and KN95 mask conditions. CONCLUSIONS & IMPLICATIONS Speech produced while wearing either a surgical mask or KN95 mask was associated with decreased fricative amplitude and poorer speech clarity.
WHAT THIS PAPER ADDS What is already known on the subject Previous studies have shown that the overall spectral levels in high frequency ranges and intelligibility are decreased for speech produced with a face mask. It is unclear how different types of speech signals, that is, fricatives and vowels, are presented in speech produced while wearing either a medical surgical or KN95 mask. It is also unclear whether ratings of speech clarity are similar for speech produced with these face masks. What this paper adds to existing knowledge Speech data collected in real-world, clinical, non-laboratory-controlled settings showed differences in the amplitude of fricatives and speech clarity ratings between non-mask and mask-wearing conditions. Formant amplitude did not show significant differences in mask-wearing conditions compared with non-mask. What are the potential or actual clinical implications of this work? Wearing a surgical mask or a KN95 mask had different effects on consonants and vowels. It appeared from the findings in this study that these masks only affected fricative consonants and did not affect vowel production. The poorer speech clarity in these mask-wearing conditions has important implications for speech perception in communication between clinical staff and between medical officers and patients in clinics, and between people in everyday situations. The impact of these masks on speech perception may be more pronounced in people with hearing impairment and communication disorders. In voice evaluation and/or therapy sessions, the effects of wearing a medical mask can occur bidirectionally for both the clinician and the patient. The patient may find it more challenging to understand the speech conveyed by the clinician while the clinician may not perceptually assess the patient's speech and voice accurately. Given the significant correlation between clarity ratings and fricative amplitude, improving fricative signals would be useful to improve speech clarity while wearing these medical face masks.
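The two fricative measures named in the methods, RMS amplitude and centre of gravity (the first spectral moment), can be illustrated with a minimal sketch. This is not the authors' analysis pipeline: the function names, the power weighting of the spectrum, and the synthetic 1 kHz test tone are assumptions made purely for illustration.

```python
import cmath
import math


def rms_amplitude(samples):
    """Root mean square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def centre_of_gravity(samples, sample_rate):
    """Power-weighted mean frequency in Hz (first spectral moment).

    Uses a naive DFT over the non-negative frequency bins; the power
    weighting is an assumption, and a silent signal would divide by zero.
    """
    n = len(samples)
    total = 0.0
    weighted = 0.0
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        power = abs(coeff) ** 2
        total += power
        weighted += power * k * sample_rate / n
    return weighted / total


# Hypothetical signal: 64 samples of a 1 kHz sine sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(64)]
quieter = [0.5 * x for x in tone]  # same spectrum shape, half the amplitude
```

Scaling the signal, as in `quieter`, halves the RMS amplitude but leaves the centre of gravity where it was, which is one way to see that the two measures capture different properties of a fricative.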
Affiliation(s)
- Duy Duong Nguyen
- Voice Research Laboratory, Faculty of Medicine and Health, Susan Wakil Health Building, Camperdown Campus, The University of Sydney, Sydney, NSW, Australia
- National Hospital of Otorhinolaryngology, Hanoi, Vietnam
| | - Antonia Chacon
- Voice Research Laboratory, Faculty of Medicine and Health, Susan Wakil Health Building, Camperdown Campus, The University of Sydney, Sydney, NSW, Australia
| | - Christopher Payten
- Voice Research Laboratory, Faculty of Medicine and Health, Susan Wakil Health Building, Camperdown Campus, The University of Sydney, Sydney, NSW, Australia
| | - Rebecca Black
- Voice Research Laboratory, Faculty of Medicine and Health, Susan Wakil Health Building, Camperdown Campus, The University of Sydney, Sydney, NSW, Australia
| | - Meet Sheth
- Voice Research Laboratory, Faculty of Medicine and Health, Susan Wakil Health Building, Camperdown Campus, The University of Sydney, Sydney, NSW, Australia
| | - Patricia McCabe
- Voice Research Laboratory, Faculty of Medicine and Health, Susan Wakil Health Building, Camperdown Campus, The University of Sydney, Sydney, NSW, Australia
| | - Daniel Novakovic
- Voice Research Laboratory, Faculty of Medicine and Health, Susan Wakil Health Building, Camperdown Campus, The University of Sydney, Sydney, NSW, Australia
- The Canterbury Hospital, Campsie, NSW, Australia
- Sydney Voice and Swallowing, St Leonards, NSW, Australia
| | - Catherine Madill
- Voice Research Laboratory, Faculty of Medicine and Health, Susan Wakil Health Building, Camperdown Campus, The University of Sydney, Sydney, NSW, Australia
|
27
|
Aichert I, Lehner K, Falk S, Späth M, Franke M, Ziegler W. In Time with the Beat: Entrainment in Patients with Phonological Impairment, Apraxia of Speech, and Parkinson's Disease. Brain Sci 2021; 11:brainsci11111524. [PMID: 34827523 PMCID: PMC8615970 DOI: 10.3390/brainsci11111524] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/02/2021] [Revised: 11/08/2021] [Accepted: 11/12/2021] [Indexed: 11/25/2022]
Abstract
In the present study, we investigated if individuals with neurogenic speech sound impairments of three types, Parkinson’s dysarthria, apraxia of speech, and aphasic phonological impairment, accommodate their speech to the natural speech rhythm of an auditory model, and if so, whether the effect is more significant after hearing metrically regular sentences as compared to those with an irregular pattern. This question builds on theories of rhythmic entrainment, assuming that sensorimotor predictions of upcoming events allow humans to synchronize their actions with an external rhythm. To investigate entrainment effects, we conducted a sentence completion task relating participants’ response latencies to the spoken rhythm of the prime heard immediately before. A further research question was if the perceived rhythm interacts with the rhythm of the participants’ own productions, i.e., the trochaic or iambic stress pattern of disyllabic target words. For a control group of healthy speakers, our study revealed evidence for entrainment when trochaic target words were preceded by regularly stressed prime sentences. Persons with Parkinson’s dysarthria showed a pattern similar to that of the healthy individuals. For the patient groups with apraxia of speech and with phonological impairment, considerably longer response latencies with differing patterns were observed. Trochaic target words were initiated with significantly shorter latencies, whereas the metrical regularity of prime sentences had no consistent impact on response latencies and did not interact with the stress pattern of the target words to be produced. The absence of an entrainment in these patients may be explained by the more severe difficulties in initiating speech at all. We discuss the results in terms of clinical implications for diagnostics and therapy in neurogenic speech disorders.
Affiliation(s)
- Ingrid Aichert
- Clinical Neuropsychology Research Group, Institute of Phonetics and Speech Processing, Ludwig-Maximilians-Universität München, 80799 Munich, Germany
- Corresponding author
| | - Katharina Lehner
- Clinical Neuropsychology Research Group, Institute of Phonetics and Speech Processing, Ludwig-Maximilians-Universität München, 80799 Munich, Germany
| | - Simone Falk
- International Laboratory for Brain, Music and Sound Research (BRAMS), Département de Linguistique et de Traduction, Université de Montréal, Montréal, QC H3C 3J7, Canada
| | - Mona Späth
- Neolexon, Limedix GmbH, 80538 Munich, Germany
| | - Mona Franke
- Clinical Neuropsychology Research Group, Institute of Phonetics and Speech Processing, Ludwig-Maximilians-Universität München, 80799 Munich, Germany
| | - Wolfram Ziegler
- Clinical Neuropsychology Research Group, Institute of Phonetics and Speech Processing, Ludwig-Maximilians-Universität München, 80799 Munich, Germany
|
28
|
Chang WD, Chen SH, Tsai MH, Tsou YA. Autologous Fat Injection Laryngoplasty for Unilateral Vocal Fold Paralysis. J Clin Med 2021; 10:5034. [PMID: 34768558 DOI: 10.3390/jcm10215034] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/01/2021] [Revised: 10/22/2021] [Accepted: 10/25/2021] [Indexed: 11/16/2022]
Abstract
Background: Unilateral vocal fold paralysis (UVFP) affects the voice and swallowing function and can be treated with various injection materials to achieve an improved mucosal wave and better closure during phonation. Injection laryngoplasty is considered an exemplary method for these patients and can be performed as early as possible. We conducted a systematic review and meta-analysis of the subjective and objective outcomes of autologous fat injection laryngoplasty (AFIL) and assessed its effects in patients with UVFP. Methods: We searched the PubMed and EBSCO databases, with PRISMA appraisal, for articles about the effects of AFIL on UVFP. The published articles were reviewed according to our inclusion and exclusion criteria. The short- and long-term outcomes of perceptual assessment, acoustic analysis, and quality of life were also analyzed by meta-analysis. Results: Eleven articles were reviewed, and seven studies were selected for meta-analysis. AFIL improves the perceptual outcome and some voice parameters, i.e., jitter, shimmer, and maximal phonation time (MPT), in both short-term and long-term results. It also significantly improved the voice handicap index (VHI) in the long term, suggesting an increase in quality of life. Conclusions: AFIL is considered a reliable treatment method for UVFP, and its effect could even last for over 12 months.
|
29
|
Montazeri Ghahjaverestan N, Saha S, Kabir M, Gavrilovic B, Zhu K, Yadollahi A. Sleep apnea severity based on estimated tidal volume and snoring features from tracheal signals. J Sleep Res 2021; 31:e13490. [PMID: 34553793 DOI: 10.1111/jsr.13490] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/25/2021] [Revised: 08/20/2021] [Accepted: 09/07/2021] [Indexed: 02/01/2023]
Abstract
Sleep apnea can be characterized by reductions in the respiratory tidal volume. Previous studies showed that the tidal volume can be estimated from tracheal sounds and movements called tracheal signals. Additionally, tracheal sounds include the sounds of snoring, a common symptom of obstructive sleep apnea. This study investigates the feasibility of estimating the severity of sleep apnea, as quantified by the apnea/hypopnea index (AHI), using the estimated tidal volume and snoring sounds extracted from tracheal signals. Tracheal signals were recorded simultaneously with polysomnography (PSG). The tidal volume was estimated from tracheal signals. The reductions in the tidal volume were detected as potential respiratory events. Additionally, features related to snoring sounds, which quantified variability, temporal clusters, and dominant frequency of snores, were extracted. A step-wise regression model and a greedy search algorithm were used sequentially to select the optimal set of features to estimate the apnea/hypopnea index and classify participants into healthy individuals and patients with sleep apnea. Sixty-one participants with suspected sleep apnea (age: 51 ± 16, body mass index: 29.5 ± 6.4 kg/m², apnea/hypopnea index: 20.2 ± 21.2 events/h) who were referred for a sleep test were recruited. The estimated apnea/hypopnea index was strongly correlated with the polysomnography-based apnea/hypopnea index (R² = 0.76, p < 0.001). The accuracy of detecting sleep apnea for the apnea/hypopnea index cutoff of 15 events/h was 78.69% and 83.61% with and without using snore-related features, respectively. These findings suggest that acoustic estimation of airflow and snore-related features can provide a convenient and reliable method for screening of sleep apnea.
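The index and the cutoff-based accuracy reported above can be made concrete with a short sketch. It assumes the usual definition of the AHI as respiratory events per hour of sleep and treats a value at or above the cutoff as positive; the function names and the handling of the boundary case are assumptions for illustration, not the authors' implementation.

```python
def apnea_hypopnea_index(n_events, sleep_hours):
    """Respiratory events per hour of sleep (assumed standard AHI definition)."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    return n_events / sleep_hours


def screening_accuracy(estimated_ahi, reference_ahi, cutoff=15.0):
    """Fraction of participants that both methods place on the same side of the cutoff.

    Treating AHI >= cutoff as positive is an assumption; the abstract does not
    say how a value exactly at the cutoff was classified.
    """
    if not reference_ahi or len(estimated_ahi) != len(reference_ahi):
        raise ValueError("inputs must be non-empty and of equal length")
    agree = sum((est >= cutoff) == (ref >= cutoff)
                for est, ref in zip(estimated_ahi, reference_ahi))
    return agree / len(reference_ahi)


# Hypothetical values for four participants (events/h), not study data.
estimated = [4.0, 18.5, 22.0, 13.0]
reference = [6.0, 16.0, 30.0, 17.0]
accuracy = screening_accuracy(estimated, reference)  # 3 of 4 agree
```

With 61 participants, the reported accuracies of 78.69% and 83.61% are numerically consistent with 48 and 51 participants classified in agreement with polysomnography, although the abstract does not state these counts.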
Affiliation(s)
- Nasim Montazeri Ghahjaverestan
- KITE, Toronto Rehabilitation Institute-University Health Network, Toronto, ON, Canada; Institute of Biomaterials and Biomedical Engineering, University of Toronto, Toronto, ON, Canada
| | - Shumit Saha
- KITE, Toronto Rehabilitation Institute-University Health Network, Toronto, ON, Canada; Institute of Biomaterials and Biomedical Engineering, University of Toronto, Toronto, ON, Canada
| | - Muammar Kabir
- KITE, Toronto Rehabilitation Institute-University Health Network, Toronto, ON, Canada; Institute of Biomaterials and Biomedical Engineering, University of Toronto, Toronto, ON, Canada
| | - Bojan Gavrilovic
- KITE, Toronto Rehabilitation Institute-University Health Network, Toronto, ON, Canada; Institute of Biomaterials and Biomedical Engineering, University of Toronto, Toronto, ON, Canada
| | - Kaiyin Zhu
- KITE, Toronto Rehabilitation Institute-University Health Network, Toronto, ON, Canada
| | - Azadeh Yadollahi
- KITE, Toronto Rehabilitation Institute-University Health Network, Toronto, ON, Canada; Institute of Biomaterials and Biomedical Engineering, University of Toronto, Toronto, ON, Canada
|
30
|
León Gómez NM, Delgado Hernández J, Luis Hernández J, Artazkoz Del Toro JJ. Objective Analysis Of Voice Quality In Patients With Thyroid Pathology. Clin Otolaryngol 2021; 47:81-87. [PMID: 34516048 DOI: 10.1111/coa.13860] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/23/2021] [Revised: 08/11/2021] [Accepted: 08/28/2021] [Indexed: 11/30/2022]
Abstract
OBJECTIVE The goal of this study is to analyze the voice in patients with thyroid pathology through two objective indexes with great diagnostic accuracy. Overall vocal quality was evaluated with the Acoustic Voice Quality Index (AVQI v.03.01) and the breathy voice with the Acoustic Breathiness Index (ABI). DESIGN Observational case-control study. SETTING Hospital Universitario Nuestra Señora de Candelaria. PARTICIPANTS Fifty-eight subjects, 29 controls and 29 thyroidectomy candidates. MAIN OUTCOME MEASURES All participants with thyroid pathology completed the Spanish version of Voice Handicap Index-10. Also, patient complaints relating to possible laryngeal dysfunction were assessed through closed questions. A sustained vowel and three phonetically balanced sentences were recorded for each subject (118 samples). AVQI v.03.01 and ABI were assessed using the Praat program. Two raters perceptually evaluated each voice sample by using the Grade parameter of the GRBAS scale. RESULTS Acoustic analysis shows that 55.17% of subjects present values above the pathological threshold of the AVQI, and 58.62% above that of the ABI. Results of the Student's t-test comparisons of the AVQI and ABI values between the control group and the thyroid group show significantly higher values of AVQI (t[56] = -3.85, p < .001) and ABI (t[54.39] = -4.82, p < .001) in thyroidectomy candidates. CONCLUSION A mild decrease in vocal quality is part of the symptomatology presented by thyroidectomy candidates.
Affiliation(s)
- Nieves María León Gómez
- Department of Rehabilitation, Unit of Speech-Language Therapy, HUNSC, Tenerife, Spain; Department of Developmental and Educational Psychology, La Laguna University, Tenerife, Spain
| | - Jonathan Delgado Hernández
- Department of Developmental and Educational Psychology, La Laguna University, Tenerife, Spain; Department of Speech-Language Therapy, CREN Salud, La Laguna, Tenerife, Spain
| | - Jorge Luis Hernández
- Department of Otorhinolaryngology, Nuestra Señora de la Candelaria University Hospital, Tenerife, Spain
|
31
|
Rusz J, Tykalová T, Novotný M, Zogala D, Růžička E, Dušek P. Automated speech analysis in early untreated Parkinson's disease: Relation to gender and dopaminergic transporter imaging. Eur J Neurol 2021; 29:81-90. [PMID: 34498329 DOI: 10.1111/ene.15099] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 07/12/2021] [Revised: 08/31/2021] [Accepted: 09/01/2021] [Indexed: 01/01/2023]
Abstract
BACKGROUND The mechanisms underlying speech abnormalities in Parkinson's disease (PD) remain poorly understood, with most of the available evidence based on male patients. This study aimed to estimate the occurrence and characteristics of speech disorder in early, drug-naive PD patients with relation to gender and dopamine transporter imaging. METHODS Speech samples from 60 male and 40 female de novo PD patients as well as 60 male and 40 female age-matched healthy controls were analyzed. Quantitative acoustic vocal assessment of 10 distinct speech dimensions related to phonation, articulation, prosody, and speech timing was performed. All patients were evaluated using [123I]-2β-carbomethoxy-3β-(4-iodophenyl)-N-(3-fluoropropyl)nortropane single-photon emission computed tomography and Montreal Cognitive Assessment. RESULTS The prevalence of speech abnormalities in the de novo PD cohort was 56% for male and 65% for female patients, mainly manifested with monopitch, monoloudness, and articulatory decay. Automated speech analysis enabled discrimination between PD and controls with an area under the curve of 0.86 in men and 0.93 in women. No gender-specific speech dysfunction in de novo PD was found. Regardless of disease status, females generally showed better performance in voice quality, consonant articulation, and pauses production than males, who were better only in loudness variability. The extent of monopitch was correlated to nigro-putaminal dopaminergic loss in men (r = 0.39, p = 0.003) and the severity of imprecise consonants was related to cognitive deficits in women (r = -0.44, p = 0.005). CONCLUSIONS Speech abnormalities represent a frequent and early marker of motor abnormalities in PD. Despite some gender differences, our findings demonstrate that speech difficulties are associated with nigro-putaminal dopaminergic deficits.
Affiliation(s)
- Jan Rusz
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czechia; Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czechia
| | - Tereza Tykalová
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czechia
| | - Michal Novotný
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czechia
| | - David Zogala
- First Faculty of Medicine, Institute of Nuclear Medicine, Charles University and General University Hospital, Prague, Czechia
| | - Evžen Růžička
- Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czechia
| | - Petr Dušek
- Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czechia
|
32
|
Ge S, Wan Q, Yin M, Wang Y, Huang Z. Quantitative acoustic metrics of vowel production in mandarin-speakers with post-stroke spastic dysarthria. Clin Linguist Phon 2021; 35:779-792. [PMID: 32985269 DOI: 10.1080/02699206.2020.1827295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2020] [Revised: 09/16/2020] [Accepted: 09/19/2020] [Indexed: 06/11/2023]
Abstract
Impairment of vowel production in dysarthria has been highly valued. This study aimed to explore the vowel production of Mandarin-speakers with post-stroke spastic dysarthria in connected speech and to explore the influence of gender and tone on the vowel production. Multiple vowel acoustic metrics, including F1 range, F2 range, vowel space area (VSA), vowel articulation index (VAI) and formant centralization ratio (FCR), were analyzed from vowel tokens embedded in connected speech produced. The participants included 25 clients with spastic dysarthria secondary to stroke (15 males, 10 females) and 25 speakers with no history of neurological disease (15 males, 10 females). Variance analyses were conducted and the results showed that the main effects of population, gender, and tone on F2 range, VSA, VAI, and FCR were all significant. Vowel production became centralized in the clients with post-stroke spastic dysarthria. Vowel production was found to be more centralized in males compared to females. Vowels in neutral tone (T0) were the most centralized among the other tones. The quantitative acoustic metrics of F2 range, VSA, VAI, and FCR were effective in predicting vowel production in Mandarin-speaking clients with post-stroke spastic dysarthria, and hence may be used as powerful tools to assess the speech performance for this population.
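Three of the metrics listed above have simple closed forms in terms of the first and second formants of the corner vowels. The sketch below uses the definitions commonly given in the literature for a three-vowel (/i/, /a/, /u/) space; the abstract does not state which vowels or formula variants the authors used, so the choice of corner vowels, the function name, and the formant values are assumptions for illustration only.

```python
def vowel_metrics(f1_i, f2_i, f1_a, f2_a, f1_u, f2_u):
    """Return (VSA, FCR, VAI) from corner-vowel formants given in Hz."""
    # Triangular vowel space area: shoelace formula over the (F1, F2)
    # points of /i/, /a/ and /u/, in Hz squared.
    vsa = 0.5 * abs(f1_i * (f2_a - f2_u)
                    + f1_a * (f2_u - f2_i)
                    + f1_u * (f2_i - f2_a))
    # Formant centralization ratio: grows as vowels move toward the centre.
    fcr = (f2_u + f2_a + f1_i + f1_u) / (f2_i + f1_a)
    # Vowel articulation index: the reciprocal of the FCR.
    vai = 1.0 / fcr
    return vsa, fcr, vai


# Hypothetical formant values (Hz), not taken from the study.
typical = vowel_metrics(300, 2300, 750, 1300, 350, 800)
centralized = vowel_metrics(350, 2000, 650, 1350, 400, 1000)
```

In this toy comparison the centralized set yields a smaller VSA and VAI and a larger FCR, which is the direction of change the abstract describes for speakers with dysarthria.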
Affiliation(s)
- Shengnan Ge
- Department of Education and Rehabilitation, Faculty of Education, East China Normal University, Shanghai, China
| | - Qin Wan
- Department of Education and Rehabilitation, Faculty of Education, East China Normal University, Shanghai, China
| | - Minmin Yin
- Department of Education and Rehabilitation, Faculty of Education, East China Normal University, Shanghai, China
| | - Yongli Wang
- Department of Education and Rehabilitation, Faculty of Education, East China Normal University, Shanghai, China
| | - Zhaoming Huang
- Department of Education and Rehabilitation, Faculty of Education, East China Normal University, Shanghai, China
| |
Collapse
|
33
|
Eravci FC, Yildiz BD, Özcan KM, Moran M, Çolak M, Karakurt SE, Karakuş MF, Ikinciogullari A. Acoustic parameter changes after bariatric surgery. Logoped Phoniatr Vocol 2021; 47:256-261. [PMID: 34213387 DOI: 10.1080/14015439.2021.1945676] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Abstract
OBJECTIVE To investigate the acoustic parameter changes after weight loss in bariatric surgery patients. MATERIALS AND METHODS This prospective, longitudinal study was conducted with 15 patients with planned bariatric surgery, who were evaluated pre-operatively and at 6 months post-operatively. Fundamental frequency (F0), formant frequency (F1, F2, F3, and F4), frequency perturbation (jitter), amplitude perturbation (shimmer) and noise-to-harmonics ratio (NHR) parameters were evaluated for the vowels /a/, /e/, /i/, /o/, and /u/. Changes in the acoustic analysis parameters for each vowel were compared. The study group was separated into two groups according to whether the Mallampati score had not changed (Group 1) or had decreased (Group 2), and changes in the formant frequencies were compared between these groups. RESULTS A total of 15 patients with a median age of 40 ± 11 years completed the study. The median weight of the patients was 122 ± 14 kg pre-operatively and 80 ± 15 kg post-operatively. BMI declined from 46 ± 4 to 31 ± 5 kg/m². The Mallampati score decreased by one point in six patients and remained stable in nine. Of the acoustic voice analysis parameters of vowels, in general, fundamental frequency tended to decrease, and shimmer and jitter values tended to increase. Some of the formant frequencies were specifically affected by the weight loss and this showed statistical significance between Group 1 and Group 2. CONCLUSION The present study reveals that some specific voice characteristics might be affected by successful weight loss after bariatric surgery.
Highlights
- Obesity reduces the size of the pharyngeal lumen at different levels.
- The supralaryngeal vocal tract size and configuration is a determinative factor in the features of the voice.
- Changes in the length and shape of the vocal tract, or height and position of the tongue, can result in changes especially in formant frequencies in acoustic analysis.
Affiliation(s)
- Fakih Cihat Eravci
- Department of Otorhinolaryngology, Meram Medical Faculty, Necmettin Erbakan University, Konya, Turkey
- Barış Doğu Yildiz
- Department of General Surgery, University of Health Science, Ankara Numune Training and Research Hospital, Ankara, Turkey
- Kürşat Murat Özcan
- Department of Otorhinolaryngology, University of Health Science, Ankara Numune Training and Research Hospital, Ankara, Turkey
- Münevver Moran
- Department of General Surgery, University of Health Science, Ankara Numune Training and Research Hospital, Ankara, Turkey; Department of General Surgery, Liv Hospital Ankara, Ankara, Turkey
- Mustafa Çolak
- Department of Otorhinolaryngology, University of Health Science, Ankara Numune Training and Research Hospital, Ankara, Turkey
- Süleyman Emre Karakurt
- Department of Otorhinolaryngology, University of Health Science, Ankara Numune Training and Research Hospital, Ankara, Turkey
- Mehmet Fatih Karakuş
- Department of Otorhinolaryngology, University of Health Science, Ankara Numune Training and Research Hospital, Ankara, Turkey
- Aykut Ikinciogullari
- Department of Otorhinolaryngology, University of Health Science, Ankara Numune Training and Research Hospital, Ankara, Turkey

34
|
Maffia M, De Micco R, Pettorino M, Siciliano M, Tessitore A, De Meo A. Speech Rhythm Variation in Early-Stage Parkinson's Disease: A Study on Different Speaking Tasks. Front Psychol 2021; 12:668291. [PMID: 34194369 PMCID: PMC8236634 DOI: 10.3389/fpsyg.2021.668291] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/15/2021] [Accepted: 05/17/2021] [Indexed: 11/25/2022] Open
Abstract
Patients with Parkinson's disease (PD) usually present with speech disorders, including, among other symptoms, altered speech rhythm. The purpose of this study is twofold: (1) to test the validity of two acoustic parameters (%V, the vowel percentage, and VtoV, the mean interval between two consecutive vowel onset points) for the identification of rhythm variation in early-stage PD speech and (2) to analyze the effect of PD on speech rhythm in two different speaking tasks: passage reading and monolog. A group of 20 patients with early-stage PD was compared with 20 age- and sex-matched healthy controls (HCs). The results of the acoustic analysis confirmed that %V is a useful cue for early-stage PD speech characterization, with significantly higher values in the speech of patients with PD than in HC speech. A simple speaking task, such as the reading task, was found to be more effective than spontaneous speech in the detection of rhythmic variations.
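The two rhythm parameters defined in this abstract can be computed directly from vowel boundaries in an annotated recording. The following is a minimal sketch, assuming vowel intervals are available as (onset, offset) pairs in seconds and that %V is taken relative to the total utterance duration; the function name, input format and example values are hypothetical and not taken from the study.

```python
def rhythm_metrics(vowel_intervals, utterance_duration):
    """Return (%V, VtoV) for one utterance.

    vowel_intervals: list of (onset, offset) pairs in seconds.
    utterance_duration: total duration in seconds (assumed denominator for %V).
    """
    if len(vowel_intervals) < 2:
        raise ValueError("need at least two vowels to compute VtoV")
    # %V: share of the utterance occupied by vocalic material.
    total_vowel = sum(offset - onset for onset, offset in vowel_intervals)
    percent_v = 100.0 * total_vowel / utterance_duration
    # VtoV: mean interval between consecutive vowel onset points.
    onsets = sorted(onset for onset, _ in vowel_intervals)
    gaps = [later - earlier for earlier, later in zip(onsets, onsets[1:])]
    vtov = sum(gaps) / len(gaps)
    return percent_v, vtov


# Hypothetical annotation of a one-second utterance with three vowels.
percent_v, vtov = rhythm_metrics([(0.10, 0.20), (0.35, 0.50), (0.70, 0.80)], 1.0)
print(f"%V={percent_v:.1f}, VtoV={vtov:.2f} s")
```

In this toy example the vowels occupy 0.35 s of a 1.0 s utterance and the onsets are 0.25 s and 0.35 s apart, so %V is about 35 and VtoV about 0.30 s.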
Affiliation(s)
- Marta Maffia
- Department of Literary, Linguistics and Comparative Studies, University “L'Orientale,” Naples, Italy
- Rosa De Micco
- Department of Advanced Medical and Surgical Sciences, University of Campania “Luigi Vanvitelli,” Naples, Italy
- Massimo Pettorino
- Department of Literary, Linguistics and Comparative Studies, University “L'Orientale,” Naples, Italy
- Mattia Siciliano
- Department of Advanced Medical and Surgical Sciences, University of Campania “Luigi Vanvitelli,” Naples, Italy
- Department of Psychology, University of Campania “Luigi Vanvitelli,” Caserta, Italy
- Alessandro Tessitore
- Department of Advanced Medical and Surgical Sciences, University of Campania “Luigi Vanvitelli,” Naples, Italy
- Anna De Meo
- Department of Literary, Linguistics and Comparative Studies, University “L'Orientale,” Naples, Italy

35
|
Villas-Bôas AP, Schwarz K, Fontanari AMV, Costa AB, Cardoso da Silva D, Schneider MA, Cielo CA, Spritzer PM, Rodrigues Lobato MI. Acoustic Measures of Brazilian Transgender Women's Voices: A Case-Control Study. Front Psychol 2021; 12:622526. [PMID: 34135803 PMCID: PMC8203313 DOI: 10.3389/fpsyg.2021.622526] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/01/2020] [Accepted: 04/09/2021] [Indexed: 11/30/2022] Open
Abstract
Objective: This study aims to compare the acoustic vocal analysis results of a group of transgender women relative to those of cisgender women. Methods: Thirty transgender women between the ages of 19 and 52 years old participated in the study. The control group was composed of 31 cisgender women between the ages of 20 and 48 years old. A standardized questionnaire was administered to collect general patient data to better characterize the participants. The vowel /a/ sounds of all participants were collected and analyzed by the Multi-Dimensional Voice Program advanced system. Results: Statistically significant differences between cisgender and transgender women were found on 14 measures: fundamental frequency, maximum fundamental frequency, minimum fundamental frequency, standard deviation of fundamental frequency, absolute jitter, percentage or relative jitter, fundamental frequency relative average perturbation, fundamental frequency perturbation quotient, smoothed fundamental frequency perturbation quotient, fundamental frequency variation, absolute shimmer, relative shimmer, voice turbulence index (lower values in the cases), and soft phonation index (higher values in the cases). The mean fundamental frequency value was 159.046 Hz for the cases and 192.435 Hz for the controls. Conclusion: Through glottal adaptations, the group of transgender women managed to feminize their voices, presenting voices that were less aperiodic and softer than those of cisgender women.
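Several of the measures listed in this abstract are cycle-to-cycle perturbation measures. The following is a minimal sketch of relative (percentage) jitter and shimmer, assuming the common "local" definition: the mean absolute difference between consecutive cycles, divided by the mean over all cycles. The Multi-Dimensional Voice Program may differ in detail, and the function name and sample values are hypothetical.

```python
def relative_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean.

    Applied to glottal periods this gives relative jitter; applied to
    per-cycle peak amplitudes it gives relative shimmer (assumed
    'local' definitions, not necessarily those of any specific tool).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


# Hypothetical glottal periods (ms) and per-cycle peak amplitudes.
jitter_percent = relative_perturbation([5.0, 5.1, 4.9, 5.0])
shimmer_percent = relative_perturbation([0.80, 0.82, 0.79, 0.81])
print(f"jitter={jitter_percent:.2f}%, shimmer={shimmer_percent:.2f}%")
```

Lower values indicate a more periodic signal, which is the sense in which the abstract describes some voices as "less aperiodic".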
Affiliation(s)
- Anna Paula Villas-Bôas
- Gender Identity Program, Universidade Federal do Rio Grande do Sul, Programa de Pós Graduação em Psiquiatria e Ciências do Comportamento, Porto Alegre, Brazil
- Karine Schwarz
- Identity Program, Universidade Federal do Rio Grande do Sul, Programa de Pós Graduação em Ciências Médicas: Endocrinologia, Porto Alegre, Brazil
- Anna Martha Vaitses Fontanari
- Gender Identity Program, Universidade Federal do Rio Grande do Sul, Programa de Pós Graduação em Psiquiatria e Ciências do Comportamento, Porto Alegre, Brazil
- Angelo Brandelli Costa
- Pontifícia Universidade Católica do Rio Grande do Sul, Programa de Pós-Graduação em Psicologia e do Programa de Pós-Graduação em Ciências Sociais, Porto Alegre, Brazil
- Dhiordan Cardoso da Silva
- Gender Identity Program, Universidade Federal do Rio Grande do Sul, Programa de Pós Graduação em Psiquiatria e Ciências do Comportamento, Porto Alegre, Brazil
- Maiko Abel Schneider
- Psychiatry & Behavioural Neurosciences, Faculty of Health Sciences, McMaster University, Hamilton, ON, Canada
- Carla Aparecida Cielo
- Department of Psychiatry and Behavioural Neurosciences, McMaster University, Hamilton, ON, Canada
- Speech Therapy Department, Universidade Federal de Santa Maria, Santa Maria, Brazil
- Poli Mara Spritzer
- Identity Program, Universidade Federal do Rio Grande do Sul, Programa de Pós Graduação em Ciências Médicas: Endocrinologia, Porto Alegre, Brazil
- Maria Inês Rodrigues Lobato
- Gender Identity Program, Universidade Federal do Rio Grande do Sul, Programa de Pós Graduação em Psiquiatria e Ciências do Comportamento, Porto Alegre, Brazil

36
|
Bourqui M, Pernon M, Fougeron C, Laganaro M. Contribution of acoustic analysis to the detection of vocoid epenthesis in apraxia of speech and other motor speech disorders. Aphasiology 2021; 36:854-867. [PMID: 35720256 PMCID: PMC9197203 DOI: 10.1080/02687038.2021.1914815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2021] [Accepted: 04/01/2021] [Indexed: 06/15/2023]
Abstract
BACKGROUND Vocoid epenthesis within consonant clusters has been claimed to contribute to the diagnosis of apraxia of speech. In clinical practice, clinicians are often unsure whether clusters have been produced correctly, as the C-C transition may be only minimally disrupted. AIMS To demonstrate the value of acoustic analysis in clinical practice as a reliable complement to perceptual judgment. METHODS & PROCEDURES We compared the acoustic signature and the perceptual detection of vocoid epentheses in unvoiced consonant clusters within pseudo-words produced by 40 participants presenting different subtypes of motor speech disorders (including apraxia of speech (AoS) and dysarthria) and matched neurotypical controls. OUTCOMES & RESULTS The results indicate that vocoid epenthesis was acoustically visible in 3 out of 10 participants with AoS, and in one out of 30 participants with dysarthria. One-quarter of these vocoid epentheses were not detected via auditory perception by expert listeners (speech and language therapists), who also made false detections. CONCLUSIONS The current results indicate that vocoid epenthesis is not systematic, at least in mild AoS. Moreover, a substantial proportion is missed by ear, even by expert clinicians, meaning that visualisation of the acoustic signal can be of valuable help.
Affiliation(s)
- Marion Bourqui
- Faculty of Psychology and Educational Science, University of Geneva, Geneva, Switzerland
- Michaela Pernon
- Laboratoire de Phonétique et Phonologie, UMR, France
- Department of Clinical Neurosciences, Geneva University Hospital, Switzerland
- Marina Laganaro
- Faculty of Psychology and Educational Science, University of Geneva, Geneva, Switzerland

37
|
Xiao Y, Wang T, Deng W, Yang L, Zeng B, Lao X, Zhang S, Liu X, Ouyang D, Liao G, Liang Y. Data mining of an acoustic biomarker in tongue cancers and its clinical validation. Cancer Med 2021; 10:3822-3835. [PMID: 33938165 PMCID: PMC8178493 DOI: 10.1002/cam4.3872] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2020] [Revised: 01/30/2021] [Accepted: 03/14/2021] [Indexed: 11/08/2022] Open
Abstract
The promise of speech disorders as biomarkers in clinical examination has been identified in a broad spectrum of neurodegenerative diseases. However, to the best of our knowledge, a validated acoustic marker with established discriminative and evaluative properties has not yet been developed for oral tongue cancers. Here we cross-sectionally collected a screening dataset that included acoustic parameters extracted from 3 sustained vowels /ɑ/, /i/, /u/ and binary perceptual outcomes from 12 consonant-vowel syllables. We used a support vector machine with linear kernel function within this dataset to identify the formant centralization ratio (FCR) as a dominant predictor of different perceptual outcomes across gender and syllable. The Acoustic analysis, Perceptual evaluation and Quality of Life assessment (APeQoL) was used to validate the FCR in 33 patients with primary resectable oral tongue cancers. Measurements were taken before (pre-op) and four to six weeks after (post-op) surgery. The speech handicap index (SHI), a speech-specific questionnaire, was also administered at these time points. Pre-op correlation analysis within the APeQoL revealed overall consistency and a strong correlation between FCR and SHI scores. FCRs also increased significantly with increasing T classification pre-operatively, especially for women. Longitudinally, the main effects of T classification, the extent of resection, and their interaction effects with time (pre-op vs. post-op) on FCRs were all significant. For pre-operative FCR, after merging the two datasets, a cut-off value of 0.970 produced an AUC of 0.861 (95% confidence interval: 0.785-0.938) for T3-4 patients. In sum, this study determined that FCR is an acoustic marker with the potential to detect disease and related speech function in oral tongue cancers. These are preliminary findings that need to be replicated in longitudinal studies and/or larger cohorts.
Affiliation(s)
- Yudong Xiao
- Department of Oral and Maxillofacial Surgery, Guanghua School of Stomatology, Guangdong Provincial Key Laboratory of Stomatology, Sun Yat-sen University, Guangzhou, China
- Tao Wang
- Department of Oral and Maxillofacial Surgery, Guanghua School of Stomatology, Guangdong Provincial Key Laboratory of Stomatology, Sun Yat-sen University, Guangzhou, China
- Wei Deng
- Department of Oral and Maxillofacial Surgery, Guanghua School of Stomatology, Guangdong Provincial Key Laboratory of Stomatology, Sun Yat-sen University, Guangzhou, China
- Le Yang
- Department of Oral and Maxillofacial Surgery, Guanghua School of Stomatology, Guangdong Provincial Key Laboratory of Stomatology, Sun Yat-sen University, Guangzhou, China
- Bin Zeng
- Department of Oral and Maxillofacial Surgery, Guanghua School of Stomatology, Guangdong Provincial Key Laboratory of Stomatology, Sun Yat-sen University, Guangzhou, China
- Xiaomei Lao
- Department of Oral and Maxillofacial Surgery, Guanghua School of Stomatology, Guangdong Provincial Key Laboratory of Stomatology, Sun Yat-sen University, Guangzhou, China
- Sien Zhang
- Department of Oral and Maxillofacial Surgery, Guanghua School of Stomatology, Guangdong Provincial Key Laboratory of Stomatology, Sun Yat-sen University, Guangzhou, China
- Xiangqi Liu
- Department of Oral and Maxillofacial Surgery, Guanghua School of Stomatology, Guangdong Provincial Key Laboratory of Stomatology, Sun Yat-sen University, Guangzhou, China
- Daiqiao Ouyang
- Department of Oral and Maxillofacial Surgery, Guanghua School of Stomatology, Guangdong Provincial Key Laboratory of Stomatology, Sun Yat-sen University, Guangzhou, China
- Guiqing Liao
- Department of Oral and Maxillofacial Surgery, Guanghua School of Stomatology, Guangdong Provincial Key Laboratory of Stomatology, Sun Yat-sen University, Guangzhou, China
- Yujie Liang
- Department of Oral and Maxillofacial Surgery, Guanghua School of Stomatology, Guangdong Provincial Key Laboratory of Stomatology, Sun Yat-sen University, Guangzhou, China

38
|
Sonbay Yılmaz ND, Afyoncu C, Ensari N, Yıldız M, Gür ÖE. The Effect of the Mother's Participation in Therapy on Children with Vocal Fold Nodules. Ann Otol Rhinol Laryngol 2021; 130:1263-1267. [PMID: 33733874 DOI: 10.1177/00034894211002430] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/17/2022]
Abstract
OBJECTIVES Vocal fold nodules (VFN) are a bilateral epithelial thickening of the membranous vocal folds. In this study, children with VFN and their mothers took part in voice therapy. We then compared acoustic analyses and subjective evaluations to those in previous literature to determine whether voice therapy is more effective for children with VFN when their mothers also take part in therapy. METHODS Children aged eight to 12 years who were diagnosed with bilateral VFN between January 2018 and January 2020 were included in this study. Participating children were divided into two groups based on the wishes and cooperation of their families. Group 1 consisted of 16 patients; Group 2 included 17 patients. The children in Group 1 received voice therapy alone; children in Group 2 took part in therapy with their mothers. For all participants, the average fundamental frequency (F0), jitter percentages, shimmer percentages, maximum phonation time (MPT) and s/z ratios were measured. Pediatric voice handicap index (p-VHI) values were calculated as well. RESULTS The two groups' pre-treatment and post-treatment measures were compared. Except for p-VHI, no significant difference was observed between the two groups. However, post-treatment p-VHI was significantly lower in Group 2 than in Group 1. CONCLUSIONS Involving the families and even teachers of children with VFN in voice therapy can increase the effectiveness of therapy. The family's involvement increases the child's motivation in therapy. The mother's presence during therapy, supporting the child or even doing the work with the child, can be a very important source of motivation for the child, who may already be tired from school and other activities. Thus, the mother's involvement increases the child's compliance with and interest in therapy.
Affiliation(s)
- Cansu Afyoncu
- Department of Speech and Language Therapy, Antalya Training and Research Hospital, Antalya, Turkey
- Nuray Ensari
- Department of Otolaryngology, Antalya Training and Research Hospital, Antalya, Turkey
- Muhammet Yıldız
- Department of Otolaryngology, Antalya Training and Research Hospital, Antalya, Turkey
- Özer Erdem Gür
- Department of Otolaryngology, Antalya Training and Research Hospital, Antalya, Turkey

39
|
Konstantopoulos K, Vogazianos P, Christou Y, Pisinou M. Sequential motion rate and oral reading rate: normative data for Greek and clinical implications. Logoped Phoniatr Vocol 2021; 47:177-182. [PMID: 33730987 DOI: 10.1080/14015439.2021.1901309] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 10/21/2022]
Abstract
INTRODUCTION The aim of the present study was to provide normative data in Greek, regarding sequential motion rate (SMR) and oral reading rate (ORR), and to show the sensitivity of both tasks to predict Parkinson's disease (PD). METHODS The speech rate of sixty-five healthy control participants was recorded and analyzed using speech acoustics. The speech rate of a subsample of 20 healthy control participants was compared to the speech rate of 20 pair-matched dysarthric parkinsonian participants. All participants produced the syllables /pataka/ (SMR task) as quickly as possible and read aloud a standard Greek passage (ORR task). RESULTS In normative data, the mean score for the SMR variable was 4.91 syllables per second (SD = 0.73) and for the ORR variable was 4.42 syllables per second (SD = 0.87). The Mann-Whitney test showed significant differences between the two groups of participants in the SMR (U = 64.000, Z = -4.60, p < .001) and ORR (U = 77.000, Z = -4.36, p < .001). Multiple binary logistic regression analysis examined the combined effect of ORR and SMR on the occurrence of the disease. The sensitivity of both tasks to predict PD was found to be 0.88 and the specificity 0.90. The optimal screening cutoff point was found to be 4.66 syllables/second for the SMR task and 2.79 syllables/second for the ORR task. CONCLUSIONS This study provided Greek normative data in SMR and ORR tasks. Both tasks showed high sensitivity and specificity to predict PD in the Greek sample of participants.
Affiliation(s)
- K Konstantopoulos
- Department of Speech Therapy, University of Peloponnese, Kalamata, Greece; Cyprus Institute for Neurology and Genetics, Nicosia, Cyprus
- P Vogazianos
- School of Humanities, Social & Education Sciences, European University Cyprus, Nicosia, Cyprus
- Y Christou
- Cyprus Institute for Neurology and Genetics, Nicosia, Cyprus
- M Pisinou
- Program of Speech Therapy, European University Cyprus, Nicosia, Cyprus

40
|
Mainsah BO, Patel PA, Chen XJ, Olsen C, Collins LM, Karra R. Novel Acoustic Biomarker of Quality of Life in Left Ventricular Assist Device Recipients. J Am Heart Assoc 2021; 10:e018588. [PMID: 33660516 PMCID: PMC8174227 DOI: 10.1161/jaha.120.018588] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/21/2020] [Accepted: 01/08/2021] [Indexed: 12/18/2022]
Abstract
Background Although technological advances to pump design have improved survival, left ventricular assist device (LVAD) recipients experience variable improvements in quality of life. Methods for optimizing LVAD support to improve quality of life are needed. We investigated whether acoustic signatures obtained from digital stethoscopes can predict patient-centered outcomes in LVAD recipients. Methods and Results We followed precordial sounds over 6 months in 24 LVAD recipients (8 HeartWare HVAD™, 16 HeartMate 3 [HM3]). Subjects recorded their precordial sounds with a digital stethoscope and completed a Kansas City Cardiomyopathy Questionnaire weekly. We developed a novel algorithm to filter LVAD sounds from recordings. Unsupervised clustering of LVAD-mitigated sounds revealed distinct groups of acoustic features. Of 16 HM3 recipients, 6 (38%) had a unique acoustic feature that we have termed the pulse synchronized sound based on its temporal association with the artificial pulse of the HM3. HM3 recipients with the pulse synchronized sound had significantly better Kansas City Cardiomyopathy Questionnaire scores at baseline (median, 89.1 [interquartile range, 86.2-90.4] versus 66.1 [interquartile range, 31.1-73.7]; P=0.03) and over the 6-month study period (marginal mean, 77.6 [95% CI, 66.3-88.9] versus 59.9 [95% CI, 47.9-70.0]; P<0.001). Mechanistically, the pulse synchronized sound shares acoustic features with patient-derived intrinsic sounds. Finally, we developed a machine learning algorithm to automatically detect the pulse synchronized sound within precordial sounds (area under the curve, 0.95, leave-one-subject-out cross-validation). Conclusions We have identified a novel acoustic biomarker associated with better quality of life in HM3 LVAD recipients, which may provide a method for assaying optimized LVAD support.
Affiliation(s)
- Boyla O. Mainsah
- Department of Electrical and Computer Engineering, Duke University, Durham, NC
- Xinlin J. Chen
- Department of Electrical and Computer Engineering, Duke University, Durham, NC
- Cameron Olsen
- Division of Cardiology, Department of Medicine, Duke University Medical Center, Durham, NC
- Leslie M. Collins
- Department of Electrical and Computer Engineering, Duke University, Durham, NC
- Ravi Karra
- Division of Cardiology, Department of Medicine, Duke University Medical Center, Durham, NC

41
|
Volkmann N, Kulig B, Hoppe S, Stracke J, Hensel O, Kemper N. On-farm detection of claw lesions in dairy cows based on acoustic analyses and machine learning. J Dairy Sci 2021; 104:5921-5931. [PMID: 33663849 DOI: 10.3168/jds.2020-19206] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/02/2020] [Accepted: 12/23/2020] [Indexed: 11/19/2022]
Abstract
Claw lesions are a serious problem on dairy farms, affecting both the health and welfare of the cow. Automated detection of lameness with a practical, on-farm application would support the early detection and treatment of lame cows, potentially reducing the number and severity of claw lesions. Therefore, in this study, a method was proposed for the detection of claw lesions based on the acoustic analysis of a cow's gait. A panel was constructed to measure the impact sound of animals walking over it. The recorded impact sound was edited, and 640 sound files from 64 cows were analyzed. The classification of animal-lameness status was performed using a machine-learning process with a random forest algorithm. The gold standard was a 2-point scale of hoof-trimming results (healthy vs. affected), and 38 properties of the recorded sound files were used as influencing factors. A prediction model for classifying the cow lameness was built using a random forest algorithm. This was validated by comparing the reference output from hoof-trimming with the model output concerning the impact sound. Altering the likelihood settings and changing the cutoff value to predict lame animals improved the prediction model. At a cutoff at 0.4, a decreased false-negative rate was generated, and the false-positive rate only increased slightly. This model obtained a sensitivity of 0.81 and a specificity of 0.97. With this procedure, Cohen's Kappa value of 0.80 showed good agreement between model classification and diagnoses from hoof-trimming. In summary, the prediction model enabled the detection of cows with claw lesions. This study shows that lameness can be detected by machine learning from the impact sound of hoofs in dairy cows.
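The sensitivity, specificity and Cohen's kappa reported in this abstract all derive from a 2x2 table of model output against the hoof-trimming reference. The following is a minimal sketch of that arithmetic; the function name is hypothetical, and the counts are invented for illustration and are not the study's data.

```python
def agreement_metrics(tp, fp, fn, tn):
    """Return (sensitivity, specificity, Cohen's kappa) from a 2x2 table.

    tp/fn: affected animals classified as affected/healthy.
    tn/fp: healthy animals classified as healthy/affected.
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Observed agreement between model and reference.
    p_observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (p_observed - p_expected) / (1.0 - p_expected)
    return sensitivity, specificity, kappa


# Hypothetical counts, for illustration only.
sens, spec, kappa = agreement_metrics(tp=40, fp=5, fn=10, tn=45)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, kappa={kappa:.2f}")
```

Lowering the probability cutoff used to call an animal lame moves cases from the false-negative cell to the true-positive cell at the cost of some extra false positives, which is the trade-off the abstract describes for the cutoff of 0.4.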
Affiliation(s)
- N Volkmann
- Institute for Animal Hygiene, Animal Welfare and Animal Behavior, University of Veterinary Medicine Hannover, Foundation, Bischofsholer Damm 15, D-30173 Hannover, Germany
- B Kulig
- Section of Agricultural and Biosystems Engineering, University of Kassel, Nordbahnhofstraße 1a, D-37213 Witzenhausen, Germany
- S Hoppe
- Agricultural Research and Training Center Haus Riswick, Agricultural Chamber of North Rhine-Westphalia, Elsenpaß 5, D-47533 Kleve, Germany
- J Stracke
- Institute for Animal Hygiene, Animal Welfare and Animal Behavior, University of Veterinary Medicine Hannover, Foundation, Bischofsholer Damm 15, D-30173 Hannover, Germany
- O Hensel
- Section of Agricultural and Biosystems Engineering, University of Kassel, Nordbahnhofstraße 1a, D-37213 Witzenhausen, Germany
- N Kemper
- Institute for Animal Hygiene, Animal Welfare and Animal Behavior, University of Veterinary Medicine Hannover, Foundation, Bischofsholer Damm 15, D-30173 Hannover, Germany

42
|
Cavallieri F, Budriesi C, Gessani A, Contardi S, Fioravanti V, Menozzi E, Pinto S, Moro E, Valzania F, Antonelli F. Dopaminergic Treatment Effects on Dysarthric Speech: Acoustic Analysis in a Cohort of Patients With Advanced Parkinson's Disease. Front Neurol 2021; 11:616062. [PMID: 33613419 PMCID: PMC7892955 DOI: 10.3389/fneur.2020.616062] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 10/10/2020] [Accepted: 12/29/2020] [Indexed: 01/10/2023] Open
Abstract
Importance: The effects of dopaminergic treatment on speech in patients with Parkinson's disease (PD) are often mixed and unclear. The aim of this study was to better elucidate those discrepancies. Methods: Full retrospective data from advanced PD patients before and after an acute levodopa challenge were collected. Acoustic analysis of spontaneous monologue and sustained phonation including several quantitative parameters [i.e., maximum phonation time (MPT); shimmer local dB] as well as the Unified Parkinson's Disease Rating Scale (UPDRS) (total scores, subscores, and items) and the Clinical Dyskinesia Rating Scale (CDRS) were performed in both the defined-OFF and -ON conditions. The primary outcome was the changes of speech parameters after levodopa intake. Secondary outcomes included the analysis of possible correlations of motor features and levodopa-induced dyskinesia (LID) with acoustic speech parameters. Statistical analysis included paired t-test between the ON and OFF data (calculated separately for male and female subgroups) and Pearson correlation between speech and motor data. Results: In 50 PD patients (male: 32; female: 18), levodopa significantly increased the MPT of sustained phonation in female patients (p < 0.01). In the OFF-state, the UPDRS part-III speech item negatively correlated with MPT (p = 0.02), whereas in the ON-state, it correlated positively with the shimmer local dB (p = 0.01), an expression of poorer voice quality. The total CDRS score and axial subscores strongly correlated with the ON-state shimmer local dB (p = 0.01 and p < 0.01, respectively). Conclusions: Our findings emphasize that levodopa has a poor effect on speech acoustic parameters. The intensity and location of LID negatively influenced speech quality.
Affiliation(s)
- Francesco Cavallieri
- Neurology Unit, Neuromotor and Rehabilitation Department, Azienda USL - IRCCS di Reggio Emilia, Reggio Emilia, Italy; Clinical and Experimental Medicine PhD Program, University of Modena and Reggio Emilia, Modena, Italy
- Carla Budriesi
- Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy; Azienda Ospedaliero Universitaria di Modena, Modena, Italy
- Annalisa Gessani
- Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy; Azienda Ospedaliero Universitaria di Modena, Modena, Italy
- Sara Contardi
- Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy; Azienda Ospedaliero Universitaria di Modena, Modena, Italy
- Valentina Fioravanti
- Neurology Unit, Neuromotor and Rehabilitation Department, Azienda USL - IRCCS di Reggio Emilia, Reggio Emilia, Italy
- Elisa Menozzi
- Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy; Azienda Ospedaliero Universitaria di Modena, Modena, Italy; Department of Clinical and Movement Neurosciences, UCL Queen Square Institute of Neurology, London, United Kingdom
- Serge Pinto
- Aix Marseille Univ, CNRS, LPL, Aix-en-Provence, France
- Elena Moro
- Division of Neurology, Centre Hospitalier Universitaire (CHU), Grenoble Alpes University, Grenoble Institute of Neurosciences, Grenoble, France
- Franco Valzania
- Neurology Unit, Neuromotor and Rehabilitation Department, Azienda USL - IRCCS di Reggio Emilia, Reggio Emilia, Italy
- Francesca Antonelli
- Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy; Azienda Ospedaliero Universitaria di Modena, Modena, Italy

43
|
Cheoy LP, Chong FY, Mazlan R, Lim HW. Development of the Mandarin Nonsense Word Identification Test. Int J Audiol 2021; 60:578-587. [PMID: 33426971 DOI: 10.1080/14992027.2020.1864485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
Abstract
OBJECTIVE: This study aimed to develop a digitised Mandarin Nonsense Word Speech Perception Test for use in Malaysia, a multilingual country in Southeast Asia. DESIGN: In Phase I, 400 vowel-consonant-vowel (VCV) nonsense word samples containing 20 Mandarin consonants in /a/, /i/, or /u/ contexts were recorded from two speakers of different genders. Acoustic analyses, sound quality ratings, and item validations were used to guide selection of items to form two gender-specific test lists. In Phase II, performance-intensity functions and test-retest reliability for the lists were established. STUDY SAMPLE: Native Mandarin-speaking adults with normal hearing participated in Phase I (n = 10) and Phase II (n = 69). RESULTS: Eighty-four of the 400 VCV words were selected to form two gender-specific test lists. A two-way repeated-measures ANOVA revealed a significant interaction effect between speaker gender and presentation level [F(4.88, 283.20) = 22.79, p < 0.001, ηp² = 0.28]. Intraclass correlation scores of 0.75 and 0.87 were obtained for the female-speaker and male-speaker lists, respectively. CONCLUSIONS: Preliminary normative data for the Mandarin nonsense word test have been developed. It is recommended to use separate gender-specific norms when conducting the test. The test has good validity and reliability for testing Mandarin-speaking adults in Malaysia.
Affiliation(s)
- Lai Pheng Cheoy
- Audiology Programme, Centre for Rehabilitation and Special Needs (iCaRehab), Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia
- Foong Yen Chong
- Audiology Programme, Centre for Rehabilitation and Special Needs (iCaRehab), Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia
- Rafidah Mazlan
- Audiology Programme, Centre for Rehabilitation and Special Needs (iCaRehab), Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia; Centre for Ear, Hearing and Speech, Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia
- Hui Woan Lim
- Speech Sciences Programme, Centre for Rehabilitation and Special Needs (iCaRehab), Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia
44
Mao Y, Chen H, Xie S, Xu L. Acoustic Assessment of Tone Production of Prelingually-Deafened Mandarin-Speaking Children With Cochlear Implants. Front Neurosci 2020; 14:592954. [PMID: 33250708 PMCID: PMC7673231 DOI: 10.3389/fnins.2020.592954] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/08/2020] [Accepted: 10/12/2020] [Indexed: 11/23/2022] Open
Abstract
Objective: The purpose of the present study was to investigate Mandarin tone production performance of prelingually deafened children with cochlear implants (CIs) using modified acoustic analyses and to evaluate the relationship between demographic factors of those CI children and their tone production ability. Methods: Two hundred seventy-eight prelingually deafened children with CIs and 173 age-matched normal-hearing (NH) children participated in the study. Thirty-six monosyllabic Mandarin Chinese words were recorded from each subject. The fundamental frequencies (F0) were extracted from the tone tokens. Two acoustic measures (i.e., differentiability and hit rate) were computed based on the F0 onset and offset values (i.e., the tone ellipses of the two-dimensional [2D] method) or the F0 onset, midpoint, and offset values (i.e., the tone ellipsoids of the 3D method). The correlations between the acoustic measures as well as between the methods were computed. The relationships between demographic factors and acoustic measures were also explored. Results: The children with CIs showed significantly poorer performance in tone differentiability and hit rate than the NH children. For both CI and NH groups, performance on the two acoustic measures was highly correlated (r values: 0.895–0.961). The performance between the two methods (i.e., 2D and 3D methods) was also highly correlated (r values: 0.774–0.914). Age at implantation and duration of CI use showed a weak correlation with the scores of acoustic measures under both methods. These two factors jointly accounted for 15.4–18.9% of the total variance of tone production performance. Conclusion: There were significant deficits in tone production ability in most prelingually deafened children with CIs, even after prolonged use of the devices. 
The strong correlation between the two methods suggested that the simpler, 2D method seemed to be efficient in acoustic assessment for lexical tones in hearing-impaired children. Age at implantation and especially the duration of CI use were significant, although weak, predictors for tone development in pediatric CI users. Although a large part of tone production ability could not be attributed to these two factors, the results still encourage early implantation and continual CI use for better lexical tone development in Mandarin-speaking pediatric CI users.
Affiliation(s)
- Yitao Mao
- Department of Radiology, Xiangya Hospital, Central South University, Changsha, China
- Hongsheng Chen
- Department of Otolaryngology-Head and Neck Surgery, Xiangya Hospital, Central South University, Changsha, China
- Shumin Xie
- Department of Otolaryngology-Head and Neck Surgery, Xiangya Hospital, Central South University, Changsha, China
- Li Xu
- Communication Sciences and Disorders, Ohio University, Athens, OH, United States
45
Noffs G, Boonstra FMC, Perera T, Butzkueven H, Kolbe SC, Maldonado F, Cofre Lizama LE, Galea MP, Stankovich J, Evans A, van der Walt A, Vogel AP. Speech metrics, general disability, brain imaging and quality of life in multiple sclerosis. Eur J Neurol 2020; 28:259-268. [PMID: 32916031 DOI: 10.1111/ene.14523] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/05/2020] [Accepted: 08/30/2020] [Indexed: 01/09/2023]
Abstract
BACKGROUND AND PURPOSE: Objective measurement of speech has shown promising results for monitoring disease state in multiple sclerosis. In this study, we characterize the relationship between disease severity and speech metrics through perceptual (listener-based) and objective acoustic analysis. We further look at deviations of acoustic metrics in people with no perceivable dysarthria. METHODS: Correlations and regression were calculated between speech measurements and disability scores, brain volume, lesion load and quality of life. Speech measurements were further compared between three subgroups of increasing overall neurological disability: mild (as rated by the Expanded Disability Status Scale ≤2.5), moderate (≥3 and ≤5.5) and severe (≥6). RESULTS: Clinical speech impairment occurred mainly in people with severe disability. An experimental acoustic composite score differentiated mild from moderate (P < 0.001) and moderate from severe subgroups (P = 0.003), and correlated with overall neurological disability (r = 0.6, P < 0.001), quality of life (r = 0.5, P < 0.001), white matter volume (r = 0.3, P = 0.007) and lesion load (r = 0.3, P = 0.008). Acoustic metrics also correlated with disability scores in people with no perceivable dysarthria. CONCLUSIONS: Acoustic analysis offers valuable insight into the development of speech impairment in multiple sclerosis. These results highlight the potential of automated analysis of speech to assist in monitoring disease progression and treatment response.
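The three disability subgroups in the Methods are defined by simple cut-offs on the Expanded Disability Status Scale (EDSS). As a minimal sketch, assuming only the thresholds quoted in the abstract and the standard 0 to 10 EDSS range, the grouping could be expressed as follows. The function name `edss_subgroup` is hypothetical and is not taken from the study.

```python
def edss_subgroup(edss: float) -> str:
    """Map an EDSS score to the subgroup labels quoted in the abstract.

    Assumed thresholds (from the abstract): mild <= 2.5,
    moderate 3 to 5.5, severe >= 6. EDSS is scored in half-point
    steps, so no valid score falls between the bands.
    """
    if not 0.0 <= edss <= 10.0:
        raise ValueError("EDSS scores range from 0 to 10")
    if edss <= 2.5:
        return "mild"
    if edss <= 5.5:
        return "moderate"
    return "severe"


if __name__ == "__main__":
    # Illustrative scores only, not study data.
    for score in (1.5, 3.0, 6.5):
        print(score, edss_subgroup(score))
```

Keeping the cut-offs in one function makes it easy to check that every score is assigned to exactly one subgroup before running group comparisons.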
Affiliation(s)
- G Noffs
- Centre for Neuroscience of Speech, University of Melbourne, Melbourne, VIC, Australia; Department of Neurology, Royal Melbourne Hospital, Melbourne, VIC, Australia
- F M C Boonstra
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC, Australia
- T Perera
- The Bionics Institute, Melbourne, VIC, Australia; Department of Medical Bionics, University of Melbourne, Melbourne, VIC, Australia
- H Butzkueven
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC, Australia
- S C Kolbe
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC, Australia
- F Maldonado
- Centre for Neuroscience of Speech, University of Melbourne, Melbourne, VIC, Australia
- L Eduardo Cofre Lizama
- Department of Medicine, University of Melbourne, Melbourne, VIC, Australia; Australia Rehabilitation Research Centre, Royal Melbourne Hospital, Melbourne, VIC, Australia; School of Allied Health, Human Services and Sports, La Trobe University, Melbourne, VIC, Australia
- M P Galea
- Department of Medicine, University of Melbourne, Melbourne, VIC, Australia; Australia Rehabilitation Research Centre, Royal Melbourne Hospital, Melbourne, VIC, Australia
- J Stankovich
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC, Australia
- A Evans
- Department of Neurology, Royal Melbourne Hospital, Melbourne, VIC, Australia; The Bionics Institute, Melbourne, VIC, Australia
- A van der Walt
- Department of Neurology, Royal Melbourne Hospital, Melbourne, VIC, Australia; Department of Neuroscience, Central Clinical School, Monash University, Melbourne, VIC, Australia; The Bionics Institute, Melbourne, VIC, Australia
- A P Vogel
- Centre for Neuroscience of Speech, University of Melbourne, Melbourne, VIC, Australia; The Bionics Institute, Melbourne, VIC, Australia; Department of Neurodegeneration, Hertie Institute for Clinical Brain Research, University of Tübingen, Tübingen, Germany; Redenlab, Melbourne, VIC, Australia
46
Alías F, Socoró JC, Alsina-Pagès RM. WASN-Based Day-Night Characterization of Urban Anomalous Noise Events in Narrow and Wide Streets. Sensors (Basel) 2020; 20:s20174760. [PMID: 32842527 PMCID: PMC7506928 DOI: 10.3390/s20174760] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/28/2020] [Revised: 08/15/2020] [Accepted: 08/20/2020] [Indexed: 11/21/2022]
Abstract
In addition to air pollution, environmental noise has become one of the major hazards for citizens, with Road Traffic Noise (RTN) as its main source in urban areas. Recently, low-cost Wireless Acoustic Sensor Networks (WASNs) have become an alternative to traditional strategic noise mapping in cities. In order to monitor RTN alone, WASN-based approaches should automate the off-line removal of events unrelated to regular road traffic (e.g., sirens, airplanes, trams). Within the LIFE DYNAMAP project, 15 urban Anomalous Noise Events (ANEs) were described through an expert-based recording campaign. However, that work focused only on the overall analysis of the events gathered during non-sequential diurnal periods. As a step forward in characterizing the temporal and local particularities of urban ANEs in real acoustic environments, this work analyses their distribution between day (06:00–22:00) and night (22:00–06:00) in narrow (1 lane) and wide (more than 1 lane) streets. The study is developed on a manually labelled 151-h acoustic database obtained from the 24-node WASN deployed across DYNAMAP’s Milan pilot area during a weekday and a weekend day. Results confirm the unbalanced nature of the problem (RTN represents 83.5% of the data), while identifying 26 ANE subcategories mainly derived from pedestrians, animals, transport and industry. Their presence depends more significantly on the time period than on the street type, as most events were observed in the daytime during the weekday, despite being especially present in narrow streets. Moreover, although ANEs show quite similar median durations regardless of time and location in general terms, they usually present higher median signal-to-noise ratios at night, mainly on the weekend, which becomes especially relevant for the WASN-based computation of equivalent RTN levels.
47
Verdurand M, Rossato S, Zmarich C. Coarticulatory Aspects of the Fluent Speech of French and Italian People Who Stutter Under Altered Auditory Feedback. Front Psychol 2020; 11:1745. [PMID: 32793069 PMCID: PMC7390966 DOI: 10.3389/fpsyg.2020.01745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2019] [Accepted: 06/24/2020] [Indexed: 12/03/2022] Open
Abstract
A number of studies have shown that phonetic peculiarities, especially at the coarticulation level, exist in the disfluent as well as in the perceptually fluent speech of people who stutter (PWS). However, results from fluent speech are very disparate and not easily interpretable. Are the coarticulatory features observed in fluent speech of PWS a manifestation of the disorder, or rather a compensation for the disorder itself? The purpose of the present study is to investigate the coarticulatory behavior in the fluent speech of PWS in an attempt to answer the question of its symptomatic or adaptive nature. To this end, we studied the speech of 21 adult PWS (10 French and 11 Italian) compared to that of 20 fluent adults (10 French and 10 Italian). The participants had to repeat simple CV syllables in short carrier sentences, where C = /b, d, g/ and V = /a, i, u/. Crucially, this repetition task was performed in order to compare fluent speech coarticulation of PWS to that of people who do not stutter (PWNS), and to compare the coarticulation of PWS under a condition with normal auditory feedback (NAF) and under a fluency-enhancing condition with altered auditory feedback (AAF). This is the first study, to our knowledge, to investigate coarticulation behavior under AAF. The degree of coarticulation was measured by means of Locus Equations (LE). The degree of coarticulation observed in fluent PWS speech is lower than that of the PWNS, and, more importantly, in the AAF condition, PWS coarticulation appears even weaker than in the NAF condition. The results allow us to interpret the lower degree of coarticulation found in fluent speech of PWS under the NAF condition as a compensation for the disorder, based on the fact that PWS’s coarticulation weakens in fluency-enhancing conditions, moving further away from the degree of coarticulation observed in PWNS. 
Since a lower degree of coarticulation is associated with a greater separation between the places of articulation of the consonant and the vowel, these results are compatible with the hypothesis that larger articulatory movements could be responsible for the stabilization of the PWS speech motor system, increasing the kinesthetic feedback from the effector system. This interpretation shares with a number of relatively recent proposals the idea that stuttering derives from an impaired feedforward (open-loop) control system, which makes PWS rely more heavily on a feedback-based (closed-loop) motor control strategy.
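The abstract names Locus Equations as its coarticulation measure without spelling out the computation. In the general phonetics literature, a locus equation is a linear regression, for one consonant across several vowel contexts, of the second-formant (F2) frequency at vowel onset on F2 at the vowel midpoint; a slope closer to 1 is read as stronger coarticulation and a slope closer to 0 as weaker coarticulation. The sketch below illustrates that general definition only. It is not the authors' pipeline, the function name `locus_equation` is hypothetical, and the formant values are made up for illustration.

```python
def locus_equation(f2_mid, f2_onset):
    """Least-squares fit of F2 onset = slope * F2 midpoint + intercept.

    f2_mid and f2_onset are equal-length sequences of F2 values in Hz,
    one pair per CV token of the same consonant.
    Returns (slope, intercept).
    """
    n = len(f2_mid)
    if n != len(f2_onset) or n < 2:
        raise ValueError("need at least two paired F2 measurements")
    mean_x = sum(f2_mid) / n
    mean_y = sum(f2_onset) / n
    sxx = sum((x - mean_x) ** 2 for x in f2_mid)
    if sxx == 0:
        raise ValueError("F2 midpoint values must vary across vowel contexts")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(f2_mid, f2_onset))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Made-up F2 values (Hz) for one consonant before /a/, /i/, /u/.
    f2_midpoints = [1300.0, 2300.0, 900.0]
    f2_onsets = [1600.0, 2100.0, 1450.0]
    slope, intercept = locus_equation(f2_midpoints, f2_onsets)
    print(f"slope={slope:.2f}, intercept={intercept:.0f} Hz")
```

Under this reading, the reported finding corresponds to smaller fitted slopes for PWS than for PWNS, and smaller still under AAF.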
Affiliation(s)
- Marine Verdurand
- Speech Therapy Study, Cabestany, France; Université Grenoble Alpes, CNRS, Grenoble INP, LIG, Grenoble, France
- Solange Rossato
- Université Grenoble Alpes, CNRS, Grenoble INP, LIG, Grenoble, France
- Claudio Zmarich
- Institute of Cognitive Sciences and Technologies, National Research Council, Padua, Italy
48
Diamant N, Amir O. Examining the voice of Israeli transgender women: Acoustic measures, voice femininity and voice-related quality-of-life. Int J Transgend Health 2020; 22:281-293. [PMID: 34240071 PMCID: PMC8118229 DOI: 10.1080/26895269.2020.1798838] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
BACKGROUND: Transgender women may experience gender dysphoria associated with their voice and the way it is perceived. Previous studies have shown that specific acoustic measures are associated with the perception of voice-femininity and with voice-related quality-of-life, yet results are inconsistent. AIMS: This study aimed to examine the associations between specific voice measures of transgender women, voice-related quality-of-life, and the perception of voice-femininity by listeners and by the speakers themselves. METHODS: Thirty Hebrew-speaking transgender women were recorded. They also rated their voice-femininity and completed the Hebrew version of the TVQMtF questionnaire. Recordings were analyzed to extract mean fundamental frequency (F0), formant frequencies (F1, F2, F3), and vocal-range (calculated in Hz and in semitones). Recordings were also rated on a 7-point voice-gender scale by 20 naïve cisgender listeners. RESULTS: Significant correlations were found between both F0 and F1 and listeners' as well as speakers' evaluation of voice-femininity. TVQMtF scores were significantly correlated with F0 and with the lower and upper boundaries of the vocal-range. Voice-femininity ratings were strongly correlated with vocal-range when calculated in Hz, but not when defined in semitones. Listeners' evaluation and speakers' self-evaluation of voice-femininity were significantly correlated. However, TVQMtF scores were significantly correlated only with the speakers' voice-femininity ratings, not with those of the listeners. CONCLUSION: Higher F0 and F1, which are perceived as more feminine, jointly improved speakers' satisfaction with their voice. Speakers' self-evaluation of voice-femininity does not mirror listeners' judgment, as it is affected by additional factors related to self-satisfaction and personal experience. 
Combining listeners' and speakers' voice evaluation with acoustic analysis is valuable, as it provides a more holistic view of how transgender women feel about their voice and how it is perceived by listeners.
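The contrast between vocal-range in Hz and in semitones is easier to follow with the two definitions side by side. The abstract does not give the formulas used, so the sketch below assumes the usual conventions: range in Hz is the difference between the highest and lowest F0, and range in semitones is 12 times the base-2 logarithm of their ratio. The function names and the example frequencies are illustrative assumptions, not values from the study.

```python
import math


def vocal_range_hz(f_low: float, f_high: float) -> float:
    """Vocal range as an absolute frequency difference in Hz."""
    return f_high - f_low


def vocal_range_semitones(f_low: float, f_high: float) -> float:
    """Vocal range on a logarithmic (musical) scale: 12 semitones per octave."""
    if f_low <= 0 or f_high <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f_high / f_low)


if __name__ == "__main__":
    # Two hypothetical speakers, each spanning exactly one octave.
    for label, low, high in (("A", 100.0, 200.0), ("B", 200.0, 400.0)):
        print(label, vocal_range_hz(low, high), vocal_range_semitones(low, high))
```

Because the Hz range grows with the absolute pitch of the voice while the semitone range depends only on the ratio, a range expressed in Hz partly reflects how high the voice sits, which is one plausible reading of why only the Hz-based range tracked the femininity ratings.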
Affiliation(s)
- Noa Diamant
- Department of Communication Disorders, Sackler Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel
- Ofer Amir
- Department of Communication Disorders, Sackler Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel
49
Kosztyła-Hojna B, Duchnowska E, Zdrojkowski M, Łobaczuk-Sitnik A, Biszewska J. Application of High Speed Digital Imaging (HSDI) technique and voice acoustic analysis in the diagnosis of the clinical form of Presbyphonia in women. Otolaryngol Pol 2020; 74:24-30. [PMID: 34550094 DOI: 10.5604/01.3001.0014.1580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
Introduction: The aging process of voice begins after the age of 60 and has an individually variable course. Voice quality disorders at this age are called senile voice (Presbyphonia or Vox Senium). Voice pathology is particularly severe in women. The aim of the study was to diagnose the clinical form of Presbyphonia in elderly women using High Speed Digital Imaging (HSDI) and acoustic voice analysis. Material and methods: The study included 50 elderly women (average age 69) with dysphonia (Group I). The control group (Group II) included 30 women (average age 71) without voice quality disorders. Visualization assessment was conducted with High Speed Digital Imaging (HSDI) using a High Speed (HS) camera. Acoustic evaluation of voice included analysis of the isolated vowel "a" and of continuous linguistic text with Diagnoscope Specialista software. Maximum Phonation Time (MPT) was determined. Results: In Group I, 78% of women showed asymmetry of vocal fold vibrations, increased vibration amplitude, Mucosal Wave (MW) limitation and Type D glottal insufficiency (GTs). Acoustic voice analysis showed a decrease in F0 and an increase in Jitter, Shimmer and NHR. In 22% of women, in addition to vibration asymmetry, reduced vibration amplitude and MW limitation, Type E glottal insufficiency (GTs) was found. Acoustic voice analysis revealed a slight decrease in F0 and the presence of numerous non-harmonic components in the glottis region. Conclusions: Vocal fold visualization with HSDI showed edema and, less often, atrophy in elderly women. Both forms of dysphonia caused abnormal values of F0, Jitter, Shimmer and NHR in the acoustic voice evaluation and a significant reduction of MPT.
Affiliation(s)
- Bożena Kosztyła-Hojna
- Department of Clinical Phonoaudiology and Speech Therapy, Medical University of Bialystok, Poland
- Emilia Duchnowska
- Department of Clinical Phonoaudiology and Speech Therapy, Medical University of Bialystok, Poland
- Maciej Zdrojkowski
- Department of Clinical Phonoaudiology and Speech Therapy, Medical University of Bialystok, Poland
- Anna Łobaczuk-Sitnik
- Department of Clinical Phonoaudiology and Speech Therapy, Medical University of Bialystok, Poland
- Jolanta Biszewska
- Department of Clinical Phonoaudiology and Speech Therapy, Medical University of Bialystok, Poland
50
Arciuli J, Colombo L, Surian L. Lexical stress contrastivity in Italian children with autism spectrum disorders: an exploratory acoustic study. J Child Lang 2020; 47:870-880. [PMID: 31826787 DOI: 10.1017/s0305000919000795] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 06/10/2023]
Abstract
We investigated production of lexical stress in children with and without autism spectrum disorders (ASD), all monolingual Italian speakers. The mean age of the 16 autistic children was 5.73 years and the mean age of the 16 typically developing children was 4.65 years. Picture-naming targets were five trisyllabic words that began with a weak-strong pattern of lexical stress across the initial two syllables (WS: matita) and five trisyllabic words beginning with a strong-weak pattern (SW: gomito). Acoustic measures of the duration, fundamental frequency, and intensity of the first two vowels for correct word productions were used to calculate a normalised Pairwise Variability Index (PVI) for WS and SW words. Results of acoustic analyses indicated no statistically significant group differences in PVIs. Results should be interpreted in line with the exploratory nature of this study. We hope this study will encourage additional cross-linguistic studies of prosody in children's speech production.
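The abstract does not state the formula behind its normalised Pairwise Variability Index. As a hedged illustration, one common formulation for a pair of adjacent vowels divides the difference between the two measurements by their mean and multiplies by 100, so the result does not depend on overall speaking rate or loudness. The sketch below uses a signed version in which positive values mean the first vowel is longer, higher or louder. Both the sign convention and the function name `normalised_pvi` are assumptions, and the durations are invented.

```python
def normalised_pvi(first: float, second: float) -> float:
    """Signed normalised PVI for two adjacent vowels.

    Assumed formula: 100 * (first - second) / mean(first, second).
    Works for any positive acoustic measure (duration, F0, intensity).
    Positive values indicate a strong-weak pattern, negative values
    a weak-strong pattern, under the sign convention assumed here.
    """
    if first <= 0 or second <= 0:
        raise ValueError("measurements must be positive")
    return 100.0 * (first - second) / ((first + second) / 2.0)


if __name__ == "__main__":
    # Invented vowel durations in milliseconds, not study data.
    print("SW-like word:", normalised_pvi(180.0, 90.0))
    print("WS-like word:", normalised_pvi(80.0, 160.0))
```

With this convention, a child who marks stress contrastively would be expected to show PVI values of opposite sign for the SW and WS word sets, and values near zero would indicate little contrast between the two vowels.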