1
Shellikeri S, Cho S, Ash S, Gonzalez-Recober C, McMillan CT, Elman L, Quinn C, Amado DA, Baer M, Irwin DJ, Massimo L, Olm C, Liberman M, Grossman M, Nevler N. Digital markers of motor speech impairments in spontaneous speech of patients with ALS-FTD spectrum disorders. Amyotroph Lateral Scler Frontotemporal Degener 2024; 25:317-325. [PMID: 38050971 PMCID: PMC11023759 DOI: 10.1080/21678421.2023.2288106] [Received: 10/06/2023] [Accepted: 11/20/2023] [Indexed: 12/07/2023]
Abstract
OBJECTIVE To evaluate automated digital speech measures, derived from spontaneous speech (picture descriptions), in assessing bulbar motor impairments in patients with ALS-FTD spectrum disorders (ALS-FTSD). METHODS Automated vowel algorithms were employed to extract two vowel acoustic measures: vowel space area (VSA) and mean second formant slope (F2 slope). Vowel measures were compared between ALS with and without clinical bulbar symptoms (ALS + bulbar, n = 49, ALSFRS-r bulbar subscore: mean = 9.8 (SD = 1.7) vs. ALS-nonbulbar, n = 23), behavioral variant frontotemporal dementia (bvFTD, n = 25) without a motor syndrome, and healthy controls (HC, n = 32). Correlations with bulbar motor clinical scales, perceived listener effort, and MRI cortical thickness of the orobuccal primary motor cortex (oral PMC) were examined. We compared vowel measures to speaking rate, a conventional metric for assessing bulbar dysfunction. RESULTS ALS + bulbar had significantly lower VSA and F2 slope than ALS-nonbulbar (|d|=0.94 and |d|=1.04, respectively), bvFTD (|d|=0.89 and |d|=1.47), and HC (|d|=0.73 and |d|=0.99). These reductions correlated with worse bulbar clinical scores (VSA: R = 0.33, p = 0.043; F2 slope: R = 0.38, p = 0.011), greater listener effort (VSA: R=-0.43, p = 0.041; F2 slope: p > 0.05), and cortical thinning in oral PMC (F2 slope: β = 0.0026, p = 0.017). Vowel measures demonstrated greater sensitivity and specificity for bulbar impairment than speaking rate, while showing independence from cognitive and respiratory impairments. CONCLUSION Automatic vowel measures are easily derived from a brief spontaneous speech sample, are sensitive to the mild-to-moderate stage of bulbar disease in ALS-FTSD, and may be more sensitive to bulbar impairment than traditional assessments such as speaking rate.
Affiliation(s)
- Sanjana Shellikeri
  - Penn Frontotemporal Degeneration Center and Department of Neurology, University of Pennsylvania, Philadelphia, PA
- Sunghye Cho
  - Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA
- Sharon Ash
  - Penn Frontotemporal Degeneration Center and Department of Neurology, University of Pennsylvania, Philadelphia, PA
- Carmen Gonzalez-Recober
  - Penn Frontotemporal Degeneration Center and Department of Neurology, University of Pennsylvania, Philadelphia, PA
- Corey T. McMillan
  - Penn Frontotemporal Degeneration Center and Department of Neurology, University of Pennsylvania, Philadelphia, PA
- Colin Quinn
  - Penn ALS Clinic, University of Pennsylvania, PA
- David J Irwin
  - Penn Frontotemporal Degeneration Center and Department of Neurology, University of Pennsylvania, Philadelphia, PA
- Lauren Massimo
  - Penn Frontotemporal Degeneration Center and Department of Neurology, University of Pennsylvania, Philadelphia, PA
- Chris Olm
  - Penn Frontotemporal Degeneration Center and Department of Neurology, University of Pennsylvania, Philadelphia, PA
- Mark Liberman
  - Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA
  - Department of Linguistics, University of Pennsylvania, Philadelphia, PA
- Murray Grossman
  - Penn Frontotemporal Degeneration Center and Department of Neurology, University of Pennsylvania, Philadelphia, PA
- Naomi Nevler
  - Penn Frontotemporal Degeneration Center and Department of Neurology, University of Pennsylvania, Philadelphia, PA

2
Malekroodi HS, Madusanka N, Lee BI, Yi M. Leveraging Deep Learning for Fine-Grained Categorization of Parkinson's Disease Progression Levels through Analysis of Vocal Acoustic Patterns. Bioengineering (Basel) 2024; 11:295. [PMID: 38534569 DOI: 10.3390/bioengineering11030295] [Received: 03/06/2024] [Revised: 03/18/2024] [Accepted: 03/18/2024] [Indexed: 03/28/2024] Open
Abstract
Speech impairments often emerge as one of the primary indicators of Parkinson's disease (PD), albeit not readily apparent in its early stages. While previous studies focused predominantly on binary PD detection, this research explored the use of deep learning models to automatically classify sustained vowel recordings into healthy controls, mild PD, or severe PD based on motor symptom severity scores. Popular convolutional neural network (CNN) architectures, VGG and ResNet, as well as vision transformers, Swin, were fine-tuned on log mel spectrogram image representations of the segmented voice data. Furthermore, the research investigated the effects of audio segment lengths and specific vowel sounds on the performance of these models. The findings indicated that implementing longer segments yielded better performance. The models showed strong capability in distinguishing PD from healthy subjects, achieving over 95% precision. However, reliably discriminating between mild and severe PD cases remained challenging. The VGG16 achieved the best overall classification performance with 91.8% accuracy and the largest area under the ROC curve. Furthermore, focusing analysis on the vowel /u/ could further improve accuracy to 96%. Applying visualization techniques like Grad-CAM also highlighted how CNN models focused on localized spectrogram regions while transformers attended to more widespread patterns. Overall, this work showed the potential of deep learning for non-invasive screening and monitoring of PD progression from voice recordings, but larger multi-class labeled datasets are needed to further improve severity classification.
Affiliation(s)
- Hadi Sedigh Malekroodi
  - Industry 4.0 Convergence Bionics Engineering, Pukyong National University, Busan 48513, Republic of Korea
- Nuwan Madusanka
  - Digital of Healthcare Research Center, Institute of Information Technology and Convergence, Pukyong National University, Busan 48513, Republic of Korea
- Byeong-Il Lee
  - Industry 4.0 Convergence Bionics Engineering, Pukyong National University, Busan 48513, Republic of Korea
  - Digital of Healthcare Research Center, Institute of Information Technology and Convergence, Pukyong National University, Busan 48513, Republic of Korea
  - Division of Smart Healthcare, Pukyong National University, Busan 48513, Republic of Korea
- Myunggi Yi
  - Industry 4.0 Convergence Bionics Engineering, Pukyong National University, Busan 48513, Republic of Korea
  - Digital of Healthcare Research Center, Institute of Information Technology and Convergence, Pukyong National University, Busan 48513, Republic of Korea
  - Division of Smart Healthcare, Pukyong National University, Busan 48513, Republic of Korea

3
Ong YQ, Lee J, Chu SY, Chai SC, Gan KB, Ibrahim NM, Barlow SM. Oral-diadochokinesis between Parkinson's disease and neurotypical elderly among Malaysian-Malay speakers. Int J Lang Commun Disord 2024. [PMID: 38451114 DOI: 10.1111/1460-6984.13025] [Received: 08/08/2023] [Accepted: 02/09/2024] [Indexed: 03/08/2024]
Abstract
BACKGROUND Parkinson's disease (PD) has an impact on speech production, manifesting in various ways including alterations in voice quality, challenges in articulating sounds and a decrease in speech rate. Numerous investigations have been conducted to ascertain the oral-diadochokinesis (O-DDK) rate in individuals with PD. However, the existing literature lacks exploration of such O-DDK rates in Malaysia and does not provide consistent evidence regarding the advantage of real-word repetition. AIMS To explore the effect of gender, stimuli type and PD status and their interactions on the O-DDK rates among Malaysian-Malay speakers. METHODS & PROCEDURES O-DDK performance of 62 participants (29 individuals with PD and 33 healthy elderly) using a non-word ('pataka'), a Malay real-word ('patahkan') and an English real-word ('buttercake') was audio recorded. The number of syllables produced in 8 s was counted. Hierarchical linear modelling was performed to investigate the effects of stimuli type (non-word, Malay real-word, English real-word), PD status (yes, no), gender (male, female) and their interactions on the O-DDK rate. The model accounted for participants' age as well as the nesting of repeated measurements within participants, thereby providing unbiased estimates of the effects. OUTCOMES & RESULTS The stimuli effect was significant (p < 0.0001). Malay real-word showed the lowest O-DDK rate (5.03 ± 0.11 syllables/s), followed by English real-word (5.25 ± 0.11 syllables/s) and non-word (5.42 ± 0.11 syllables/s). Individuals with PD showed a significantly lower O-DDK rate compared to healthy elderly (4.73 ± 0.15 syllables/s vs. 5.74 ± 0.14 syllables/s, adjusted p < 0.001). A subsequent analysis indicated that the O-DDK rate declined in a quadratic pattern. However, neither gender nor age effects were observed. Additionally, no significant two-way interactions were found between stimuli type, PD status and gender (all p > 0.05). Therefore, the choice of stimuli type has little or no effect on the use of O-DDK tests in clinical practice for diagnostic purposes. CONCLUSIONS & IMPLICATIONS The observed slowness in O-DDK among individuals with PD can be attributed to the impact of the movement disorder, specifically bradykinesia, on the physiological aspects of speech production. Speech-language pathologists can gain insights into the impact of PD on speech production and tailor appropriate intervention strategies to address the specific needs of individuals with PD according to disease stages. WHAT THIS PAPER ADDS What is already known on this subject The observed slowness in O-DDK rates among individuals with PD may stem from the movement disorder's effects on the physiological aspects of speech production, particularly bradykinesia. However, there is a lack of consistent evidence regarding the influence of real-word repetition and how O-DDK rates vary across different PD stages. What this study adds to existing knowledge The O-DDK rates decline in a quadratic pattern as PD progresses. The research provides insights into the advantage of real-word repetition in assessing O-DDK rates, with Malay real-word showing the lowest O-DDK rate, followed by English real-word and non-word. What are the potential or actual clinical implications of this work? Speech-language pathologists can better understand the evolving nature of speech motor impairments as PD progresses. This insight enables them to design targeted intervention strategies that are sensitive to the specific needs and challenges associated with each PD stage. This finding can guide clinicians in selecting appropriate assessment tools for evaluating speech motor function in PD patients.
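The rate measure described in the abstract above (syllables counted over a fixed 8 s window) can be sketched as a simple division. This is only an illustration under stated assumptions: the function name `oddk_rate`, its parameters, and the example count of 40 syllables are hypothetical and do not come from the study.

```python
def oddk_rate(syllable_count: int, window_s: float = 8.0) -> float:
    """Return an O-DDK rate in syllables per second.

    Assumption: the rate is the syllable count divided by the length of
    the counting window (8 s in the abstract above). The study's exact
    counting procedure is not given in the listing.
    """
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    if syllable_count < 0:
        raise ValueError("syllable_count must be non-negative")
    return syllable_count / window_s


# Hypothetical example: 40 syllables counted in the 8 s window.
example_rate = oddk_rate(40)
print(f"{example_rate:.2f} syllables/s")
```

A per-participant rate computed this way would then feed a repeated-measures model such as the hierarchical linear model the abstract mentions; that modelling step is not sketched here.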
Affiliation(s)
- Ying Qian Ong
  - Centre for Healthy Ageing and Wellness (H-CARE), Faculty of Health Sciences, Speech Sciences Programme, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia
- Jaehoon Lee
  - Department of Educational Psychology, Leadership, and Counseling, Texas Tech University, Lubbock, Texas, USA
- Shin Ying Chu
  - Centre for Healthy Ageing and Wellness (H-CARE), Faculty of Health Sciences, Speech Sciences Programme, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia
- Siaw Chui Chai
  - Centre for Rehabilitation & Special Needs Studies, Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia
- Kok Beng Gan
  - Department of Electrical, Electronic and Systems Engineering, Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia, Bangi, Malaysia
- Norlinah Mohamed Ibrahim
  - Department of Medicine, Hospital Canselor Tuanku Muhriz, Faculty of Medicine, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia
- Steven M Barlow
  - Special Education & Communication Disorders, Biomedical Engineering, Center for Brain, Biology, Behavior, University of Nebraska-Lincoln, Lincoln, Nebraska, USA

4
Houle N, Feaster T, Mira A, Meeks K, Stepp CE. Sex Differences in the Speech of Persons With and Without Parkinson's Disease. Am J Speech Lang Pathol 2024; 33:96-116. [PMID: 37889201 PMCID: PMC11000784 DOI: 10.1044/2023_ajslp-22-00350] [Received: 10/31/2022] [Revised: 02/24/2023] [Accepted: 08/30/2023] [Indexed: 10/28/2023]
Abstract
BACKGROUND Sex differences are apparent in the prevalence and the clinical presentation of Parkinson's disease (PD), but their effects on speech have been less studied. METHOD Speech acoustics of persons with (34 females and 34 males) and without (age- and sex-matched) PD were examined, assessing the effects of PD diagnosis and sex on ratings of dysarthria severity and acoustic measures of phonation (fundamental frequency standard deviation, smoothed cepstral peak prominence), speech rate (net syllables per second, percent pause ratio), and articulation (articulatory-acoustic vowel space, release burst precision). RESULTS Most measures were affected by PD (dysarthria severity, fundamental frequency standard deviation) and sex (smoothed cepstral peak prominence, net syllables per second, percent pause ratio, articulatory-acoustic vowel space), but without interactions between them. Release burst precision was differentially affected by sex in PD. Relative to those without PD, persons with PD produced fewer plosives with a single burst: females more frequently produced multiple bursts, whereas males more frequently produced no burst at all. CONCLUSIONS Most metrics did not indicate that speech production is differentially affected by sex in PD. Sex was, however, associated with disparate effects on release burst precision in PD, which deserves further study. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.24388666.
Affiliation(s)
- Nichole Houle
  - Department of Speech, Language, and Hearing Sciences, Boston University, MA
- Taylor Feaster
  - Department of Speech, Language, and Hearing Sciences, Boston University, MA
- Amna Mira
  - Department of Speech, Language, and Hearing Sciences, Boston University, MA
  - College of Applied Medical Sciences, King Saud bin Abdulaziz University for Health Sciences, Jeddah, Saudi Arabia
  - King Abdullah International Medical Research Center, Jeddah, Saudi Arabia
- Kirsten Meeks
  - Department of Speech, Language, and Hearing Sciences, Boston University, MA
- Cara E. Stepp
  - Department of Speech, Language, and Hearing Sciences, Boston University, MA
  - Department of Biomedical Engineering, Boston University, MA
  - Department of Otolaryngology–Head & Neck Surgery, Boston University School of Medicine, MA

5
Dragicevic DA, Dahl KL, Perkins Z, Abur D, Stepp CE. Effects of a Concurrent Working Memory Task on Speech Acoustics in Parkinson's Disease. Am J Speech Lang Pathol 2024; 33:418-434. [PMID: 38081054 PMCID: PMC11001185 DOI: 10.1044/2023_ajslp-23-00214] [Received: 06/13/2023] [Revised: 08/30/2023] [Accepted: 10/26/2023] [Indexed: 01/05/2024]
Abstract
PURPOSE The purpose of this study was to determine the effect of a concurrent working memory task on acoustic measures of speech in individuals with Parkinson's disease (PD). METHOD Individuals with PD and age- and sex-matched controls performed a speaking task with and without a Stroop-like concurrent working memory task. Cepstral peak prominence, low-to-high spectral energy ratio, fundamental frequency (fo) standard deviation, articulation rate, pause duration, articulatory-acoustic vowel space, relative fo, mean voice onset time (VOT), and VOT variability were calculated for each condition. Mixed-model analyses of variance were performed to determine the effects of group, condition (presence of the concurrent working memory task), and their interaction on the acoustic measures. RESULTS All measures except for VOT variability, mean pause duration, and relative fo offset differed between people with and without PD. Cepstral peak prominence, articulation rate, and relative fo offset differed as a function of condition. However, no measures indicated disparate effects of condition as a function of group. CONCLUSION Although differentially impactful on limb motor function in PD, here a concurrent working memory task was not found to be differentially disruptive to speech acoustics in PD. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.24759648.
Affiliation(s)
- Kimberly L. Dahl
  - Department of Speech, Language and Hearing Sciences, Boston University, MA
- Zoe Perkins
  - Department of Speech, Language and Hearing Sciences, Boston University, MA
- Defne Abur
  - Department of Speech, Language and Hearing Sciences, Boston University, MA
  - Center for Language and Cognition Groningen, University of Groningen, the Netherlands
- Cara E. Stepp
  - Department of Speech, Language and Hearing Sciences, Boston University, MA
  - Department of Biomedical Engineering, Boston University, MA
  - Department of Otolaryngology–Head and Neck Surgery, Boston University School of Medicine, MA

6
Iyer A, Kemp A, Rahmatallah Y, Pillai L, Glover A, Prior F, Larson-Prior L, Virmani T. A machine learning method to process voice samples for identification of Parkinson's disease. Sci Rep 2023; 13:20615. [PMID: 37996478 PMCID: PMC10667335 DOI: 10.1038/s41598-023-47568-w] [Received: 07/12/2023] [Accepted: 11/15/2023] [Indexed: 11/25/2023] Open
Abstract
Machine learning approaches have been used for the automatic detection of Parkinson's disease with voice recordings being the most used data type due to the simple and non-invasive nature of acquiring such data. Although voice recordings captured via telephone or mobile devices allow much easier and wider access for data collection, current conflicting performance results limit their clinical applicability. This study has two novel contributions. First, we show the reliability of personal telephone-collected voice recordings of the sustained vowel /a/ in natural settings by collecting samples from 50 people with specialist-diagnosed Parkinson's disease and 50 healthy controls and applying machine learning classification with voice features related to phonation. Second, we utilize a novel application of a pre-trained convolutional neural network (Inception V3) with transfer learning to analyze the spectrograms of the sustained vowel from these samples. This approach considers speech intensity estimates across time and frequency scales rather than collapsing measurements across time. We show the superiority of our deep learning model for the task of classifying people with Parkinson's disease as distinct from healthy controls.
Affiliation(s)
- Anu Iyer
  - Georgia Institute of Technology, Atlanta, 30332, USA
- Aaron Kemp
  - Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
- Yasir Rahmatallah
  - Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
- Lakshmi Pillai
  - Neurology, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
- Aliyah Glover
  - Neurology, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
- Fred Prior
  - Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
- Linda Larson-Prior
  - Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
  - Neurology, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
  - Neurobiology and Developmental Sciences, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
- Tuhin Virmani
  - Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, 72205, USA
  - Neurology, University of Arkansas for Medical Sciences, Little Rock, 72205, USA

7
Ibarra EJ, Arias-Londoño JD, Zañartu M, Godino-Llorente JI. Towards a Corpus (and Language)-Independent Screening of Parkinson's Disease from Voice and Speech through Domain Adaptation. Bioengineering (Basel) 2023; 10:1316. [PMID: 38002440 PMCID: PMC10669342 DOI: 10.3390/bioengineering10111316] [Received: 10/09/2023] [Revised: 11/03/2023] [Accepted: 11/10/2023] [Indexed: 11/26/2023] Open
Abstract
End-to-end deep learning models have shown promising results for the automatic screening of Parkinson's disease by voice and speech. However, these models often suffer degradation in their performance when applied to scenarios involving multiple corpora. In addition, they also show corpus-dependent clusterings. These facts indicate a lack of generalisation or the presence of certain shortcuts in the decision, and also suggest the need for developing new corpus-independent models. In this respect, this work explores the use of domain adversarial training as a viable strategy to develop models that retain their discriminative capacity to detect Parkinson's disease across diverse datasets. The paper presents three deep learning architectures and their domain adversarial counterparts. The models were evaluated with sustained vowels and diadochokinetic recordings extracted from four corpora with different demographics, dialects or languages, and recording conditions. The results showed that the space distribution of the embedding features extracted by the domain adversarial networks exhibits a higher intra-class cohesion. This behaviour is supported by a decrease in the variability and inter-domain divergence computed within each class. The findings suggest that domain adversarial networks are able to learn the common characteristics present in Parkinsonian voice and speech, which are supposed to be corpus, and consequently, language independent. Overall, this effort provides evidence that domain adaptation techniques refine the existing end-to-end deep learning approaches for Parkinson's disease detection from voice and speech, achieving more generalizable models.
Affiliation(s)
- Emiro J. Ibarra
  - Department of Electronic Engineering, Universidad Técnica Federico Santa María, Avenida España 1680, Casilla 110-V, Valparaíso 2390123, Chile
- Julián D. Arias-Londoño
  - Escuela Técnica Superior de Ingenieros de Telecomunicación, Universidad Politécnica de Madrid, Avda. Ciudad Universitaria, 30, 28040 Madrid, Spain
- Matías Zañartu
  - Department of Electronic Engineering, Universidad Técnica Federico Santa María, Avenida España 1680, Casilla 110-V, Valparaíso 2390123, Chile
- Juan I. Godino-Llorente
  - Escuela Técnica Superior de Ingenieros de Telecomunicación, Universidad Politécnica de Madrid, Avda. Ciudad Universitaria, 30, 28040 Madrid, Spain

8
Hireš M, Drotár P, Pah ND, Ngo QC, Kumar DK. On the inter-dataset generalization of machine learning approaches to Parkinson's disease detection from voice. Int J Med Inform 2023; 179:105237. [PMID: 37801807 DOI: 10.1016/j.ijmedinf.2023.105237] [Received: 03/06/2023] [Revised: 09/20/2023] [Accepted: 09/24/2023] [Indexed: 10/08/2023]
Abstract
BACKGROUND AND OBJECTIVE Parkinson's disease is the second-most-common neurodegenerative disorder that affects motor skills, cognitive processes, mood, and everyday tasks such as speaking and walking. The voices of people with Parkinson's disease may become weak, breathy, or hoarse and may sound emotionless, with slurred words and mumbling. Algorithms for computerized voice analysis have been proposed and have shown highly accurate results. However, these algorithms were developed on single, limited datasets, with participants possessing similar demographics. Such models are prone to overfitting and are unsuitable for generalization, which is essential in real-world applications. METHODS We evaluated the computerized Parkinson's disease diagnosis performance of various machine learning models and showed that these models degraded rapidly when used on different datasets. We evaluated two mainstream state-of-the-art approaches, one based on deep convolutional neural networks and another based on voice feature extraction followed by a shallow classifier (i.e., extreme gradient boosting (XGBoost)). RESULTS An investigation with four datasets (CzechPD, PC-GITA, ITA, and RMIT-PD) proved that even if the algorithms yielded excellent performance on a single dataset, the results obtained on new data or even a mix of datasets were very unsatisfactory. CONCLUSIONS More work needs to be done to make computerized voice analysis methods for Parkinson's disease diagnosis suitable for real-world applications.
Affiliation(s)
- Máté Hireš
  - Intelligent Information Systems Lab, Technical University of Kosice, Letna 9, 42001 Kosice, Slovakia
- Peter Drotár
  - Intelligent Information Systems Lab, Technical University of Kosice, Letna 9, 42001 Kosice, Slovakia
- Nemuel Daniel Pah
  - Biosignals Lab, RMIT University, Melbourne, Australia; Universitas Surabaya, Surabaya, Indonesia

9
Kim JA, Jang H, Choi Y, Min YG, Hong YH, Sung JJ, Choi SJ. Subclinical articulatory changes of vowel parameters in Korean amyotrophic lateral sclerosis patients with perceptually normal voices. PLoS One 2023; 18:e0292460. [PMID: 37831677 PMCID: PMC10575489 DOI: 10.1371/journal.pone.0292460] [Received: 04/18/2023] [Accepted: 09/21/2023] [Indexed: 10/15/2023] Open
Abstract
The available quantitative methods for evaluating bulbar dysfunction in patients with amyotrophic lateral sclerosis (ALS) are limited. We aimed to characterize vowel properties in Korean ALS patients, investigate associations between vowel parameters and clinical features of ALS, and analyze subclinical articulatory changes of vowel parameters in those with perceptually normal voices. Forty-three patients with ALS (27 with dysarthria and 16 without dysarthria) and 20 healthy controls were prospectively enrolled in the study. Dysarthria was assessed using the ALS Functional Rating Scale-Revised (ALSFRS-R) speech subscores, with any loss of 4 points indicating the presence of dysarthria. The structured speech samples were recorded and analyzed using Praat software. For three corner vowels (/a/, /i/, and /u/), data on the vowel duration, fundamental frequency, frequencies of the first two formants (F1 and F2), harmonics-to-noise ratio, vowel space area (VSA), and vowel articulation index (VAI) were extracted from the speech samples. Corner vowel durations were significantly longer in ALS patients with dysarthria than in healthy controls. The F1 frequency of /a/, F2 frequencies of /i/ and /u/, the VSA, and the VAI showed significant differences between ALS patients with dysarthria and healthy controls. The area under the curve (AUC) was 0.912. The F1 frequency of /a/ and the VSA were the major determinants for differentiating ALS patients who had not yet developed apparent dysarthria from healthy controls (AUC 0.887). In linear regression analyses, as the ALSFRS-R speech subscore decreased, both the VSA and VAI were reduced. In contrast, vowel durations were prolonged. The analyses of vowel parameters provided a useful metric, correlated with disease severity, for detecting subclinical bulbar dysfunction in ALS patients.
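The two vowel metrics named in the abstract above, VSA and VAI, can be sketched from the F1/F2 values of the three corner vowels. The formulas below are the commonly used definitions (triangular area via the shoelace formula, and VAI as a ratio of formant sums); the listing does not state which exact formulas the study used, so treat both as assumptions. The function names and the formant values in the example are hypothetical, chosen only for illustration.

```python
from typing import Tuple

Formants = Tuple[float, float]  # (F1, F2) in Hz for one vowel


def vowel_space_area(a: Formants, i: Formants, u: Formants) -> float:
    """Triangular vowel space area in Hz^2 for corner vowels /a/, /i/, /u/.

    Assumption: area of the triangle spanned by the three (F1, F2)
    points, computed with the shoelace formula.
    """
    (f1a, f2a), (f1i, f2i), (f1u, f2u) = a, i, u
    return abs(f1i * (f2a - f2u) + f1a * (f2u - f2i) + f1u * (f2i - f2a)) / 2.0


def vowel_articulation_index(a: Formants, i: Formants, u: Formants) -> float:
    """Vowel articulation index (dimensionless).

    Assumption: VAI = (F2/i/ + F1/a/) / (F1/i/ + F1/u/ + F2/u/ + F2/a/),
    the definition commonly used in the dysarthria literature.
    """
    (f1a, f2a), (f1i, f2i), (f1u, f2u) = a, i, u
    return (f2i + f1a) / (f1i + f1u + f2u + f2a)


# Hypothetical formant values (Hz), not taken from the study.
a, i, u = (800.0, 1300.0), (300.0, 2300.0), (350.0, 800.0)
print(vowel_space_area(a, i, u))          # area in Hz^2
print(vowel_articulation_index(a, i, u))  # ratio
```

Under these definitions, vowel centralization moves the three points toward each other, which shrinks the triangle and lowers the ratio, consistent with the reductions in VSA and VAI the abstract reports as speech subscores decrease.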
Affiliation(s)
- Jin-Ah Kim
  - Department of Neurology, Seoul National University Hospital, Seoul, Republic of Korea
  - Department of Translational Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
  - Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Republic of Korea
- Hayeun Jang
  - Division of English, Busan University of Foreign Studies, Busan, Republic of Korea
- Yoonji Choi
  - Department of Korean Language and Literature, Seoul National University, Seoul, Republic of Korea
- Young Gi Min
  - Department of Neurology, Seoul National University Hospital, Seoul, Republic of Korea
  - Department of Translational Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
- Yoon-Ho Hong
  - Department of Neurology, Seoul Metropolitan Government-Seoul National University Boramae Medical Center, Seoul, Republic of Korea
- Jung-Joon Sung
  - Department of Neurology, Seoul National University Hospital, Seoul, Republic of Korea
  - Neuroscience Research Institute, Seoul National University College of Medicine, Seoul, Republic of Korea
- Seok-Jin Choi
  - Department of Neurology, Seoul National University Hospital, Seoul, Republic of Korea
  - Center for Hospital Medicine, Seoul National University Hospital, Seoul, Republic of Korea

10
Favaro A, Tsai YT, Butala A, Thebaud T, Villalba J, Dehak N, Moro-Velázquez L. Interpretable speech features vs. DNN embeddings: What to use in the automatic assessment of Parkinson's disease in multi-lingual scenarios. Comput Biol Med 2023; 166:107559. [PMID: 37852107 DOI: 10.1016/j.compbiomed.2023.107559] [Received: 07/03/2023] [Revised: 10/07/2023] [Accepted: 10/09/2023] [Indexed: 10/20/2023]
Abstract
Speech-based approaches for assessing Parkinson's Disease (PD) often rely on feature extraction for automatic classification or detection. While many studies prioritize accuracy by using non-interpretable embeddings from Deep Neural Networks, this work aims to explore the predictive capabilities and language robustness of both feature types in a systematic fashion. As interpretable features, prosodic, linguistic, and cognitive descriptors were adopted, while x-vectors, Wav2Vec 2.0, HuBERT, and TRILLsson representations were used as non-interpretable features. Mono-lingual, multi-lingual, and cross-lingual machine learning experiments were conducted leveraging six data sets comprising speech recordings from various languages: American English, Castilian Spanish, Colombian Spanish, Italian, German, and Czech. For interpretable feature-based models, the mean of the best F1-scores obtained from each language was 81% in mono-lingual, 81% in multi-lingual, and 71% in cross-lingual experiments. For non-interpretable feature-based models, instead, they were 85% in mono-lingual, 88% in multi-lingual, and 79% in cross-lingual experiments. Firstly, models based on non-interpretable features outperformed interpretable ones, especially in cross-lingual experiments. Specifically, TRILLsson provided the most stable and accurate results across tasks and data sets. Conversely, the two types of features adopted showed some level of language robustness in multi-lingual and cross-lingual experiments. Overall, these results suggest that interpretable feature-based models can be used by clinicians to evaluate the deterioration of the speech of patients with PD, while non-interpretable feature-based models can be leveraged to achieve higher detection accuracy.
Affiliation(s)
- Anna Favaro
  - Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, 21218, MD, United States of America
- Yi-Ting Tsai
  - Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, 21218, MD, United States of America
- Ankur Butala
  - Department of Neurology, The Johns Hopkins University, Baltimore, 21218, MD, United States of America; Department of Psychiatry and Behavioral Sciences, The Johns Hopkins University, Baltimore, 21218, MD, United States of America
- Thomas Thebaud
  - Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, 21218, MD, United States of America
- Jesús Villalba
  - Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, 21218, MD, United States of America
- Najim Dehak
  - Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, 21218, MD, United States of America
- Laureano Moro-Velázquez
  - Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, 21218, MD, United States of America
|
11
|
Convey RB, Ihalainen T, Liu Y, Räsänen O, Ylinen S, Penttilä N. A comparative study of automatic vowel articulation index and auditory-perceptual assessments of speech intelligibility in Parkinson's disease. Int J Speech Lang Pathol 2023:1-11. [PMID: 37800979 DOI: 10.1080/17549507.2023.2251725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/07/2023]
Abstract
PURPOSE The purpose of this study was to analyse the relationship between automatic vowel articulation index (aVAI) and direct magnitude estimation (DME) among speakers with Parkinson's disease (PD) and healthy controls. We further analysed the potential of aVAI to serve as an objective measure of speech impairment in the clinical setting. METHOD Speech samples from native Finnish speakers were utilised. Expert raters utilised DME to scale the intelligibility of speech samples. aVAI scores for PD speakers and healthy control speakers were analysed in relationship to DME speech intelligibility ratings and, among PD speakers, disease stage utilising nonparametric statistical analysis. RESULT Mean DME intelligibility ratings were lower among PD speakers compared to healthy controls. Mean aVAI scores were nearly the same between speaker groups. DME intelligibility ratings and aVAI were strongly correlated within the PD speaker group. aVAI and DME intelligibility ratings were moderately correlated with disease stage as measured by the Hoehn and Yahr scale. CONCLUSION aVAI was observed to be a promising tool for analysing vowel articulation in PD speakers. Further research is warranted on the application of aVAI as an objective measure of severity of speech impairment in the clinical setting, with varying patient populations and speech samples.
Affiliation(s)
- Rachel B Convey
  - Faculty of Social Sciences, Tampere University, Tampere, Finland
- Tiina Ihalainen
  - Faculty of Social Sciences, Tampere University, Tampere, Finland
- Yuanyuan Liu
  - Information Technology and Communication Sciences, Tampere University, Tampere, Finland
- Okko Räsänen
  - Information Technology and Communication Sciences, Tampere University, Tampere, Finland
- Sari Ylinen
  - Faculty of Social Sciences, Tampere University, Tampere, Finland
- Nelly Penttilä
  - Faculty of Social Sciences, Tampere University, Tampere, Finland
|
12
|
Kang K, Nunes AS, Sharma M, Hall AJ, Mishra RK, Casado J, Cole R, Derhammer M, Barchard G, Najafi B, Vaziri A, Wills AM, Pantelyat A. Utilizing speech analysis to differentiate progressive supranuclear palsy from Parkinson's disease. Parkinsonism Relat Disord 2023; 115:105835. [PMID: 37678101 PMCID: PMC10591790 DOI: 10.1016/j.parkreldis.2023.105835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 08/25/2023] [Accepted: 08/26/2023] [Indexed: 09/09/2023]
Abstract
INTRODUCTION Distinguishing Parkinson's disease (PD) from progressive supranuclear palsy (PSP) at early disease stages is important for clinical trial enrollment and clinical care/prognostication. METHODS We recruited 21 participants with PSP (n = 11) or PD (n = 10) with reliable caregivers. Standardized passage reading, counting, and sustained phonation were recorded on the BioDigit Home tablet (BioSensics LLC, Newton, MA USA), and speech features from the assessments were analyzed using the BioDigit Speech platform (BioSensics LLC, Newton, MA USA). An independent t-test was performed to compare each speech feature between PSP and PD participants. We also performed Spearman's correlations to evaluate associations between speech measures and clinical scores (e.g., PSP rating scales and MoCA). In addition, the model's performance in classifying PSP and PD was evaluated using Rainbow passage reading analysis. RESULTS During Rainbow passage reading, PSP participants had a significantly slower articulation rate (2.45(0.49) vs 3.60(0.47) words/minute), lower speech-to-pause ratio (2.33(1.08) vs 3.67(1.18)), intelligibility dynamic time warping (DTW, 0.26(0.19) vs 0.53(0.26)), and similarity DTW (0.43(0.27) vs 0.67(0.13)) compared to PD participants. PSP participants also had longer pause times (17.24(5.47) vs 8.45(3.13) sec) and longer total signal times (52.44(6.67) vs 36.67(6.73) sec) when reading the passage. In terms of the phonation 'a', PSP participants showed significantly higher spectral entropy, spectral centroid, and spectral spread compared to PD participants, and no differences were found for phonation 'e'. PD participants had more accurate reverse number counts than PSP participants (14.89(3.86) vs 7.36(4.67)). PSP Rating Scale (PSPRS) dysarthria (r = 0.79, p = 0.004) and bulbar item scores (r = 0.803, p = 0.005) were positively correlated with articulation rate in reverse number counts. Correct reverse number counts were positively correlated with total Montreal Cognitive Assessment scores (r = 0.703, p = 0.016). Machine learning models using passage reading-derived measures obtained an AUC of 0.93, and the sensitivity/specificity in correctly classifying PSP and PD participants were 0.95 and 0.90, respectively. CONCLUSION Our study demonstrates the feasibility of differentiating PSP from PD using a digital health technology platform. Further multi-center studies are needed to expand and validate our initial findings.
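For readers unfamiliar with the sensitivity and specificity figures quoted in this abstract, the following Python sketch shows the standard definitions for a two-class problem. It assumes PSP is treated as the positive class, which the abstract does not state; the labels and predictions are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """Return (sensitivity, specificity) for a binary classification task.

    sensitivity = TP / (TP + FN): share of positive cases that were detected
    specificity = TN / (TN + FP): share of negative cases correctly rejected
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels and predictions (not study data).
true_labels = ["PSP", "PSP", "PSP", "PD", "PD"]
predictions = ["PSP", "PSP", "PD", "PD", "PD"]
sens, spec = sensitivity_specificity(true_labels, predictions, positive="PSP")
```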
Affiliation(s)
- Kyurim Kang
  - Johns Hopkins University School of Medicine, Department of Neurology, Baltimore, MD, 21287, USA
- Mansi Sharma
  - Massachusetts General Hospital, Harvard Medical School, Department of Neurology, Boston, MA, USA
- A J Hall
  - Johns Hopkins University School of Medicine, Department of Neurology, Baltimore, MD, 21287, USA
- Bijan Najafi
  - Interdisciplinary Consortium for Advanced Motion Performance (iCAMP), Michael E. DeBakey Department of Surgery, Baylor College of Medicine, Houston, TX, 77030, USA
- Anne-Marie Wills
  - Massachusetts General Hospital, Harvard Medical School, Department of Neurology, Boston, MA, USA
- Alexander Pantelyat
  - Johns Hopkins University School of Medicine, Department of Neurology, Baltimore, MD, 21287, USA
|
13
|
García AM, Johann F, Echegoyen R, Calcaterra C, Riera P, Belloli L, Carrillo F. Toolkit to Examine Lifelike Language (TELL): An app to capture speech and language markers of neurodegeneration. Behav Res Methods 2023:10.3758/s13428-023-02240-z. [PMID: 37759106 DOI: 10.3758/s13428-023-02240-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Accepted: 09/11/2023] [Indexed: 09/29/2023]
Abstract
Automated speech and language analysis (ASLA) is a promising approach for capturing early markers of neurodegenerative diseases. However, its potential remains underexploited in research and translational settings, partly due to the lack of a unified tool for data collection, encryption, processing, download, and visualization. Here we introduce the Toolkit to Examine Lifelike Language (TELL) v.1.0.0, a web-based app designed to bridge such a gap. First, we outline general aspects of its development. Second, we list the steps to access and use the app. Third, we specify its data collection protocol, including a linguistic profile survey and 11 audio recording tasks. Fourth, we describe the outputs the app generates for researchers (downloadable files) and for clinicians (real-time metrics). Fifth, we survey published findings obtained through its tasks and metrics. Sixth, we refer to TELL's current limitations and prospects for expansion. Overall, with its current and planned features, TELL aims to facilitate ASLA for research and clinical aims in the neurodegeneration arena. A demo version can be accessed here: https://demo.sci.tellapp.org/ .
Affiliation(s)
- Adolfo M García
  - Global Brain Health Institute, University of California, 505 Parnassus Ave, San Francisco, CA, 94143, USA
  - Cognitive Neuroscience Center, Universidad de San Andrés, Buenos Aires, Argentina
  - Departamento de Lingüística y Literatura, Facultad de Humanidades, Universidad de Santiago de Chile, Santiago, Chile
  - TELL Toolkit SA, Beethovenstraat, Netherlands
- Fernando Johann
  - Cognitive Neuroscience Center, Universidad de San Andrés, Buenos Aires, Argentina
  - TELL Toolkit SA, Beethovenstraat, Netherlands
- Raúl Echegoyen
  - Cognitive Neuroscience Center, Universidad de San Andrés, Buenos Aires, Argentina
  - TELL Toolkit SA, Beethovenstraat, Netherlands
- Cecilia Calcaterra
  - Cognitive Neuroscience Center, Universidad de San Andrés, Buenos Aires, Argentina
  - TELL Toolkit SA, Beethovenstraat, Netherlands
- Pablo Riera
  - Instituto de Investigación en Ciencias de la Computación (ICC), CONICET-Universidad de Buenos Aires, Buenos Aires, Argentina
- Laouen Belloli
  - Instituto de Investigación en Ciencias de la Computación (ICC), CONICET-Universidad de Buenos Aires, Buenos Aires, Argentina
- Facundo Carrillo
  - Instituto de Investigación en Ciencias de la Computación (ICC), CONICET-Universidad de Buenos Aires, Buenos Aires, Argentina
|
14
|
Milella G, Sciancalepore D, Cavallaro G, Piccirilli G, Nanni AG, Fraddosio A, D’Errico E, Paolicelli D, Fiorella ML, Simone IL. Acoustic Voice Analysis as a Useful Tool to Discriminate Different ALS Phenotypes. Biomedicines 2023; 11:2439. [PMID: 37760880 PMCID: PMC10525613 DOI: 10.3390/biomedicines11092439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Revised: 08/24/2023] [Accepted: 08/29/2023] [Indexed: 09/29/2023] Open
Abstract
Approximately 80-96% of people with amyotrophic lateral sclerosis (ALS) become unable to speak during the disease progression. Assessing upper and lower motor neuron impairment in bulbar regions of ALS patients remains challenging, particularly in distinguishing spastic and flaccid dysarthria. This study aimed to evaluate acoustic voice parameters as useful biomarkers to discriminate ALS clinical phenotypes. Triangular vowel space area (tVSA), alternating motion rates (AMRs), and sequential motion rates (SMRs) were analyzed in 36 ALS patients and 20 sex/age-matched healthy controls (HCs). tVSA, AMR, and SMR values significantly differed between ALS and HCs, and between ALS with prevalent upper (pUMN) and lower motor neuron (pLMN) impairment. tVSA showed higher accuracy in discriminating pUMN from pLMN patients. AMR and SMR were significantly lower in patients with bulbar onset than those with spinal onset, both with and without bulbar symptoms. Furthermore, these values were also lower in patients with spinal onset associated with bulbar symptoms than in those with spinal onset alone. Additionally, AMR and SMR values correlated with the degree of dysphagia. Acoustic voice analysis may be considered a useful prognostic tool to differentiate spastic and flaccid dysarthria and to assess the degree of bulbar involvement in ALS.
Affiliation(s)
- Giammarco Milella
  - Neurology Unit, Department of Translational Biomedicine and Neurosciences, 70121 Bari, Italy
- Diletta Sciancalepore
  - Otolaryngology Unit, Department of Translational Biomedicine and Neurosciences (DiBraiN), University of Bari Aldo Moro, 70121 Bari, Italy
- Giada Cavallaro
  - Otolaryngology Unit, Department of Translational Biomedicine and Neurosciences (DiBraiN), University of Bari Aldo Moro, 70121 Bari, Italy
- Glauco Piccirilli
  - Neurology Unit, Department of Translational Biomedicine and Neurosciences, 70121 Bari, Italy
- Alfredo Gabriele Nanni
  - Neurology Unit, Department of Translational Biomedicine and Neurosciences, 70121 Bari, Italy
- Angela Fraddosio
  - Neurology Unit, Department of Translational Biomedicine and Neurosciences, 70121 Bari, Italy
- Eustachio D’Errico
  - Neurology Unit, Department of Translational Biomedicine and Neurosciences, 70121 Bari, Italy
- Damiano Paolicelli
  - Neurology Unit, Department of Translational Biomedicine and Neurosciences, 70121 Bari, Italy
- Maria Luisa Fiorella
  - Otolaryngology Unit, Department of Translational Biomedicine and Neurosciences (DiBraiN), University of Bari Aldo Moro, 70121 Bari, Italy
|
15
|
Roland V, Huet K, Harmegnies B, Piccaluga M, Verhaegen C, Delvaux V. Vowel production: a potential speech biomarker for early detection of dysarthria in Parkinson's disease. Front Psychol 2023; 14:1129830. [PMID: 37701868 PMCID: PMC10493417 DOI: 10.3389/fpsyg.2023.1129830] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Accepted: 07/26/2023] [Indexed: 09/14/2023] Open
Abstract
Objectives Our aim is to detect early, subclinical speech biomarkers of dysarthria in Parkinson's disease (PD), i.e., systematic atypicalities in speech that remain subtle and are not easily detectable by the clinician, so that the patient is labeled "non-dysarthric." Based on promising exploratory work, we examine here whether vowel articulation, as assessed by three acoustic metrics, can be used as an early indicator of speech difficulties associated with Parkinson's disease. Study design This is a prospective case-control study. Methods Sixty-three individuals with PD and 35 without PD (healthy controls, HC) participated in this study. Out of 63 PD patients, 43 had been diagnosed with dysarthria (DPD) and 20 had not (NDPD). Sustained vowels were recorded for each speaker and formant frequencies were measured. The analyses focus on three acoustic metrics: individual vowel triangle areas (tVSA), vowel articulation index (VAI) and the Phi index. Results tVSA were found to be significantly smaller for DPD speakers than for HC. The VAI showed significant differences between these two groups, indicating greater centralization and lower vowel contrasts in the DPD speakers. In addition, DPD and NDPD speakers had lower Phi values, indicating a lower organization of their vowel system compared to the HC. Results also showed that the VAI index was the most efficient to distinguish between DPD and NDPD whereas the Phi index was the best acoustic metric to discriminate NDPD and HC. Conclusion This acoustic study identified potential subclinical vowel-related speech biomarkers of dysarthria in speakers with Parkinson's disease who have not been diagnosed with dysarthria.
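Two of the metrics named in this abstract, tVSA and VAI, have widely used closed-form definitions based on the first two formants (F1, F2) of the corner vowels /i/, /a/, and /u/. The Python sketch below uses those common definitions; the abstract itself does not give the formulas, so treat them as an assumption about what the study computed. The formant values are hypothetical, and the Phi index is not covered here.

```python
def triangular_vsa(i, a, u):
    """Triangular vowel space area in Hz^2.

    Each argument is an (F1, F2) pair in Hz for one corner vowel.
    The area is that of the triangle spanned by the three points
    in the F1-F2 plane.
    """
    (f1_i, f2_i), (f1_a, f2_a), (f1_u, f2_u) = i, a, u
    return 0.5 * abs(
        f1_i * (f2_a - f2_u) + f1_a * (f2_u - f2_i) + f1_u * (f2_i - f2_a)
    )


def vowel_articulation_index(i, a, u):
    """VAI = (F2i + F1a) / (F1i + F1u + F2u + F2a).

    Lower values indicate more centralized vowels.
    """
    (f1_i, f2_i), (f1_a, f2_a), (f1_u, f2_u) = i, a, u
    return (f2_i + f1_a) / (f1_i + f1_u + f2_u + f2_a)


# Hypothetical formant values in Hz (not study data).
vowel_i = (300.0, 2300.0)
vowel_a = (750.0, 1300.0)
vowel_u = (350.0, 800.0)

tvsa = triangular_vsa(vowel_i, vowel_a, vowel_u)
vai = vowel_articulation_index(vowel_i, vowel_a, vowel_u)
```

Because VAI is a ratio of formant sums, it is dimensionless, whereas tVSA is expressed in squared frequency units.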
Affiliation(s)
- Virginie Roland
  - Metrology and Language Sciences Unit, Mons, Belgium
  - Research Institute for Language Science and Technology, University of Mons, Mons, Belgium
- Kathy Huet
  - Metrology and Language Sciences Unit, Mons, Belgium
  - Research Institute for Language Science and Technology, University of Mons, Mons, Belgium
- Bernard Harmegnies
  - Research Institute for Language Science and Technology, University of Mons, Mons, Belgium
- Myriam Piccaluga
  - Metrology and Language Sciences Unit, Mons, Belgium
  - Research Institute for Language Science and Technology, University of Mons, Mons, Belgium
- Clémence Verhaegen
  - Metrology and Language Sciences Unit, Mons, Belgium
  - Research Institute for Language Science and Technology, University of Mons, Mons, Belgium
- Véronique Delvaux
  - Metrology and Language Sciences Unit, Mons, Belgium
  - Research Institute for Language Science and Technology, University of Mons, Mons, Belgium
  - National Fund for Scientific Research, Brussels, Belgium
|
16
|
Illner V, Tykalova T, Skrabal D, Klempir J, Rusz J. Automated Vowel Articulation Analysis in Connected Speech Among Progressive Neurological Diseases, Dysarthria Types, and Dysarthria Severities. J Speech Lang Hear Res 2023:1-22. [PMID: 37499137 DOI: 10.1044/2023_jslhr-22-00526] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/29/2023]
Abstract
PURPOSE Although articulatory impairment represents distinct speech characteristics in most neurological diseases affecting movement, methods allowing automated assessments of articulation deficits from connected speech are scarce. This study aimed to design a fully automated method for analyzing dysarthria-related vowel articulation impairment and estimate its sensitivity in a broad range of neurological diseases and various types and severities of dysarthria. METHOD Unconstrained monologue and reading passages were acquired from 459 speakers, including 306 healthy controls and 153 neurological patients. The algorithm utilized a formant tracker in combination with a phoneme recognizer and subsequent signal processing analysis. RESULTS Articulatory undershoot of vowels was present in a broad spectrum of progressive neurodegenerative diseases, including Parkinson's disease, progressive supranuclear palsy, multiple-system atrophy, Huntington's disease, essential tremor, cerebellar ataxia, multiple sclerosis, and amyotrophic lateral sclerosis, as well as in related dysarthria subtypes including hypokinetic, hyperkinetic, ataxic, spastic, flaccid, and their mixed variants. Formant ratios showed a higher sensitivity to vowel deficits than vowel space area. First formants of corner vowels were significantly lower for multiple-system atrophy than cerebellar ataxia. Second formants of vowels /a/ and /i/ were lower in ataxic compared to spastic dysarthria. Discriminant analysis showed a classification score of up to 41.0% for disease type, 39.3% for dysarthria type, and 49.2% for dysarthria severity. Algorithm accuracy reached an F-score of 0.77. CONCLUSIONS Distinctive vowel articulation alterations reflect underlying pathophysiology in neurological diseases. Objective acoustic analysis of vowel articulation has the potential to provide a universal method to screen motor speech disorders. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.23681529.
Affiliation(s)
- Vojtech Illner
  - Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Czech Republic
- Tereza Tykalova
  - Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Czech Republic
- Dominik Skrabal
  - Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
- Jiri Klempir
  - Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
- Jan Rusz
  - Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Czech Republic
  - Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
  - Department of Neurology and ARTORG Center, Inselspital, Bern University Hospital, University of Bern, Switzerland
|
17
|
Gisladottir RS, Helgason A, Halldorsson BV, Helgason H, Borsky M, Chien YR, Gudnason J, Gudjonsson SA, Moisik S, Dediu D, Thorleifsson G, Tragante V, Bustamante M, Jonsdottir GA, Stefansdottir L, Rutsdottir G, Magnusson SH, Hardarson M, Ferkingstad E, Halldorsson GH, Rognvaldsson S, Skuladottir A, Ivarsdottir EV, Norddahl G, Thorgeirsson G, Jonsdottir I, Ulfarsson MO, Holm H, Stefansson H, Thorsteinsdottir U, Gudbjartsson DF, Sulem P, Stefansson K. Sequence variants affecting voice pitch in humans. Sci Adv 2023; 9:eabq2969. [PMID: 37294764 PMCID: PMC10256171 DOI: 10.1126/sciadv.abq2969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Accepted: 05/04/2023] [Indexed: 06/11/2023]
Abstract
The genetic basis of the human vocal system is largely unknown, as are the sequence variants that give rise to individual differences in voice and speech. Here, we couple data on diversity in the sequence of the genome with voice and vowel acoustics in speech recordings from 12,901 Icelanders. We show how voice pitch and vowel acoustics vary across the life span and correlate with anthropometric, physiological, and cognitive traits. We found that voice pitch and vowel acoustics have a heritable component and discovered correlated common variants in ABCC9 that associate with voice pitch. The ABCC9 variants also associate with adrenal gene expression and cardiovascular traits. By showing that voice and vowel acoustics are influenced by genetics, we have taken important steps toward understanding the genetics and evolution of the human vocal system.
Affiliation(s)
- Rosa S. Gisladottir
  - deCODE Genetics/Amgen Inc., Sturlugata 8, 101 Reykjavik, Iceland
  - Department of Icelandic and Comparative Cultural Studies, University of Iceland, Saemundargata 2, 102 Reykjavik, Iceland
- Agnar Helgason
  - deCODE Genetics/Amgen Inc., Sturlugata 8, 101 Reykjavik, Iceland
  - Department of Anthropology, University of Iceland, Saemundargata 10, 102 Reykjavik, Iceland
- Bjarni V. Halldorsson
  - deCODE Genetics/Amgen Inc., Sturlugata 8, 101 Reykjavik, Iceland
  - Department of Engineering, Reykjavik University, Menntavegur 1, 101 Reykjavik, Iceland
- Hannes Helgason
  - deCODE Genetics/Amgen Inc., Sturlugata 8, 101 Reykjavik, Iceland
- Michal Borsky
  - Department of Engineering, Reykjavik University, Menntavegur 1, 101 Reykjavik, Iceland
- Yu-Ren Chien
  - Department of Engineering, Reykjavik University, Menntavegur 1, 101 Reykjavik, Iceland
- Jon Gudnason
  - Department of Engineering, Reykjavik University, Menntavegur 1, 101 Reykjavik, Iceland
- Scott Moisik
  - Division of Linguistics and Multilingual Studies, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
- Dan Dediu
  - Department of Catalan Philology and General Linguistics, University of Barcelona, Gran Via 585, Barcelona 08007, Spain
  - University of Barcelona Institute for Complex Systems (UBICS), Martí Franquès 1, Barcelona 08028, Spain
  - Catalan Institute for Research and Advanced Studies (ICREA), Passeig Lluís Companys 23, Barcelona 08010, Spain
- Egil Ferkingstad
  - deCODE Genetics/Amgen Inc., Sturlugata 8, 101 Reykjavik, Iceland
- Gisli H. Halldorsson
  - deCODE Genetics/Amgen Inc., Sturlugata 8, 101 Reykjavik, Iceland
  - School of Engineering and Natural Sciences, University of Iceland, Dunhagi 5, 107 Reykjavik, Iceland
- Gudmundur Thorgeirsson
  - deCODE Genetics/Amgen Inc., Sturlugata 8, 101 Reykjavik, Iceland
  - Faculty of Medicine, University of Iceland, Vatnsmyrarvegur 16, 101 Reykjavik, Iceland
- Ingileif Jonsdottir
  - deCODE Genetics/Amgen Inc., Sturlugata 8, 101 Reykjavik, Iceland
  - Faculty of Medicine, University of Iceland, Vatnsmyrarvegur 16, 101 Reykjavik, Iceland
- Magnus O. Ulfarsson
  - deCODE Genetics/Amgen Inc., Sturlugata 8, 101 Reykjavik, Iceland
  - School of Engineering and Natural Sciences, University of Iceland, Dunhagi 5, 107 Reykjavik, Iceland
- Hilma Holm
  - deCODE Genetics/Amgen Inc., Sturlugata 8, 101 Reykjavik, Iceland
- Unnur Thorsteinsdottir
  - deCODE Genetics/Amgen Inc., Sturlugata 8, 101 Reykjavik, Iceland
  - Faculty of Medicine, University of Iceland, Vatnsmyrarvegur 16, 101 Reykjavik, Iceland
- Daniel F. Gudbjartsson
  - deCODE Genetics/Amgen Inc., Sturlugata 8, 101 Reykjavik, Iceland
  - School of Engineering and Natural Sciences, University of Iceland, Dunhagi 5, 107 Reykjavik, Iceland
- Patrick Sulem
  - deCODE Genetics/Amgen Inc., Sturlugata 8, 101 Reykjavik, Iceland
- Kari Stefansson
  - deCODE Genetics/Amgen Inc., Sturlugata 8, 101 Reykjavik, Iceland
  - Faculty of Medicine, University of Iceland, Vatnsmyrarvegur 16, 101 Reykjavik, Iceland
|
18
|
Alku P, Kadiri SR, Gowda D. Refining a deep learning-based formant tracker using linear prediction methods. Comput Speech Lang 2023. [DOI: 10.1016/j.csl.2023.101515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
|
19
|
Shellikeri S, Cho S, Ash S, Gonzalez-Recober C, McMillan CT, Elman L, Quinn C, Amado DA, Baer M, Irwin DJ, Massimo L, Olm C, Liberman M, Grossman M, Nevler N. Digital markers of motor speech impairments in natural speech of patients with ALS-FTD spectrum disorders. medRxiv 2023:2023.04.29.23289308. [PMID: 37205390 PMCID: PMC10187360 DOI: 10.1101/2023.04.29.23289308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
Background and objectives Patients with ALS-FTD spectrum disorders (ALS-FTSD) have mixed motor and cognitive impairments and require valid and quantitative assessment tools to support diagnosis and tracking of bulbar motor disease. This study aimed to validate a novel automated digital speech tool that analyzes vowel acoustics from natural, connected speech as a marker for impaired articulation due to bulbar motor disease in ALS-FTSD. Methods We used an automatic algorithm called Forced Alignment Vowel Extraction (FAVE) to detect spoken vowels and extract vowel acoustics from 1 minute audio-recorded picture descriptions. Using automated acoustic analysis scripts, we derived two articulatory-acoustic measures: vowel space area (VSA, in Bark²) which represents tongue range-of-motion (size), and average second formant slope of vowel trajectories (F2 slope) which represents tongue movement speed. We compared vowel measures between ALS with and without clinically-evident bulbar motor disease (ALS+bulbar vs. ALS-bulbar), behavioral variant frontotemporal dementia (bvFTD) without a motor syndrome, and healthy controls (HC). We correlated impaired vowel measures with bulbar disease severity, estimated by clinical bulbar scores and perceived listener effort, and with MRI cortical thickness of the orobuccal part of the primary motor cortex innervating the tongue (oralPMC). We also tested correlations with respiratory capacity and cognitive impairment. Results Participants were 45 ALS+bulbar (30 males, mean age=61±11), 22 ALS-nonbulbar (11 males, age=62±10), 22 bvFTD (13 males, age=63±7), and 34 HC (14 males, age=69±8). ALS+bulbar had smaller VSA and shallower average F2 slopes than ALS-bulbar (VSA: |d|=0.86, p=0.0088; F2 slope: |d|=0.98, p=0.0054), bvFTD (VSA: |d|=0.67, p=0.043; F2 slope: |d|=1.4, p<0.001), and HC (VSA: |d|=0.73, p=0.024; F2 slope: |d|=1.0, p<0.001). Vowel measures declined with worsening bulbar clinical scores (VSA: R=0.33, p=0.033; F2 slope: R=0.25, p=0.048), and smaller VSA was associated with greater listener effort (R=-0.43, p=0.041). Shallower F2 slopes were related to cortical thinning in oralPMC (R=0.50, p=0.03). Neither vowel measure was associated with respiratory nor cognitive test scores. Conclusions Vowel measures extracted with automatic processing from natural speech are sensitive to bulbar motor disease in ALS-FTD and are robust to cognitive impairment.
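The two measures described in this abstract, a vowel space area expressed in Bark² and an F2 slope, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' scripts: the abstract does not say which Hz-to-Bark conversion, which vowels, or which slope estimator were used. Here the Traunmüller (1990) Bark approximation, a polygon area by the shoelace formula, and an ordinary least-squares slope are assumed, and all input values are hypothetical.

```python
def hz_to_bark(f_hz):
    """Traunmüller (1990) approximation of the Bark scale (assumed here)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def polygon_area(points):
    """Shoelace formula; points are (x, y) vertices listed in order."""
    total = 0.0
    for k in range(len(points)):
        x1, y1 = points[k]
        x2, y2 = points[(k + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def vowel_space_area_bark(corner_formants_hz):
    """Area in Bark^2 of the polygon spanned by (F1, F2) corner vowels in Hz."""
    in_bark = [(hz_to_bark(f1), hz_to_bark(f2)) for f1, f2 in corner_formants_hz]
    return polygon_area(in_bark)


def f2_slope(times_s, f2_hz):
    """Least-squares slope of an F2 trajectory, in Hz per second."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_f = sum(f2_hz) / n
    num = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_s, f2_hz))
    den = sum((t - mean_t) ** 2 for t in times_s)
    return num / den


# Hypothetical F2 trajectory sampled every 10 ms (not study data).
slope = f2_slope([0.00, 0.01, 0.02], [1500.0, 1550.0, 1600.0])
```

Converting formants to Bark before computing the area places F1 and F2 on a perceptually motivated scale, so the area is less dominated by the numerically larger F2 range.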
|
20
|
Faragó P, Ștefănigă SA, Cordoș CG, Mihăilă LI, Hintea S, Peștean AS, Beyer M, Perju-Dumbravă L, Ileșan RR. CNN-Based Identification of Parkinson's Disease from Continuous Speech in Noisy Environments. Bioengineering (Basel) 2023; 10:bioengineering10050531. [PMID: 37237601 DOI: 10.3390/bioengineering10050531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Revised: 04/21/2023] [Accepted: 04/24/2023] [Indexed: 05/28/2023] Open
Abstract
Parkinson's disease is a progressive neurodegenerative disorder caused by dopaminergic neuron degeneration. Parkinsonian speech impairment is one of the earliest presentations of the disease and, along with tremor, is suitable for pre-diagnosis. It is defined by hypokinetic dysarthria and accounts for respiratory, phonatory, articulatory, and prosodic manifestations. The topic of this article targets artificial-intelligence-based identification of Parkinson's disease from continuous speech recorded in a noisy environment. The novelty of this work is twofold. First, the proposed assessment workflow performed speech analysis on samples of continuous speech. Second, we analyzed and quantified Wiener filter applicability for speech denoising in the context of Parkinsonian speech identification. We argue that the Parkinsonian features of loudness, intonation, phonation, prosody, and articulation are contained in the speech, speech energy, and Mel spectrograms. Thus, the proposed workflow follows a feature-based speech assessment to determine the feature variation ranges, followed by speech classification using convolutional neural networks. We report the best classification accuracies of 96% on speech energy, 93% on speech, and 92% on Mel spectrograms. We conclude that the Wiener filter improves both feature-based analysis and convolutional-neural-network-based classification performances.
Affiliation(s)
- Paul Faragó
  - Bases of Electronics Department, Faculty of Electronics, Telecommunications and Information Technology, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania
- Sebastian-Aurelian Ștefănigă
  - Department of Computer Science, Faculty of Mathematics and Computer Science, West University of Timisoara, 300223 Timisoara, Romania
- Claudia-Georgiana Cordoș
  - Bases of Electronics Department, Faculty of Electronics, Telecommunications and Information Technology, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania
- Laura-Ioana Mihăilă
  - Bases of Electronics Department, Faculty of Electronics, Telecommunications and Information Technology, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania
- Sorin Hintea
  - Bases of Electronics Department, Faculty of Electronics, Telecommunications and Information Technology, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania
- Ana-Sorina Peștean
  - Department of Neurology and Pediatric Neurology, Faculty of Medicine, University of Medicine and Pharmacy "Iuliu Hatieganu" Cluj-Napoca, 400012 Cluj-Napoca, Romania
- Michel Beyer
  - Clinic of Oral and Cranio-Maxillofacial Surgery, University Hospital Basel, CH-4031 Basel, Switzerland
  - Medical Additive Manufacturing Research Group (Swiss MAM), Department of Biomedical Engineering, University of Basel, CH-4123 Allschwil, Switzerland
- Lăcrămioara Perju-Dumbravă
  - Department of Neurology and Pediatric Neurology, Faculty of Medicine, University of Medicine and Pharmacy "Iuliu Hatieganu" Cluj-Napoca, 400012 Cluj-Napoca, Romania
- Robert Radu Ileșan
  - Department of Neurology and Pediatric Neurology, Faculty of Medicine, University of Medicine and Pharmacy "Iuliu Hatieganu" Cluj-Napoca, 400012 Cluj-Napoca, Romania
  - Clinic of Oral and Cranio-Maxillofacial Surgery, University Hospital Basel, CH-4031 Basel, Switzerland
|
21
|
Shamei A, Liu Y, Gick B. Reduction of vowel space in Alzheimer's disease. JASA Express Lett 2023; 3:035202. [PMID: 37003703 DOI: 10.1121/10.0017438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Reduced vowel space area (VSA) is a known effect of neurodegenerative diseases such as Parkinson's disease (PD). Using large publicly available corpuses, two experiments were conducted comparing the vowel space of speakers with and without Alzheimer's disease (AD) during spontaneous and read speech. First, a comparison of vowel distance found reduced distance in AD for English spontaneous speech, but not Spanish read speech. Findings were then verified using an unsupervised learning approach to quantify VSA through cluster center detection. These results corroborate observations for PD that VSA reduction is task-dependent, but further experiments are necessary to quantify the effect of language.
Affiliation(s)
- Arian Shamei
  - Department of Linguistics, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Yadong Liu
  - Department of Linguistics, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Bryan Gick
  - Department of Linguistics, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
|
22
|
Ge W, Lueck C, Suominen H, Apthorp D. Has machine learning over-promised in healthcare? A critical analysis and a proposal for improved evaluation, with evidence from Parkinson’s disease. Artif Intell Med 2023; 139:102524. [PMID: 37100503 DOI: 10.1016/j.artmed.2023.102524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2022] [Revised: 02/22/2023] [Accepted: 02/28/2023] [Indexed: 03/17/2023]
Abstract
Adoption of artificial intelligence (AI) by the medical community has long been anticipated, endorsed by a stream of machine learning literature showcasing AI systems that yield extraordinary performance. However, many of these systems are likely over-promising and will under-deliver in practice. One key reason is the community's failure to acknowledge and address the presence of inflationary effects in the data. These simultaneously inflate evaluation performance and prevent a model from learning the underlying task, thus severely misrepresenting how that model would perform in the real world. This paper investigated the impact of these inflationary effects on healthcare tasks, as well as how these effects can be addressed. Specifically, we defined three inflationary effects that occur in medical data sets and allow models to easily reach small training losses and prevent skillful learning. We investigated two data sets of sustained vowel phonation from participants with and without Parkinson's disease, and revealed that published models which have achieved high classification performances on these were artificially enhanced due to the inflationary effects. Our experiments showed that removing each inflationary effect corresponded with a decrease in classification accuracy, and that removing all inflationary effects reduced the evaluated performance by up to 30%. Additionally, the performance on a more realistic test set increased, suggesting that the removal of these inflationary effects enabled the model to better learn the underlying task and generalize. Source code is available at https://github.com/Wenbo-G/pd-phonation-analysis under the MIT license.
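The abstract above does not name its three inflationary effects. As a loosely related illustration only, the sketch below shows one evaluation safeguard often discussed for voice datasets with several recordings per person: splitting by speaker so that no individual appears in both training and test sets. Treating speaker overlap as relevant here is an assumption, not a claim about the paper, and the record layout (`speaker`, `label`) and the function name `split_by_speaker` are hypothetical.

```python
import random


def split_by_speaker(records, test_fraction=0.25, seed=0):
    """Split recordings so that each speaker lands entirely in train or test.

    records: list of dicts with at least a "speaker" key (hypothetical layout).
    """
    speakers = sorted({r["speaker"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(speakers)
    # Hold out at least one speaker, otherwise the test set would be empty.
    n_test = max(1, round(len(speakers) * test_fraction))
    test_speakers = set(speakers[:n_test])
    train = [r for r in records if r["speaker"] not in test_speakers]
    test = [r for r in records if r["speaker"] in test_speakers]
    return train, test


# Illustrative data: 8 speakers with 3 recordings each (values are made up).
records = [
    {"speaker": f"s{i}", "label": i % 2, "take": t}
    for i in range(8)
    for t in range(3)
]
train, test = split_by_speaker(records)
```

A random split at the recording level could place different takes from the same person on both sides of the split, which is the situation this sketch avoids.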
|
23
|
Favaro A, Moro-Velázquez L, Butala A, Motley C, Cao T, Stevens RD, Villalba J, Dehak N. Multilingual evaluation of interpretable biomarkers to represent language and speech patterns in Parkinson's disease. Front Neurol 2023; 14:1142642. [PMID: 36937510 PMCID: PMC10017962 DOI: 10.3389/fneur.2023.1142642] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/11/2023] [Accepted: 02/08/2023] [Indexed: 03/06/2023] [Open access]
Abstract
Motor impairments are only one aspect of Parkinson's disease (PD), which also includes cognitive and linguistic impairments. Speech-derived interpretable biomarkers may help clinicians diagnose PD at earlier stages and monitor the disorder's evolution over time. This study focuses on the multilingual evaluation of a composite array of biomarkers that facilitate PD evaluation from speech. Hypokinetic dysarthria, a motor speech disorder associated with PD, has been extensively analyzed in previously published studies on automatic PD evaluation, with a relative lack of inquiry into language and task variability. In this study, we explore certain acoustic, linguistic, and cognitive information encoded within the speech of several cohorts with PD. A total of 24 biomarkers were analyzed from American English, Italian, Castilian Spanish, Colombian Spanish, German, and Czech by conducting a statistical analysis to evaluate which biomarkers best differentiate people with PD from healthy participants. The study uses conceptual robustness as a criterion, under which a biomarker behaves the same independent of the language. Hence, we propose a set of speech-based biomarkers that can effectively help evaluate PD while being language-independent. In short, the best acoustic and cognitive biomarkers permitting discrimination between experimental groups across languages were fundamental frequency standard deviation, pause time, pause percentage, silence duration, and speech rhythm standard deviation. Linguistic biomarkers representing the length of the narratives and the number of nouns and auxiliaries also provided discrimination between groups. Altogether, in addition to being significant, these biomarkers satisfied the robustness requirements.
Affiliation(s)
- Anna Favaro
  - Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD, United States
  - *Correspondence: Anna Favaro
- Laureano Moro-Velázquez
  - Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD, United States
- Ankur Butala
  - Department of Neurology, The Johns Hopkins University, Baltimore, MD, United States
  - Department of Psychiatry and Behavioral Sciences, The Johns Hopkins University, Baltimore, MD, United States
- Chelsie Motley
  - Department of Neurology, The Johns Hopkins University, Baltimore, MD, United States
- Tianyu Cao
  - Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD, United States
- Robert David Stevens
  - Department of Anesthesiology and Critical Care, The Johns Hopkins University, Baltimore, MD, United States
- Jesús Villalba
  - Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD, United States
- Najim Dehak
  - Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD, United States
|
24
|
Jacinto-Scudeiro LA, Rothe-Neves R, Dos Santos VB, Machado GD, Burguêz D, Padovani MMP, Ayres A, Rech RS, González-Salazar C, Junior MCF, Saute JAM, Olchik MR. Dysarthria in hereditary spastic paraplegia type 4. Clinics (Sao Paulo) 2023; 78:100128. [PMID: 36473366 PMCID: PMC9723923 DOI: 10.1016/j.clinsp.2022.100128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Revised: 09/13/2022] [Accepted: 09/29/2022] [Indexed: 12/12/2022] [Open access]
Abstract
OBJECTIVE To describe the speech pattern of patients with hereditary spastic paraplegia type 4 (SPG4) and correlate it with their clinical data. METHODS A cross-sectional study was carried out in two university hospitals in Brazil. Two groups participated in the study: a case group (n = 28) with a confirmed genetic diagnosis of SPG4 and a control group (n = 17) matched for sex and age. The speech assessment of both groups included speech task recording, acoustic analysis, and auditory-perceptual analysis. In addition, disease severity was assessed with the Spastic Paraplegia Rating Scale (SPRS). RESULTS In the auditory-perceptual analysis, 53.5% (n = 15) of individuals with SPG4 were dysarthric, with mild to moderate changes in the subsystems of phonation and articulation. On acoustic analysis, SPG4 subjects performed worse on measurements related to breathing (maximum phonation time) and articulation (speech rate, articulation rate). The articulation variables (speech rate, articulation rate) were related to the age of onset of the first motor symptom. CONCLUSION Dysarthria in SPG4 is frequent and mild, and it did not evolve in conjunction with more advanced motor disease. These data suggest that diagnosed patients should be screened and referred for speech therapy evaluation and that the pathophysiological mechanisms of speech involvement may differ from the length-dependent degeneration of the corticospinal tract.
Affiliation(s)
- Lais Alves Jacinto-Scudeiro
  - Postgraduate Program in Medicine, Medical Sciences, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
- Rui Rothe-Neves
  - Phonetics Laboratory of the Faculty of Letters, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Gustavo Dariva Machado
  - Medical Genetics Service, Hospital de Clínicas de Porto Alegre, Porto Alegre, RS, Brazil
- Daniela Burguêz
  - Medical Genetics Service, Hospital de Clínicas de Porto Alegre, Porto Alegre, RS, Brazil
- Annelise Ayres
  - Universidade Federal de Ciências da Saúde de Porto Alegre, Porto Alegre, RS, Brazil
- Rafaela Soares Rech
  - Universidade Federal de Ciências da Saúde de Porto Alegre, Porto Alegre, RS, Brazil
- Carelis González-Salazar
  - Postgraduate Program in Medical Pathophysiology, Universidade Estadual de Campinas, São Paulo, SP, Brazil
- Jonas Alex Morales Saute
  - Postgraduate Program in Medicine, Medical Sciences, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil; Medical Genetics Service, Hospital de Clínicas de Porto Alegre, Porto Alegre, RS, Brazil; Internal Medicine Department, Faculdade de Medicina Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
- Maira Rozenfeld Olchik
  - Postgraduate Program in Medicine, Medical Sciences, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil; Medical Genetics Service, Hospital de Clínicas de Porto Alegre, Porto Alegre, RS, Brazil; Department of Surgery and Orthopedics, Faculdade de Odontologia, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil.
|
25
|
Wang Q, Fu Y, Shao B, Chang L, Ren K, Chen Z, Ling Y. Early detection of Parkinson’s disease from multiple signal speech: Based on Mandarin language dataset. Front Aging Neurosci 2022; 14:1036588. [DOI: 10.3389/fnagi.2022.1036588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2022] [Accepted: 10/20/2022] [Indexed: 11/11/2022] [Open access]
Abstract
Parkinson’s disease (PD) is a neurodegenerative disorder that negatively affects millions of people, and early detection is of vital importance. Recent research has shown that dysarthria level provides good indicators for computer-assisted diagnosis and remote monitoring of patients at the early stages. The goal of this study is to develop an automatic detection method based on a newly collected Chinese dataset. Unlike for English, no agreement has been reached on the main features indicating language disorders due to vocal organ dysfunction. Thus, one of our approaches is to classify speech phonation and articulation with a machine-learning-based feature selection model. Based on a relatively large sample, three feature selection algorithms (LASSO, mRMR, Relief-F) were tested to select the vocal features extracted from speech signals collected in a controlled setting, followed by four classifiers (Naïve Bayes, K-Nearest Neighbor, Logistic Regression and Stochastic Gradient Descent) to detect the disorder. The proposed approach shows an accuracy of 75.76%, sensitivity of 82.44%, specificity of 73.15% and precision of 76.57%, indicating the feasibility and promise of automatic and unobtrusive detection of PD in Chinese speakers. The comparison among the three selection algorithms reveals that the LASSO selector has the best performance regardless of the type of vocal features. The best detection accuracy is obtained by the SGD classifier, while the best sensitivity is obtained by the LR classifier. More interestingly, articulation features are more representative and indicative than phonation features across all the selection and classification algorithms. The most prominent articulation features are F1, F2, DDF1, DDF2, BBE and MFCC.
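For readers comparing the four figures reported above, the following minimal sketch shows how accuracy, sensitivity, specificity, and precision are conventionally derived from a binary confusion matrix. The counts are made-up illustrative values, not the study's data, and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: patients correctly / incorrectly classified.
    tn/fp: controls correctly / incorrectly classified.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true-positive rate (recall)
        "specificity": tn / (tn + fp),   # true-negative rate
        "precision": tp / (tp + fp),     # positive predictive value
    }


# Illustrative counts only.
metrics = classification_metrics(tp=40, fp=20, tn=30, fn=10)
```

Sensitivity and specificity each depend on only one class, while accuracy and precision also depend on the balance between patients and controls in the evaluated sample.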
|
26
|
Marczyk A, O'Brien B, Tremblay P, Woisard V, Ghio A. Correlates of vowel clarity in the spectrotemporal modulation domain: Application to speech impairment evaluation. J Acoust Soc Am 2022; 152:2675. [PMID: 36456260 DOI: 10.1121/10.0015024] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/14/2021] [Accepted: 10/13/2022] [Indexed: 06/17/2023]
Abstract
This article reports on vowel clarity metrics based on spectrotemporal modulations of speech signals. Motivated by previous findings on the relevance of modulation-based metrics for speech intelligibility assessment and pathology classification, the current study used factor analysis to identify regions within a bi-dimensional modulation space, the magnitude power spectrum, as in Elliott and Theunissen [(2009). PLoS Comput. Biol. 5(3), e1000302] by relating them to a set of conventional acoustic metrics of vowel space area and vowel distinctiveness. Two indices based on the energy ratio between high and low modulation rates across temporal and spectral dimensions of the modulation space emerged from the analyses. These indices served as input for measurements of central tendency and classification analyses that aimed to identify vowel-related speech impairments in French native speakers with head and neck cancer (HNC) and Parkinson dysarthria (PD). Following the analysis, vowel-related speech impairment was identified in HNC speakers, but not in PD. These results were consistent with findings based on subjective evaluations of speech intelligibility. The findings reported are consistent with previous studies indicating that impaired speech is associated with attenuation in energy in higher spectrotemporal modulation bands.
Affiliation(s)
- Anna Marczyk
  - Aix-Marseille Université, CNRS, LPL, UMR 7309, Aix-en-Provence, France
- Benjamin O'Brien
  - Aix-Marseille Université, CNRS, LPL, UMR 7309, Aix-en-Provence, France
- Pascale Tremblay
  - Universite Laval, Faculte de Medecine, Departement de Readaptation, Quebec City, Quebec G1V 0A6, Canada
- Alain Ghio
  - Aix-Marseille Université, CNRS, LPL, UMR 7309, Aix-en-Provence, France
|
27
|
Wee Shin Lim, Shu-I Chiu, Meng-Ciao Wu, Shu-Fen Tsai, Pu-He Wang, Kun-Pei Lin, Yung-Ming Chen, Pei-Ling Peng, Yung-Yaw Chen, Jyh-Shing Roger Jang, Chin-Hsien Lin. An integrated biometric voice and facial features for early detection of Parkinson’s disease. NPJ Parkinsons Dis 2022; 8:145. [PMID: 36309501 DOI: 10.1038/s41531-022-00414-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/18/2022] [Accepted: 10/12/2022] [Indexed: 01/24/2023] [Open access]
Abstract
Hypomimia and voice changes are soft signs preceding classical motor disability in patients with Parkinson's disease (PD). We aimed to investigate whether an analysis of acoustic features and facial expressions with machine-learning algorithms can assist early identification of patients with PD. We recruited 371 participants, including a training cohort (112 PD patients during "on" phase, 111 controls) and a validation cohort (74 PD patients during "off" phase, 74 controls). All participants underwent a smartphone-based, simultaneous recording of voice and facial expressions while reading an article. Nine different machine learning classifiers were applied. We observed that integrated facial and voice features could discriminate early-stage PD patients from controls with an area under the receiver operating characteristic curve (AUROC) of 0.85. In the validation cohort, the optimal diagnostic value (0.90) was maintained. We concluded that integrated biometric features of voice and facial expressions could assist the identification of early-stage PD patients from aged controls.
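As background for the AUROC values quoted above, here is a minimal sketch of the metric in its rank-based form: the probability that a randomly chosen patient receives a higher classifier score than a randomly chosen control, with ties counted as one half. The scores are made up for illustration, and the function name `auroc` is hypothetical; it is not taken from the study.

```python
def auroc(positive_scores, negative_scores):
    """AUROC via pairwise comparison (equivalent to the Mann-Whitney U statistic
    divided by the number of positive-negative pairs)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0      # patient ranked above control
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positive_scores) * len(negative_scores))


# Illustrative classifier scores only.
pd_scores = [0.9, 0.8, 0.4]
control_scores = [0.7, 0.3, 0.2]
auc = auroc(pd_scores, control_scores)
```

The quadratic pairwise loop is kept for clarity; for large cohorts a sort-based rank computation gives the same value more efficiently.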
|
28
|
Skrabal D, Rusz J, Novotny M, Sonka K, Ruzicka E, Dusek P, Tykalova T. Articulatory undershoot of vowels in isolated REM sleep behavior disorder and early Parkinson's disease. NPJ Parkinsons Dis 2022; 8:137. [PMID: 36266347 PMCID: PMC9584921 DOI: 10.1038/s41531-022-00407-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2022] [Accepted: 10/04/2022] [Indexed: 11/09/2022] [Open access]
Abstract
Imprecise vowels represent a common deficit associated with hypokinetic dysarthria resulting from a reduced articulatory range of motion in Parkinson's disease (PD). It is not yet known whether the vowel articulation impairment is already evident in the prodromal stages of synucleinopathy. We aimed to assess whether vowel articulation abnormalities are present in isolated rapid eye movement sleep behaviour disorder (iRBD) and early-stage PD. A total of 180 male participants, including 60 iRBD, 60 de-novo PD and 60 age-matched healthy controls, performed reading of a standardized passage. The first and second formant frequencies of the corner vowels /a/, /i/, and /u/, extracted from predefined words, were utilized to construct articulatory-acoustic measures of Vowel Space Area (VSA) and Vowel Articulation Index (VAI). Compared to controls, VSA was smaller in both iRBD (p = 0.01) and PD (p = 0.001), while VAI was lower only in PD (p = 0.002). The iRBD subgroup with abnormal olfactory function had smaller VSA compared to the iRBD subgroup with preserved olfactory function (p = 0.02). In PD patients, the extent of bradykinesia and rigidity correlated with VSA (r = -0.33, p = 0.01), while no correlation between axial gait symptoms or tremor and vowel articulation was detected. Vowel articulation impairment represents an early prodromal symptom in the disease process of synucleinopathy. Acoustic assessment of vowel articulation may provide a surrogate marker of synucleinopathy in scenarios where a single robust feature to monitor the dysarthria progression is needed.
Affiliation(s)
- Dominik Skrabal
  - Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
- Jan Rusz
  - Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
  - Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
  - Department of Neurology & ARTORG Center, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Michal Novotny
  - Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
- Karel Sonka
  - Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
- Evzen Ruzicka
  - Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
- Petr Dusek
  - Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
- Tereza Tykalova
  - Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
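The Skrabal et al. abstract above builds VSA and VAI from the first two formants of the corner vowels /a/, /i/, and /u/ but does not spell out the formulas. The sketch below uses the definitions commonly found in the dysarthria literature (triangular area in the F1-F2 plane, and VAI as a ratio of formants that rise versus fall with vowel centralization); whether the study used exactly these forms is an assumption. Function names and the formant values are illustrative, not taken from the paper.

```python
def triangular_vsa(f1_a, f2_a, f1_i, f2_i, f1_u, f2_u):
    """Area (Hz^2) of the /a/-/i/-/u/ triangle in the F1-F2 plane (shoelace formula)."""
    return 0.5 * abs(
        f1_i * (f2_a - f2_u) + f1_a * (f2_u - f2_i) + f1_u * (f2_i - f2_a)
    )


def vowel_articulation_index(f1_a, f2_a, f1_i, f2_i, f1_u, f2_u):
    """VAI: formants expected to drop with centralization over those expected to rise."""
    return (f2_i + f1_a) / (f1_i + f1_u + f2_u + f2_a)


# Illustrative formant values in Hz (made up): a well-separated vowel triangle
# and a centralized one, as might result from articulatory undershoot.
clear = dict(f1_a=700, f2_a=1200, f1_i=300, f2_i=2200, f1_u=320, f2_u=800)
centralized = dict(f1_a=600, f2_a=1300, f1_i=350, f2_i=2000, f1_u=370, f2_u=1000)

vsa_clear = triangular_vsa(**clear)
vsa_centralized = triangular_vsa(**centralized)
vai_clear = vowel_articulation_index(**clear)
vai_centralized = vowel_articulation_index(**centralized)
```

With these example values both measures shrink as the corner vowels move toward the center of the formant space, which is the direction of change the abstract reports for the patient groups.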
|
29
|
Wang M, Wen Y, Mo S, Yang L, Chen X, Luo M, Yu H, Xu F, Zou X. Distinctive acoustic changes in speech in Parkinson's disease. Comput Speech Lang 2022. [DOI: 10.1016/j.csl.2022.101384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
30
|
Terriza M, Navarro J, Retuerta I, Alfageme N, San-Segundo R, Kontaxakis G, Garcia-Martin E, Marijuan PC, Panetsos F. Use of Laughter for the Detection of Parkinson's Disease: Feasibility Study for Clinical Decision Support Systems, Based on Speech Recognition and Automatic Classification Techniques. Int J Environ Res Public Health 2022; 19:10884. [PMID: 36078600 PMCID: PMC9518165 DOI: 10.3390/ijerph191710884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2022] [Revised: 08/25/2022] [Accepted: 08/27/2022] [Indexed: 06/15/2023]
Abstract
Parkinson's disease (PD) is an incurable neurodegenerative disorder which affects over 10 million people worldwide. Early detection and correct evaluation of the disease is critical for appropriate medication and to slow the advance of the symptoms. In this scenario, it is critical to develop clinical decision support systems contributing to an early, efficient, and reliable diagnosis of this illness. In this paper we present a feasibility study for a clinical decision support system for the diagnosis of PD based on the acoustic characteristics of laughter. Our decision support system is based on laugh analysis with speech recognition methods and automatic classification techniques. We evaluated different cepstral coefficients to identify laugh characteristics of healthy and ill subjects combined with machine learning classification models. The decision support system reached 83% accuracy rate with an AUC value of 0.86 for PD-healthy laughs classification in a database of 20,000 samples randomly generated from a pool of 120 laughs from healthy and PD subjects. Laughter could be employed for the efficient and reliable detection of PD; such a detection system can be achieved using speech recognition and automatic classification techniques; a clinical decision support system can be built using the above techniques. Significance: PD clinical decision support systems for the early detection of the disease will help to improve the efficiency of available and upcoming therapeutic treatments which, in turn, would improve life conditions of the affected people and would decrease costs and efforts in public and private healthcare systems.
Affiliation(s)
- Miguel Terriza
  - Neuro-Computing & Neuro-Robotics Research Group, Complutense University of Madrid, 28040 Madrid, Spain
  - Innovation Group, Institute for Health Research San Carlos Clinical Hospital (IdISSC), 28040 Madrid, Spain
- Jorge Navarro
  - Department of Economic Structure, CASETEM Research Group, Faculty of Economy, University of Zaragoza, 50009 Zaragoza, Spain
- Irene Retuerta
  - Independent Researchers, Affiliated to Bioinformation and Systems Biology Group, Aragon Health Sciences Institute (IACS-IIS Aragon), 50009 Zaragoza, Spain
- Nuria Alfageme
  - Neuro-Computing & Neuro-Robotics Research Group, Complutense University of Madrid, 28040 Madrid, Spain
  - Innovation Group, Institute for Health Research San Carlos Clinical Hospital (IdISSC), 28040 Madrid, Spain
- Ruben San-Segundo
  - Speech Technology Group, Information Processing and Telecommunications Center, 28040 Madrid, Spain
- George Kontaxakis
  - Biomedical Image Technologies Group, Information Processing and Telecommunications Center, Universidad Politécnica de Madrid, 28040 Madrid, Spain
- Elena Garcia-Martin
  - Department of Ophthalmology, Miguel Servet University Hospital, 50009 Zaragoza, Spain
  - Miguel Servet Ophthalmology Research Group (GIMSO), Aragon Health Research Institute (IIS Aragón), University of Zaragoza, 50009 Zaragoza, Spain
- Pedro C. Marijuan
  - Independent Researchers, Affiliated to Bioinformation and Systems Biology Group, Aragon Health Sciences Institute (IACS-IIS Aragon), 50009 Zaragoza, Spain
- Fivos Panetsos
  - Neuro-Computing & Neuro-Robotics Research Group, Complutense University of Madrid, 28040 Madrid, Spain
  - Innovation Group, Institute for Health Research San Carlos Clinical Hospital (IdISSC), 28040 Madrid, Spain
|
31
|
Dai G, Wang M, Li Y, Guo Z, Jones JA, Li T, Chang Y, Wang EQ, Chen L, Liu P, Chen X, Liu H. Continuous theta burst stimulation over left supplementary motor area facilitates auditory-vocal integration in individuals with Parkinson’s disease. Front Aging Neurosci 2022; 14:948696. [PMID: 36051304 PMCID: PMC9426458 DOI: 10.3389/fnagi.2022.948696] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/20/2022] [Accepted: 07/27/2022] [Indexed: 11/26/2022] [Open access]
Abstract
Accumulating evidence suggests that impairment in auditory-vocal integration characterized by abnormally enhanced vocal compensations for auditory feedback perturbations contributes to hypokinetic dysarthria in Parkinson’s disease (PD). However, treatment of this abnormality remains a challenge. The present study examined whether abnormalities in auditory-motor integration for vocal pitch regulation in PD can be modulated by neuronavigated continuous theta burst stimulation (c-TBS) over the left supplementary motor area (SMA). After receiving active or sham c-TBS over left SMA, 16 individuals with PD vocalized vowel sounds while hearing their own voice unexpectedly pitch-shifted two semitones upward or downward. A group of pairwise-matched healthy participants was recruited as controls. Their vocal responses and event-related potentials (ERPs) were measured and compared across the conditions. The results showed that applying c-TBS over left SMA led to smaller vocal responses paralleled by smaller P1 and P2 responses and larger N1 responses in individuals with PD. Major neural generators of reduced P2 responses were located in the right inferior and medial frontal gyrus, pre- and post-central gyrus, and insula. Moreover, suppressed vocal compensations were predicted by reduced P2 amplitudes and enhanced N1 amplitudes. Notably, abnormally enhanced vocal and P2 responses in individuals with PD were normalized by c-TBS over left SMA when compared to healthy controls. Our results provide the first causal evidence that abnormalities in auditory-motor control of vocal pitch production in PD can be modulated by c-TBS over left SMA, suggesting that it may be a promising non-invasive treatment for speech motor disorders in PD.
Affiliation(s)
- Guangyan Dai
  - Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Meng Wang
  - Department of Radiology, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Yongxue Li
  - Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Zhiqiang Guo
  - School of Computer, Zhuhai College of Science and Technology, Zhuhai, China
- Jeffery A. Jones
  - Psychology Department and Laurier Centre for Cognitive Neuroscience, Wilfrid Laurier University, Waterloo, ON, Canada
- Tingni Li
  - Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Yichen Chang
  - Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Emily Q. Wang
  - Department of Communication Disorders and Sciences, RUSH University Medical Center, Chicago, IL, United States
- Ling Chen
  - Department of Neurology, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Peng Liu
  - Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
  - *Correspondence: Peng Liu
- Xi Chen
  - Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
  - *Correspondence: Xi Chen
- Hanjun Liu
  - Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
  - Guangdong Provincial Key Laboratory of Brain Function and Disease, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
  - *Correspondence: Hanjun Liu
|
32
|
Martínez-Cifuentes R, Soto-Barba J. Desempeño fonético-acústico de vocales en hablantes del español chileno con enfermedad de Parkinson en estadios iniciales [Phonetic-acoustic performance of vowels in speakers of Chilean Spanish with early-stage Parkinson's disease]. Rev investig logop 2022. [DOI: 10.5209/rlog.79132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
Abstract
The articulation of consonant and vowel speech sounds is affected in Parkinson's disease (PD). In the case of vowels, this alteration manifests acoustically in the formant structure and in the vowel space area. Because this topic has not been explored in Chile, the study aimed to compare the phonetic-acoustic performance of vowels between speakers of Chilean Spanish with early-stage PD and speakers without the disease. A quantitative, quasi-experimental, correlational study was conducted. Fifteen speakers with PD (M = 69.6 years, SD = 7.46) and 15 without PD (M = 70.07 years, SD = 7.75) read 30 phrases containing the five vowels of Chilean Spanish. The center frequencies (F1 and F2) and bandwidths (B1 and B2) of the vowel formants were analyzed, along with five vowel space area indices. Differences were found in B2 of /i/ and /u/ between people with and without PD; in F1 of /e/ and /u/, F2 of /u/, B1 of /e/, and B2 of /o/ between men with and without PD; and in B2 of /i/ between women with and without PD (p < .05). The study thus reports the acoustic performance of vowels in speakers of Chilean Spanish with Parkinson's disease.
|
33
|
Kouba T, Illner V, Rusz J. Study protocol for using a smartphone application to investigate speech biomarkers of Parkinson's disease and other synucleinopathies: SMARTSPEECH. BMJ Open 2022; 12:e059871. [PMID: 35772829 PMCID: PMC9247696 DOI: 10.1136/bmjopen-2021-059871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022] Open
Abstract
INTRODUCTION Early identification of Parkinson's disease (PD) in its prodromal stage has fundamental implications for the future development of neuroprotective therapies. However, no sufficiently accurate biomarkers of prodromal PD are currently available to facilitate early identification. The vocal assessment of patients with isolated rapid eye movement sleep behaviour disorder (iRBD) and PD appears to have intriguing potential as a diagnostic and progressive biomarker of PD and related synucleinopathies. METHODS AND ANALYSIS Speech patterns in the spontaneous speech of iRBD, early PD and control participants' voice calls will be collected from data acquired via a developed smartphone application over a period of 2 years. A significant increase in several aspects of PD-related speech disorders is expected, and is anticipated to reflect the underlying neurodegeneration processes. ETHICS AND DISSEMINATION The study has been approved by the Ethics Committee of the General University Hospital in Prague, Czech Republic and all the participants will provide written, informed consent prior to their inclusion in the research. The application satisfies the General Data Protection Regulation law requirements of the European Union. The study findings will be published in peer-reviewed journals and presented at international scientific conferences.
Affiliation(s)
- Tomáš Kouba
  - Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
- Vojtěch Illner
  - Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
- Jan Rusz
  - Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
|
34
|
Pah ND, Motin MA, Kumar DK. Phonemes based detection of parkinson's disease for telehealth applications. Sci Rep 2022; 12:9687. [PMID: 35690657 DOI: 10.1038/s41598-022-13865-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/24/2021] [Accepted: 05/30/2022] [Indexed: 12/22/2022] Open
Abstract
Dysarthria is an early symptom of Parkinson’s disease (PD) which has been proposed for detection and monitoring of the disease with potential for telehealth. However, because of inherent differences between the voices of different people, computerized analyses have not demonstrated high performance that is consistent across datasets. The aim of this study was to improve the performance in detecting PD voices and to test this with different datasets. This study investigated the effectiveness of three groups of phoneme parameters, i.e. voice intensity variation, perturbation of glottal vibration, and apparent vocal tract length (VTL), for differentiating people with PD from healthy subjects using two public databases. The parameters were extracted from five sustained phonemes, /a/, /e/, /i/, /o/, and /u/, recorded from 50 PD patients and 50 healthy subjects of the PC-GITA dataset. The features were statistically investigated, and then classified using a Support Vector Machine (SVM). This was repeated on the Viswanathan dataset with smartphone-based recordings of /a/, /o/, and /m/ from 24 people with PD and 22 age-matched healthy people. VTL parameters gave the largest difference between voices of people with PD and healthy subjects; classification accuracy with the five vowels of the PC-GITA dataset was 84.3%, while the accuracy for other features was between 54% and 69.2%. The accuracy for Viswanathan’s dataset was 96.0%. This study has demonstrated that VTL obtained from recordings of phonemes using a smartphone can accurately identify people with PD. The analysis was fully computerized and automated, and this has potential for telehealth diagnosis of PD.
|
35
|
Hampsey E, Meszaros M, Skirrow C, Strawbridge R, Taylor RH, Chok L, Aarsland D, Al-Chalabi A, Chaudhuri R, Weston J, Fristed E, Podlewska A, Awogbemila O, Young AH. Protocol for Rhapsody: a longitudinal observational study examining the feasibility of speech phenotyping for remote assessment of neurodegenerative and psychiatric disorders. BMJ Open 2022; 12:e061193. [PMID: 35667724 PMCID: PMC9171270 DOI: 10.1136/bmjopen-2022-061193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022] Open
Abstract
INTRODUCTION Neurodegenerative and psychiatric disorders (NPDs) confer a huge health burden, which is set to increase as populations age. New, remotely delivered diagnostic assessments that can detect early stage NPDs by profiling speech could enable earlier intervention and fewer missed diagnoses. The feasibility of collecting speech data remotely in those with NPDs should be established. METHODS AND ANALYSIS The present study will assess the feasibility of obtaining speech data, collected remotely using a smartphone app, from individuals across three NPD cohorts: neurodegenerative cognitive diseases (n=50), other neurodegenerative diseases (n=50) and affective disorders (n=50), in addition to matched controls (n=75). Participants will complete audio-recorded speech tasks and both general and cohort-specific symptom scales. The battery of speech tasks will serve several purposes, such as measuring various elements of executive control (eg, attention and short-term memory), as well as measures of voice quality. Participants will then remotely self-administer speech tasks and follow-up symptom scales over a 4-week period. The primary objective is to assess the feasibility of remote collection of continuous narrative speech across a wide range of NPDs using self-administered speech tasks. Additionally, the study evaluates if acoustic and linguistic patterns can predict diagnostic group, as measured by the sensitivity, specificity, Cohen's kappa and area under the receiver operating characteristic curve of the binary classifiers distinguishing each diagnostic group from each other. Acoustic features analysed include mel-frequency cepstrum coefficients, formant frequencies, intensity and loudness, whereas text-based features such as number of words, noun and pronoun rate and idea density will also be used. ETHICS AND DISSEMINATION The study received ethical approval from the Health Research Authority and Health and Care Research Wales (REC reference: 21/PR/0070). 
Results will be disseminated through open access publication in academic journals, relevant conferences and other publicly accessible channels. Results will be made available to participants on request. TRIAL REGISTRATION NUMBER NCT04939818.
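Three of the classifier metrics named in this protocol (sensitivity, specificity and Cohen's kappa) follow directly from a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical and do not come from the study.

```python
def binary_classifier_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute sensitivity, specificity and Cohen's kappa from confusion-matrix counts."""
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    observed = (tp + tn) / n  # observed agreement between prediction and label
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return {"sensitivity": sensitivity, "specificity": specificity, "kappa": kappa}


if __name__ == "__main__":
    # Hypothetical counts for one diagnostic group versus another.
    print(binary_classifier_metrics(tp=40, fn=10, fp=5, tn=45))
```

The area under the ROC curve needs continuous classifier scores rather than counts, so it is left out of this sketch.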
Affiliation(s)
- Elliot Hampsey
  - Institute of Psychiatry, Psychology, & Neuroscience, King's College London, London, UK
- Rebecca Strawbridge
  - Institute of Psychiatry, Psychology, & Neuroscience, King's College London, London, UK
- Rosie H Taylor
  - Institute of Psychiatry, Psychology, & Neuroscience, King's College London, London, UK
- Dag Aarsland
  - Institute of Psychiatry, Psychology, & Neuroscience, King's College London, London, UK
- Ammar Al-Chalabi
  - Institute of Psychiatry, Psychology, & Neuroscience, King's College London, London, UK
- Ray Chaudhuri
  - Institute of Psychiatry, Psychology, & Neuroscience, King's College London, London, UK
  - Parkinson's Foundation Centre of Excellence, King's College Hospital NHS Foundation Trust, London, UK
- Aleksandra Podlewska
  - Institute of Psychiatry, Psychology, & Neuroscience, King's College London, London, UK
  - Parkinson's Foundation Centre of Excellence, King's College Hospital NHS Foundation Trust, London, UK
- Olabisi Awogbemila
  - Parkinson's Foundation Centre of Excellence, King's College Hospital NHS Foundation Trust, London, UK
- Allan H Young
  - Institute of Psychiatry, Psychology, & Neuroscience, King's College London, London, UK
|
36
|
Fan P. Random Forest Algorithm Based on Speech for Early Identification of Parkinson's Disease. Comput Intell Neurosci 2022; 2022:3287068. [PMID: 35586090 DOI: 10.1155/2022/3287068] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/09/2022] [Accepted: 02/04/2022] [Indexed: 11/17/2022]
Abstract
To investigate the effectiveness of identifying patients with Parkinson's disease (PD) from speech signals, various acoustic parameters including prosodic and segmental features are extracted from speech and then the random forest classification (RF) algorithm based on these acoustic parameters is applied to diagnose early-stage PD patients. To validate the proposed method of RF algorithm in early-stage PD identification, this study compares the accuracy rate of RF with that of neurologists' judgments based on auditory test outcomes, and the results clearly show the superiority of the proposed method over its rival. Random forest algorithm based on speech can improve the accuracy of patients' identification, which provides an efficient auxiliary method in the early diagnosis of PD patients.
|
37
|
Medina CA, Vargas E, Munger SJ, Miller JE. Vocal changes in a zebra finch model of Parkinson's disease characterized by alpha-synuclein overexpression in the song-dedicated anterior forebrain pathway. PLoS One 2022; 17:e0265604. [PMID: 35507553 PMCID: PMC9067653 DOI: 10.1371/journal.pone.0265604] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/01/2021] [Accepted: 03/06/2022] [Indexed: 11/18/2022] Open
Abstract
Deterioration in the quality of a person's voice and speech is an early marker of Parkinson's disease (PD). In humans, the neural circuit that supports vocal motor control consists of a cortico-basal ganglia-thalamo-cortical loop. The basal ganglia regions in this loop, the striatum and globus pallidus, play a role in modulating acoustic features of vocal behavior such as loudness, pitch, and articulatory rate. In PD, this area is implicated in pathogenesis. In animal models of PD, the accumulation of toxic aggregates containing the neuronal protein alpha-synuclein (αsyn) in the midbrain and striatum results in limb and vocal motor impairments. It has been challenging to study vocal impairments given the lack of well-defined cortico-basal ganglia circuitry for vocalization in rodent models. Furthermore, whether deterioration of voice quality early in PD is a direct result of αsyn-induced neuropathology is not yet known. Here, we take advantage of the well-characterized vocal circuits of the adult male zebra finch songbird to experimentally target a song-dedicated pathway, the anterior forebrain pathway, using an adeno-associated virus expressing the human wild-type αsyn gene, SNCA. We found that overexpression of αsyn in this pathway coincides with higher levels of insoluble, monomeric αsyn compared to control finches. Impairments in song production were also detected, along with shorter and poorer quality syllables, which are the most basic unit of song. These vocal changes are similar to the vocal abnormalities observed in individuals with PD.
Affiliation(s)
- Cesar A. Medina
  - Graduate Interdisciplinary Program in Neuroscience, University of Arizona, Tucson, Arizona, United States of America
  - Department of Neuroscience, University of Arizona, Tucson, Arizona, United States of America
- Eddie Vargas
  - Department of Neuroscience, University of Arizona, Tucson, Arizona, United States of America
- Stephanie J. Munger
  - Department of Neuroscience, University of Arizona, Tucson, Arizona, United States of America
- Julie E. Miller
  - Graduate Interdisciplinary Program in Neuroscience, University of Arizona, Tucson, Arizona, United States of America
  - Department of Neuroscience, University of Arizona, Tucson, Arizona, United States of America
  - Department of Speech, Language, and Hearing Sciences, University of Arizona, Tucson, Arizona, United States of America
  - Department of Neurology, University of Arizona, Tucson, Arizona, United States of America
  - BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
|
38
|
Illner V, Tykalová T, Novotný M, Klempíř J, Dušek P, Rusz J. Toward Automated Articulation Rate Analysis via Connected Speech in Dysarthrias. J Speech Lang Hear Res 2022; 65:1386-1401. [PMID: 35302874 DOI: 10.1044/2021_jslhr-21-00549] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
PURPOSE This study aimed to evaluate the reliability of different approaches for estimating the articulation rates in connected speech of Parkinsonian patients with different stages of neurodegeneration compared to healthy controls. METHOD Monologues and reading passages were obtained from 25 patients with idiopathic rapid eye movement sleep behavior disorder (iRBD), 25 de novo patients with Parkinson's disease (PD), 20 patients with multiple system atrophy (MSA), and 20 healthy controls. The recordings were subsequently evaluated using eight syllable localization algorithms, and their performances were compared to a manual transcript used as a reference. RESULTS The Google & Pyphen method, based on automatic speech recognition followed by hyphenation, outperformed the other approaches (automated vs. hand transcription: r > .87 for monologues and r > .91 for reading passages, p < .001) in precise feature estimates and resilience to dysarthric speech. The Praat script algorithm achieved sufficient robustness (automated vs. hand transcription: r > .65 for monologues and r > .78 for reading passages, p < .001). Compared to the control group, we detected a slow rate in patients with MSA and a tendency toward a slower rate in patients with iRBD, whereas the articulation rate was unchanged in patients with early untreated PD. CONCLUSIONS The state-of-the-art speech recognition tool provided the most precise articulation rate estimates. If a speech recognizer is not accessible, the freely available Praat script based on simple intensity thresholding might still provide robust estimates even in severe dysarthria. Automated articulation rate assessment may serve as a natural, inexpensive biomarker for monitoring disease severity and for the differential diagnosis of Parkinsonism.
Affiliation(s)
- Vojtěch Illner
  - Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Czech Republic
- Tereza Tykalová
  - Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Czech Republic
- Michal Novotný
  - Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Czech Republic
- Jiří Klempíř
  - Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
- Petr Dušek
  - Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
- Jan Rusz
  - Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Czech Republic
  - Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic
|
39
|
Narayana S, Franklin C, Peterson E, Hunter EJ, Robin DA, Halpern A, Spielman J, Fox PT, Ramig LO. Immediate and long-term effects of speech treatment targets and intensive dosage on Parkinson's disease dysphonia and the speech motor network: Randomized controlled trial. Hum Brain Mapp 2022; 43:2328-2347. [PMID: 35141971 PMCID: PMC8996348 DOI: 10.1002/hbm.25790] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/24/2021] [Revised: 12/16/2021] [Accepted: 01/07/2022] [Indexed: 11/07/2022] Open
Abstract
This study compared acoustic and neural changes accompanying two treatments matched for intensive dosage but having two different treatment targets (voice or articulation) to dissociate the effects of treatment target and intensive dosage in speech therapies. Nineteen participants with Parkinsonian dysphonia (11 F) were randomized to three groups: intensive treatment targeting voice (voice group, n = 6), targeting articulation (articulation group, n = 7), or an untreated group (no treatment, n = 6). The severity of dysphonia was assessed by the smoothed cepstral peak prominence (CPPS) and neuronal changes were evaluated by cerebral blood flow (CBF) recorded at baseline, posttreatment, and 7-month follow-up. Only the voice treatment resulted in significant posttreatment improvement in CPPS, which was maintained at 7 months. Following voice treatment, increased activity in left premotor and bilateral auditory cortices was observed at posttreatment, and in the left motor and auditory cortices at 7-month follow-up. Articulation treatment resulted in increased activity in bilateral premotor and left insular cortices that were sustained at a 7-month follow-up. Activation in the auditory cortices and a significant correlation between the CPPS and CBF in motor and auditory cortices was observed only in the voice group. The intensive dosage resulted in long-lasting behavioral and neural effects as the no-treatment group showed a progressive decrease in activity in areas of the speech motor network out to a 7-month follow-up. These results indicate that dysphonia and the speech motor network can be differentially modified by treatment targets, while intensive dosage contributes to long-lasting effects of speech treatments.
Affiliation(s)
- Shalini Narayana
  - Department of Pediatrics, Division of Neurology, University of Tennessee Health Science Center, Memphis, Tennessee, USA
  - Department of Anatomy and Neurobiology, University of Tennessee Health Science Center, Memphis, Tennessee, USA
  - Neuroscience Institute, Le Bonheur Children's Hospital, Memphis, Tennessee, USA
- Crystal Franklin
  - Research Imaging Institute, University of Texas Health Science Center, San Antonio, Texas, USA
- Eric J Hunter
  - Department of Communicative Sciences and Disorders, Michigan State University, Lansing, Michigan, USA
- Donald A Robin
  - Department of Communication Sciences and Disorders, University of New Hampshire, Durham, New Hampshire, USA
- Angela Halpern
  - LSVT Global Inc, Tucson, Arizona, USA
  - National Center for Voice and Speech and Department of Speech-Language and Hearing Sciences, University of Colorado-Boulder, Boulder, Colorado, USA
- Jennifer Spielman
  - National Center for Voice and Speech and Department of Speech-Language and Hearing Sciences, University of Colorado-Boulder, Boulder, Colorado, USA
  - Front Range Voice Care, Denver, Colorado, USA
- Peter T Fox
  - Research Imaging Institute, University of Texas Health Science Center, San Antonio, Texas, USA
  - Audie L. Murphy South Texas Veterans Administration Medical Center, San Antonio, Texas, USA
- Lorraine O Ramig
  - LSVT Global Inc, Tucson, Arizona, USA
  - National Center for Voice and Speech and Department of Speech-Language and Hearing Sciences, University of Colorado-Boulder, Boulder, Colorado, USA
  - Columbia University, New York, New York, USA
|
40
|
Šimek M, Rusz J. Validation of cepstral peak prominence in assessing early voice changes of Parkinson's disease: Effect of speaking task and ambient noise. J Acoust Soc Am 2021; 150:4522. [PMID: 34972306 DOI: 10.1121/10.0009063] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/15/2021] [Accepted: 12/03/2021] [Indexed: 06/14/2023]
Abstract
Although the cepstral peak prominence (CPP) and its variant, the smoothed cepstral peak prominence (CPPS), are considered to be robust acoustic measures for the evaluation of dysphonia, whether they are sensitive enough to capture early voice changes in Parkinson's disease (PD) has not yet been explored. This study aimed to investigate voice changes via the CPP measures in idiopathic rapid eye movement sleep behavior disorder (iRBD), a special case of prodromal neurodegeneration, and in recently diagnosed and advanced-stage Parkinson's disease (AS-PD) patients using different speaking tasks across noise-free and noisy environments. The sustained vowel phonation, reading passages, and monologues of 60 early-stage untreated PD, 30 AS-PD, 60 iRBD, and 60 healthy control (HC) participants were evaluated. Significant differences were found between the PD groups and controls in sustained phonation via the CPP (p < 0.05) and CPPS (p < 0.01) and in the monologue via the CPP (p < 0.01), although neither the CPP nor the CPPS measure was sufficiently sensitive to capture possible prodromal dysphonia in iRBD. The quality of the CPP and CPPS measures was influenced substantially by the addition of ambient noise. It was anticipated that the CPP measures might serve as a promising digital biomarker in assessing dysphonia from the early stages of PD.
Affiliation(s)
- Michal Šimek
  - Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
- Jan Rusz
  - Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
|
41
|
Hireš M, Gazda M, Drotár P, Pah ND, Motin MA, Kumar DK. Convolutional neural network ensemble for Parkinson's disease detection from voice recordings. Comput Biol Med 2021; 141:105021. [PMID: 34799077 DOI: 10.1016/j.compbiomed.2021.105021] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 08/04/2021] [Revised: 11/02/2021] [Accepted: 11/03/2021] [Indexed: 11/03/2022]
Abstract
The computerized detection of Parkinson's disease (PD) will facilitate population screening and frequent monitoring and provide a more objective measure of symptoms, benefiting both patients and healthcare providers. Dysarthria is an early symptom of the disease and examining it for computerized diagnosis and monitoring has been proposed. Deep learning-based approaches have advantages for such applications because they do not require manual feature extraction, and while this approach has achieved excellent results in speech recognition, its utilization in the detection of pathological voices is limited. In this work, we present an ensemble of convolutional neural networks (CNNs) for the detection of PD from the voice recordings of 50 healthy people and 50 people with PD obtained from PC-GITA, a publicly available database. We propose a multiple-fine-tuning method to train the base CNN. This approach reduces the semantic gap between the source task that has been used for network pretraining and the target task by expanding the training process to include training on another dataset. Training and testing were performed for each vowel separately, and a 10-fold validation was performed to test the models. The performance was measured by using accuracy, sensitivity, specificity and area under the ROC curve (AUC). The results show that this approach was able to distinguish between the voices of people with PD and those of healthy people for all vowels. While there were small differences between the different vowels, the best performance was obtained when /a/ was considered; we achieved 99% accuracy, 86.2% sensitivity, 93.3% specificity and 89.6% AUC. This shows that the method has potential for use in clinical practice for the screening, diagnosis and monitoring of PD, with the advantage that vowel-based voice recordings can be performed online without requiring additional hardware.
Affiliation(s)
- Máté Hireš
  - Intelligent Information Systems Lab, Technical University of Košice, Letná 9, 42001, Košice, Slovakia
- Matej Gazda
  - Intelligent Information Systems Lab, Technical University of Košice, Letná 9, 42001, Košice, Slovakia
- Peter Drotár
  - Intelligent Information Systems Lab, Technical University of Košice, Letná 9, 42001, Košice, Slovakia
|
42
|
Knowles T, Adams SG, Jog M. Speech Rate Mediated Vowel and Stop Voicing Distinctiveness in Parkinson's Disease. J Speech Lang Hear Res 2021; 64:4096-4123. [PMID: 34582276 DOI: 10.1044/2021_jslhr-21-00160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Purpose The purpose of this study was to quantify changes in acoustic distinctiveness in two groups of talkers with Parkinson's disease as they modify across a wide range of speaking rates. Method People with Parkinson's disease with and without deep brain stimulation and older healthy controls read 24 carrier phrases at different speech rates. Target nonsense words in the carrier phrases were designed to elicit stop consonants and corner vowels. Participants spoke at seven self-selected speech rates from very slow to very fast, elicited via magnitude production. Speech rate was measured in absolute words per minute and as a proportion of each talker's habitual rate. Measures of segmental distinctiveness included a temporal consonant measure, namely, voice onset time, and a spectral vowel measure, namely, vowel articulation index. Results All talkers successfully modified their rate of speech from slow to fast. Talkers with Parkinson's disease and deep brain stimulation demonstrated greater baseline speech impairment and produced smaller proportional changes at the fast end of the continuum. Increasingly slower speaking rates were associated with increased temporal contrasts (voice onset time) but not spectral contrasts (vowel articulation). Faster speech was associated with decreased contrasts in both domains. Talkers with deep brain stimulation demonstrated more aberrant productions across all speaking rates. Conclusions Findings suggest that temporal and spectral segmental distinctiveness are asymmetrically affected by speaking rate modifications in Parkinson's disease. Talkers with deep brain stimulation warrant further investigation with regard to speech changes they make as they adjust their speaking rate.
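The vowel articulation index (VAI) named in this abstract is commonly computed from the first two formants of the corner vowels /i/, /u/ and /a/. The sketch below uses that widely cited definition; whether the study computed it in exactly this way is an assumption, and the function name and formant values are hypothetical illustrations.

```python
def vowel_articulation_index(f1_i: float, f2_i: float,
                             f1_u: float, f2_u: float,
                             f1_a: float, f2_a: float) -> float:
    """Return VAI = (F2/i/ + F1/a/) / (F1/i/ + F1/u/ + F2/u/ + F2/a/), formants in Hz."""
    # The numerator holds formants that rise with more extreme articulation;
    # the denominator holds formants that fall, so centralized vowels lower the index.
    return (f2_i + f1_a) / (f1_i + f1_u + f2_u + f2_a)


if __name__ == "__main__":
    # Hypothetical formant values (Hz) for a well-separated vowel triangle.
    distinct = vowel_articulation_index(300, 2300, 350, 900, 750, 1300)
    # Hypothetical values for a centralized (reduced) vowel triangle.
    centralized = vowel_articulation_index(350, 2000, 400, 1100, 650, 1350)
    print(distinct, centralized)
```

Under this definition, a lower value indicates reduced spectral contrast between the corner vowels.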
Affiliation(s)
- Thea Knowles
  - Department of Communicative Disorders and Sciences, University at Buffalo, NY
- Scott G Adams
  - School of Communication Sciences and Disorders, Western University, London, Ontario, Canada
  - Health & Rehabilitation Sciences, Western University, London, Ontario, Canada
  - Department of Clinical Neurological Sciences, University Hospital, London, Ontario, Canada
- Mandar Jog
  - Department of Clinical Neurological Sciences, University Hospital, London, Ontario, Canada
|
43
|
Vásquez-Correa JC, Rios-Urrego CD, Arias-Vergara T, Schuster M, Rusz J, Nöth E, Orozco-Arroyave JR. Transfer learning helps to improve the accuracy to classify patients with different speech disorders in different languages. Pattern Recognit Lett 2021. [DOI: 10.1016/j.patrec.2021.04.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/30/2022]
|
44
|
Roldan-Vasco S, Orozco-Duque A, Suarez-Escudero JC, Orozco-Arroyave JR. Machine learning based analysis of speech dimensions in functional oropharyngeal dysphagia. Comput Methods Programs Biomed 2021; 208:106248. [PMID: 34260973 DOI: 10.1016/j.cmpb.2021.106248] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 04/23/2021] [Accepted: 06/15/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND AND OBJECTIVE The normal swallowing process requires a complex coordination of anatomical structures driven by sensory and cranial nerves. Alterations in such coordination cause swallowing malfunctions, namely dysphagia. The dysphagia screening methods are quite subjective and experience dependent. Bearing in mind that the swallowing process and speech production share some anatomical structures and mechanisms of neurological control, this work aims to evaluate the suitability of automatic speech processing and machine learning techniques for screening of functional dysphagia. METHODS Speech recordings were collected from 46 patients with functional oropharyngeal dysphagia produced by neurological causes, and 46 healthy controls. The dimensions of speech including phonation, articulation, and prosody were considered through different speech tasks. Specific features per dimension were extracted and analyzed using statistical tests. Machine learning models were applied per dimension via nested cross-validation. Hyperparameters were selected using the AUC - ROC as optimization criterion. RESULTS The Random Forest in the articulation related speech tasks retrieved the highest performance measures (AUC=0.86±0.10, sensitivity=0.91±0.12) for individual analysis of dimensions. In addition, the combination of speech dimensions with a voting ensemble improved the results, which suggests a contribution of information from different feature sets extracted from speech signals in dysphagia conditions. CONCLUSIONS The proposed approach based on speech related models is suitable for the automatic discrimination between dysphagic and healthy individuals. These findings seem to have potential use in the screening of functional oropharyngeal dysphagia in a non-invasive and inexpensive way.
Affiliation(s)
- Sebastian Roldan-Vasco
  - Faculty of Engineering, Instituto Tecnológico Metropolitano, Medellín, Colombia; Faculty of Engineering, Universidad de Antioquia, Medellín, Colombia
- Andres Orozco-Duque
  - Faculty of Pure and Applied Sciences, Instituto Tecnológico Metropolitano, Medellín, Colombia
- Juan Camilo Suarez-Escudero
  - School of Health Sciences, Faculty of Medicine, Universidad Pontificia Bolivariana, Medellín, Colombia; Faculty of Pure and Applied Sciences, Instituto Tecnológico Metropolitano, Medellín, Colombia
- Juan Rafael Orozco-Arroyave
  - Faculty of Engineering, Universidad de Antioquia, Medellín, Colombia; Pattern Recognition Lab, Friedrich-Alexander-Universität, Erlangen-Nürnberg, Germany
45
Banks RE, Beal DS, Hunter EJ. Sports Related Concussion Impacts Speech Rate and Muscle Physiology. Brain Inj 2021; 35:1275-1283. [PMID: 34499576 PMCID: PMC8610105 DOI: 10.1080/02699052.2021.1972150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2021] [Accepted: 08/09/2021] [Indexed: 10/20/2022]
Abstract
OBJECTIVE Establish objective and subjective speech rate and muscle function differences between athletes with and without sports related concussion (SRC) histories and provide potential motor speech evaluation in SRC. METHODS Over 1,110 speech samples were obtained from 30 athletes aged 19-22 years who had sustained an SRC within the past 2 years and 30 pair-wise matched control athletes with no history of SRC. Speech rate was measured via average time per syllable, average unvoiced time per syllable, and expert perceptual judgment. Speech muscle function was measured via surface electromyography over the orbicularis oris, masseter, and segmental triangle. Group differences were assessed using MANOVA, bootstrapping, and predictive ROC analyses. RESULTS Athletes with SRC had slower speech rates during diadochokinetic (DDK) tasks than controls, as evidenced by longer average time per syllable, longer average unvoiced time per syllable, and expert judgment of slowed rate. Rate measures were predictive of concussion history. Further, athletes with SRC required more speech muscle activation than controls to complete DDK tasks. CONCLUSION There is clear evidence of slowed speech and increased muscle activation during the completion of DDK tasks in athletes with SRC histories relative to controls. Future work should examine speech rate in acute concussion.
Affiliation(s)
- Russell E Banks
  - Michigan State University, Department of Communicative Sciences and Disorders, East Lansing, MI, USA
  - Massachusetts General Hospital, Institute of Health Professions, Boston, MA, USA
- Deryk S Beal
  - Massachusetts General Hospital, Institute of Health Professions, Boston, MA, USA
  - Department of Speech Language Pathology, Rehabilitation Sciences Institute, Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Eric J Hunter
  - Michigan State University, Department of Communicative Sciences and Disorders, East Lansing, MI, USA
46
García AM, Arias-Vergara T, Vasquez-Correa JC, Nöth E, Schuster M, Welch AE, Bocanegra Y, Baena A, Orozco-Arroyave JR. Cognitive Determinants of Dysarthria in Parkinson's Disease: An Automated Machine Learning Approach. Mov Disord 2021; 36:2862-2873. [PMID: 34390508 DOI: 10.1002/mds.28751] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 02/23/2021] [Revised: 07/20/2021] [Accepted: 07/23/2021] [Indexed: 11/06/2022] Open
Abstract
BACKGROUND Dysarthric symptoms in Parkinson's disease (PD) vary greatly across cohorts. Abundant research suggests that such heterogeneity could reflect subject-level and task-related cognitive factors. However, the interplay of these variables during motor speech remains underexplored, let alone by administering validated materials to carefully matched samples with varying cognitive profiles and combining automated tools with machine learning methods. OBJECTIVE We aimed to identify which speech dimensions best identify patients with PD in cognitively heterogeneous, cognitively preserved, and cognitively impaired groups through tasks with low (reading) and high (retelling) processing demands. METHODS We used support vector machines to analyze prosodic, articulatory, and phonemic identifiability features. Patient groups were compared with healthy control subjects and against each other in both tasks, using each measure separately and in combination. RESULTS Relative to control subjects, patients in cognitively heterogeneous and cognitively preserved groups were best discriminated by combined dysarthric signs during reading (accuracy = 84% and 80.2%). Conversely, patients with cognitive impairment were maximally discriminated from control subjects when considering phonemic identifiability during retelling (accuracy = 86.9%). This same pattern maximally distinguished between cognitively spared and impaired patients (accuracy = 72.1%). Also, cognitive (executive) symptom severity was predicted by prosody in cognitively preserved patients and by phonemic identifiability in cognitively heterogeneous and impaired groups. No measure predicted overall motor dysfunction in any group. CONCLUSIONS Predominant dysarthric symptoms appear to be best captured through undemanding tasks in cognitively heterogeneous and preserved cohorts and through cognitively loaded tasks in patients with cognitive impairment. Further applications of this framework could enhance dysarthria assessments in PD. © 2021 International Parkinson and Movement Disorder Society.
Affiliation(s)
- Adolfo M García
  - Cognitive Neuroscience Center, Universidad de San Andrés, Buenos Aires, Argentina; National Scientific and Technical Research Council (CONICET), Buenos Aires, Argentina; Departamento de Lingüística y Literatura, Facultad de Humanidades, Universidad de Santiago de Chile, Santiago, Chile; Global Brain Health Institute, University of California, San Francisco, California, USA
- Tomás Arias-Vergara
  - GITA Lab, Faculty of Engineering, Universidad de Antioquia UdeA, Medellín, Colombia; Pattern Recognition Lab, Friedrich-Alexander University, Erlangen, Nürnberg, Germany; Department of Otorhinolaryngology, Head and Neck Surgery, Ludwig-Maximilians University, Munich, Germany
- Juan C Vasquez-Correa
  - GITA Lab, Faculty of Engineering, Universidad de Antioquia UdeA, Medellín, Colombia; Pattern Recognition Lab, Friedrich-Alexander University, Erlangen, Nürnberg, Germany
- Elmar Nöth
  - Friedrich-Alexander University Erlangen-Nuremberg
- Maria Schuster
  - Department of Otorhinolaryngology, Head and Neck Surgery, Ludwig-Maximilians University, Munich, Germany
- Ariane E Welch
  - Memory and Aging Center, University of California, San Francisco, California, USA
- Yamile Bocanegra
  - Grupo de Neurociencias de Antioquia, Facultad de Medicina, Universidad de Antioquia, Medellín, Colombia
- Ana Baena
  - Grupo de Neurociencias de Antioquia, Facultad de Medicina, Universidad de Antioquia, Medellín, Colombia
- Juan R Orozco-Arroyave
  - GITA Lab, Faculty of Engineering, Universidad de Antioquia UdeA, Medellín, Colombia; Pattern Recognition Lab, Friedrich-Alexander University, Erlangen, Nürnberg, Germany
47
Ge S, Wan Q, Yin M, Wang Y, Huang Z. Quantitative acoustic metrics of vowel production in mandarin-speakers with post-stroke spastic dysarthria. Clin Linguist Phon 2021; 35:779-792. [PMID: 32985269 DOI: 10.1080/02699206.2020.1827295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2020] [Revised: 09/16/2020] [Accepted: 09/19/2020] [Indexed: 06/11/2023]
Abstract
Impairment of vowel production in dysarthria has received considerable attention. This study aimed to explore the vowel production of Mandarin speakers with post-stroke spastic dysarthria in connected speech and to explore the influence of gender and tone on vowel production. Multiple vowel acoustic metrics, including F1 range, F2 range, vowel space area (VSA), vowel articulation index (VAI), and formant centralization ratio (FCR), were analyzed from vowel tokens embedded in connected speech. The participants included 25 clients with spastic dysarthria secondary to stroke (15 males, 10 females) and 25 speakers with no history of neurological disease (15 males, 10 females). Variance analyses were conducted, and the results showed that the main effects of population, gender, and tone on F2 range, VSA, VAI, and FCR were all significant. Vowel production became centralized in the clients with post-stroke spastic dysarthria. Vowel production was found to be more centralized in males than in females. Vowels in neutral tone (T0) were the most centralized of all tones. The quantitative acoustic metrics of F2 range, VSA, VAI, and FCR were effective in predicting vowel production in Mandarin-speaking clients with post-stroke spastic dysarthria, and hence may be used as powerful tools to assess speech performance in this population.
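The metrics named in this abstract can be illustrated with a minimal Python sketch. It assumes the commonly used three-corner-vowel definitions (/i/, /a/, /u/): a triangular VSA from the shoelace formula, FCR = (F2u + F2a + F1i + F1u) / (F2i + F1a), and VAI as its reciprocal. The abstract does not state which vowels or formulas the authors used, so the function name `vowel_metrics`, the range definitions (maximum minus minimum across the corner vowels), and the formant values are illustrative assumptions only.

```python
from typing import Dict, Tuple

# Vowel label -> (F1, F2) in Hz. The labels "i", "a", "u" are an assumption.
Formants = Dict[str, Tuple[float, float]]


def vowel_metrics(formants: Formants) -> Dict[str, float]:
    """Compute centralization metrics from mean corner-vowel formants."""
    f1i, f2i = formants["i"]
    f1a, f2a = formants["a"]
    f1u, f2u = formants["u"]

    f1_values = (f1i, f1a, f1u)
    f2_values = (f2i, f2a, f2u)

    # Triangular vowel space area in the F1-F2 plane (shoelace formula), in Hz^2.
    vsa = 0.5 * abs(f1i * (f2a - f2u) + f1a * (f2u - f2i) + f1u * (f2i - f2a))

    # Formant centralization ratio: rises as vowels move toward the center.
    fcr = (f2u + f2a + f1i + f1u) / (f2i + f1a)

    return {
        "f1_range": max(f1_values) - min(f1_values),
        "f2_range": max(f2_values) - min(f2_values),
        "vsa": vsa,
        "fcr": fcr,
        "vai": 1.0 / fcr,  # vowel articulation index, the reciprocal of FCR
    }


# Hypothetical formant values for a typical and a centralized speaker.
typical = vowel_metrics(
    {"i": (300.0, 2300.0), "a": (800.0, 1300.0), "u": (350.0, 800.0)}
)
centralized = vowel_metrics(
    {"i": (350.0, 2000.0), "a": (700.0, 1350.0), "u": (400.0, 1000.0)}
)
```

With these made-up values, the centralized speaker shows a smaller VSA and VAI and a larger FCR, which is the direction of change the abstract reports for the dysarthric group.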
Affiliation(s)
- Shengnan Ge
  - Department of Education and Rehabilitation, Faculty of Education, East China Normal University, Shanghai, China
- Qin Wan
  - Department of Education and Rehabilitation, Faculty of Education, East China Normal University, Shanghai, China
- Minmin Yin
  - Department of Education and Rehabilitation, Faculty of Education, East China Normal University, Shanghai, China
- Yongli Wang
  - Department of Education and Rehabilitation, Faculty of Education, East China Normal University, Shanghai, China
- Zhaoming Huang
  - Department of Education and Rehabilitation, Faculty of Education, East China Normal University, Shanghai, China
48
Stehr DA, Hickok G, Ferguson SH, Grossman ED. Examining vocal attractiveness through articulatory working space. J Acoust Soc Am 2021; 150:1548. [PMID: 34470280 DOI: 10.1121/10.0005730] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/17/2020] [Accepted: 07/04/2021] [Indexed: 06/13/2023]
Abstract
Robust gender differences exist in the acoustic correlates of clearly articulated speech, with females, on average, producing speech that is acoustically and phonetically more distinct than that of males. This study investigates the relationship between several acoustic correlates of clear speech and subjective ratings of vocal attractiveness. Talkers were recorded producing vowels in /bVd/ context and sentences containing the four corner vowels. Multiple measures of working vowel space were computed from continuously sampled formant trajectories and were combined with measures of speech timing known to co-vary with clear articulation. Partial least squares regression (PLS-R) modeling was used to predict ratings of vocal attractiveness for male and female talkers based on the acoustic measures. PLS components that loaded on size and shape measures of working vowel space (including the quadrilateral vowel space area, convex hull area, and bivariate spread of formants), along with measures of speech timing, were highly successful at predicting attractiveness in female talkers producing /bVd/ words. These findings are consistent with a number of hypotheses regarding human attractiveness judgments, including the role of sexual dimorphism in mate selection, the significance of traits signalling underlying health, and perceptual fluency accounts of preferences.
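Two of the size measures mentioned here, the quadrilateral vowel space area and the convex hull area, can be sketched in plain Python. This is an illustrative sketch, not the authors' pipeline: it assumes formant samples are given as (F1, F2) pairs in Hz, uses Andrew's monotone chain algorithm for the hull, and uses the shoelace formula for area. The function names and the sample points are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (F1, F2) in Hz


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product OA x OB (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain repeats the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices: Sequence[Point]) -> float:
    """Shoelace area of a simple polygon whose vertices are given in order."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def convex_hull_area(points: Sequence[Point]) -> float:
    """Area enclosed by the convex hull of all sampled formant points."""
    return polygon_area(convex_hull(points))
```

The quadrilateral vowel space area is then `polygon_area` applied to the four corner-vowel means listed in order around the perimeter, while `convex_hull_area` takes every sampled point along the formant trajectories.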
Affiliation(s)
- Daniel A Stehr
  - Department of Cognitive Sciences, University of California Irvine, 3151 Social Sciences Plaza, Irvine, California 92697, USA
- Gregory Hickok
  - Department of Cognitive Sciences, University of California Irvine, 3151 Social Sciences Plaza, Irvine, California 92697, USA
- Sarah Hargus Ferguson
  - Department of Communication Sciences and Disorders, University of Utah, 390 South 1530 East, Room 1201, Salt Lake City, Utah 84112, USA
- Emily D Grossman
  - Department of Cognitive Sciences, University of California Irvine, 3151 Social Sciences Plaza, Irvine, California 92697, USA
49
Martel-Sauvageau V, Breton M, Chabot A, Langlois M. The Impact of Clear Speech on the Perceptual and Acoustic Properties of Fricative-Vowel Sequences in Speakers With Dysarthria. Am J Speech Lang Pathol 2021; 30:1410-1428. [PMID: 33784184 DOI: 10.1044/2021_ajslp-20-00157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Purpose Studies have reported that clear speech has the potential to influence suprasegmental and segmental aspects of speech, in both healthy and dysarthric speakers. While the impact of clear speech has been studied on the articulation of individual segments, few studies have investigated its effects on coarticulation with multisegment sequences such as fricative-vowel. Objectives The goals of this study are to investigate, in healthy and dysarthric speech, the impact of clear speech on (a) the perception of anticipatory vowel coarticulation in fricatives and (b) the acoustic characteristics of this effect. Method Ten speakers with dysarthria secondary to idiopathic Parkinson's disease were recruited as well as 10 age- and sex-matched healthy speakers. A sentence reading task was performed in natural and clear speaking conditions. The sentences contained words with the initial fricatives /s/ and /ʃ/ preceded by /ə/ and followed by the vowels /i/, /y/, /u/, or /a/. For the perceptual measurements, five listeners were recruited and were asked to predict the upcoming word by listening only to the isolated fricative. Acoustic analyses consisted of spectral moment analysis (M1-M4) on averaged time series. Results Perceptual findings indicate that identification rates improved with clear speech for the speakers with dysarthria, but only for the fricative-/i/ sequences. Error pattern analysis indicates that this improvement is associated with an increase in identification of the roundness parameter (lip spreading). Acoustic results are unclear for M1 and M3 but suggest that M2 and M4 differentiation between the rounded versus unrounded vowel contexts is increased with clear speech for the speakers with dysarthria. Discussion Taken together, these findings suggest that clear speech may improve lip coordination in dysarthric speakers with Parkinson's disease. However, the impact of clear speech on the acoustic measures of fricative spectral moments is somewhat limited. This suggests that these metrics, when taken individually, do not capture the entire complexity of fricative-vowel coarticulation.
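For readers unfamiliar with spectral moment analysis, the following Python sketch shows one common way to compute M1 to M4 from a single power spectrum by treating the normalized power as a probability distribution over frequency. Conventions differ between studies, and the abstract does not specify the one used, so two choices here are assumptions: M2 is reported as the standard deviation, and M4 as excess kurtosis. The function name and the toy spectrum are hypothetical.

```python
import math
from typing import Dict, Sequence


def spectral_moments(
    freqs_hz: Sequence[float], power: Sequence[float]
) -> Dict[str, float]:
    """Return the first four spectral moments of one power spectrum.

    Assumed conventions: M1 = centroid (Hz), M2 = standard deviation (Hz),
    M3 = skewness (unitless), M4 = excess kurtosis (unitless).
    """
    if len(freqs_hz) != len(power):
        raise ValueError("freqs_hz and power must have the same length")
    total = float(sum(power))
    if total <= 0.0:
        raise ValueError("spectrum must contain positive power")

    # Normalize power so that it sums to 1, like a probability mass function.
    weights = [p / total for p in power]

    m1 = sum(f * w for f, w in zip(freqs_hz, weights))
    variance = sum(((f - m1) ** 2) * w for f, w in zip(freqs_hz, weights))
    if variance == 0.0:
        raise ValueError("skewness and kurtosis are undefined for zero variance")

    third = sum(((f - m1) ** 3) * w for f, w in zip(freqs_hz, weights))
    fourth = sum(((f - m1) ** 4) * w for f, w in zip(freqs_hz, weights))

    return {
        "M1": m1,
        "M2": math.sqrt(variance),
        "M3": third / variance ** 1.5,
        "M4": fourth / variance ** 2 - 3.0,
    }


# Toy symmetric spectrum, for illustration only.
moments = spectral_moments([1000.0, 2000.0, 3000.0], [1.0, 2.0, 1.0])
```

In practice the spectrum would come from a windowed Fourier transform of the fricative noise, and the moments would be tracked over successive windows to form the time series the abstract refers to.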
Affiliation(s)
- Myriam Breton
  - Rehabilitation Department, Université Laval, Québec City, Québec, Canada
- Alexandra Chabot
  - Rehabilitation Department, Université Laval, Québec City, Québec, Canada
50
Maffia M, De Micco R, Pettorino M, Siciliano M, Tessitore A, De Meo A. Speech Rhythm Variation in Early-Stage Parkinson's Disease: A Study on Different Speaking Tasks. Front Psychol 2021; 12:668291. [PMID: 34194369 PMCID: PMC8236634 DOI: 10.3389/fpsyg.2021.668291] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/15/2021] [Accepted: 05/17/2021] [Indexed: 11/25/2022] Open
Abstract
Patients with Parkinson's disease (PD) usually present with speech disorders, including, among other symptoms, alterations of speech rhythm. The purpose of this study is twofold: (1) to test the validity of two acoustic parameters (%V, the vowel percentage, and VtoV, the mean interval between two consecutive vowel onset points) for the identification of rhythm variation in early-stage PD speech and (2) to analyze the effect of PD on speech rhythm in two different speaking tasks: reading passage and monolog. A group of 20 patients with early-stage PD was involved in this study and compared with 20 age- and sex-matched healthy controls (HCs). The results of the acoustic analysis confirmed that %V is a useful cue for early-stage PD speech characterization, with significantly higher values in the speech of patients with PD than in HC speech. A simple speaking task, such as the reading task, was found to be more effective than spontaneous speech in the detection of rhythmic variations.
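The two rhythm parameters defined in this abstract can be illustrated with a short Python sketch. It assumes vowel segments are already annotated as (onset, offset) pairs in seconds and that %V is taken over the whole utterance duration. How the authors handled pauses and segmentation is not stated in the abstract, so the function name, the inputs, and the example timings are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Interval = Tuple[float, float]  # (vowel onset, vowel offset) in seconds


def rhythm_metrics(
    vowel_intervals: Sequence[Interval],
    utterance_start: float,
    utterance_end: float,
) -> Dict[str, float]:
    """Compute %V and VtoV from annotated vowel intervals."""
    duration = utterance_end - utterance_start
    if duration <= 0.0:
        raise ValueError("utterance_end must be later than utterance_start")

    intervals: List[Interval] = sorted(vowel_intervals)

    # %V: share of the utterance occupied by vocalic segments, in percent.
    vocalic_time = sum(offset - onset for onset, offset in intervals)
    percent_v = 100.0 * vocalic_time / duration

    # VtoV: mean distance between consecutive vowel onset points, in seconds.
    onsets = [onset for onset, _ in intervals]
    gaps = [later - earlier for earlier, later in zip(onsets, onsets[1:])]
    vtov = sum(gaps) / len(gaps) if gaps else float("nan")

    return {"percent_v": percent_v, "vtov": vtov}


# Made-up annotation of a short utterance lasting 0.80 s.
example = rhythm_metrics([(0.10, 0.20), (0.35, 0.50), (0.62, 0.70)], 0.0, 0.80)
```

A higher `percent_v` at a similar `vtov` would mean that vowels take up more of each vowel-to-vowel interval, which is the kind of shift the abstract reports for early-stage PD.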
Affiliation(s)
- Marta Maffia
  - Department of Literary, Linguistics and Comparative Studies, University “L'Orientale,” Naples, Italy
- Rosa De Micco
  - Department of Advanced Medical and Surgical Sciences, University of Campania “Luigi Vanvitelli,” Naples, Italy
- Massimo Pettorino
  - Department of Literary, Linguistics and Comparative Studies, University “L'Orientale,” Naples, Italy
- Mattia Siciliano
  - Department of Advanced Medical and Surgical Sciences, University of Campania “Luigi Vanvitelli,” Naples, Italy
  - Department of Psychology, University of Campania “Luigi Vanvitelli,” Caserta, Italy
- Alessandro Tessitore
  - Department of Advanced Medical and Surgical Sciences, University of Campania “Luigi Vanvitelli,” Naples, Italy
- Anna De Meo
  - Department of Literary, Linguistics and Comparative Studies, University “L'Orientale,” Naples, Italy